ChatGPT and other LLMs rely on input text being broken into pieces, each roughly a word-sized sequence of characters or smaller. These pieces are called sub-word tokens; the process is called tokenization and is performed by a tokenizer.
Tokens can be words or just chunks of characters. For example, the word “hamburger” gets broken up into the tokens “ham”, “bur” and “ger”, while a short and common word like “pear” is a single token. Many tokens start with whitespace, for example, “ hello” and “ bye”.
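You can inspect these splits yourself. The sketch below assumes the open-source tiktoken library (pip install tiktoken); the exact pieces depend on which encoding you load, so the “ham”/“bur”/“ger” split from the older GPT-3 tokenizer may come out differently under a newer encoding.

```python
import tiktoken

# cl100k_base is the encoding used by gpt-3.5-turbo and gpt-4
enc = tiktoken.get_encoding("cl100k_base")

for text in ["hamburger", "pear", " hello", " bye"]:
    token_ids = enc.encode(text)
    # Recover the text of each individual token to see where the splits fall
    pieces = [
        enc.decode_single_token_bytes(t).decode("utf-8", errors="replace")
        for t in token_ids
    ]
    print(f"{text!r} -> {len(token_ids)} token(s): {pieces}")
```

Note how the leading space in “ hello” and “ bye” is part of the token itself rather than a separate piece.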
The models learn the statistical relationships between these tokens and excel at predicting the next token in a sequence.
The number of tokens processed in a given API request depends on the length of both your inputs and outputs. As a rough rule of thumb, 1 token is approximately 4 characters or 0.75 words for English text.
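To see how that rule of thumb compares with an exact count, here is a small sketch, again assuming tiktoken is available; the sample sentence is arbitrary.

```python
import tiktoken

text = "Tokens are the unit the models actually read and write."
enc = tiktoken.get_encoding("cl100k_base")

approx_by_chars = len(text) / 4             # ~4 characters per token
approx_by_words = len(text.split()) / 0.75  # ~0.75 words per token
exact = len(enc.encode(text))               # actual count for this encoding

print(f"chars/4 estimate:    {approx_by_chars:.1f}")
print(f"words/0.75 estimate: {approx_by_words:.1f}")
print(f"exact token count:   {exact}")
```

The heuristics are useful for quick budgeting, but for anything that must fit a hard context limit, count tokens with the actual tokenizer.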