How AI Reads You
Theory
A token is the smallest unit a language model sees. Tokenizers like BPE (byte-pair encoding) learn which character sequences appear together most often in training text and merge them into reusable pieces.
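To make the merge idea concrete, here is a toy sketch of BPE's learning step: count how often each adjacent pair of symbols occurs across a small word list, then merge the most frequent pair into one new symbol. The corpus, frequencies, and merge count are made up for illustration; real tokenizers repeat this over huge corpora until the vocabulary reaches a target size.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words; return the most frequent."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = pair[0] + pair[1]
    out = {}
    for symbols, freq in words.items():
        new_symbols, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                new_symbols.append(merged)
                i += 2
            else:
                new_symbols.append(symbols[i])
                i += 1
        out[tuple(new_symbols)] = freq
    return out

# Toy corpus: each word as a tuple of characters, with a made-up frequency.
words = {tuple("lower"): 5, tuple("lowest"): 2, tuple("newer"): 6}
for _ in range(3):  # learn three merges
    pair = most_frequent_pair(words)
    words = merge_pair(words, pair)
    print("merged:", pair)
# On this corpus the learned merges are ('w','e'), then ('we','r'), then ('l','o').
```

Frequent fragments like "we" and "er" become single vocabulary pieces, which is why common words often survive as one token while rare words get split.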
- Your text arrives as raw characters.
- The tokenizer splits it into known pieces from its vocabulary, sometimes a whole word, sometimes a fragment.
- Each piece becomes a number.
- The model reads those numbers, predicts the next number, and that number is decoded back into a piece of text.
- That repeats until the answer is done or the budget runs out.
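The pipeline above can be sketched end to end. This toy uses greedy longest-match against a tiny made-up vocabulary; real BPE instead replays its learned merges in order, and real vocabularies hold tens of thousands of pieces, but the round trip is the same: text in, numbers through the model, text out.

```python
# Hypothetical mini-vocabulary for illustration only.
VOCAB = ["un", "token", "ize", "iz", "able", "t", "o", "k", "e", "n", "a", "b", "l", " "]
PIECE_TO_ID = {piece: i for i, piece in enumerate(VOCAB)}
ID_TO_PIECE = {i: piece for piece, i in PIECE_TO_ID.items()}

def encode(text):
    """Greedy longest-match split into known pieces, then map each piece to a number."""
    ids, i = [], 0
    while i < len(text):
        for length in range(len(text) - i, 0, -1):  # try longest match first
            piece = text[i:i + length]
            if piece in PIECE_TO_ID:
                ids.append(PIECE_TO_ID[piece])
                i += length
                break
        else:
            raise ValueError(f"no vocabulary piece covers {text[i]!r}")
    return ids

def decode(ids):
    """Map numbers back to pieces and concatenate into text."""
    return "".join(ID_TO_PIECE[i] for i in ids)

ids = encode("untokenizable")
print(ids)          # → [0, 1, 3, 4]: one number per piece, not per word
print(decode(ids))  # → untokenizable
```

Note that "untokenizable" is one word but four tokens here: whole words are not the unit the model sees.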
Three practical effects for you:
- Cost. Pricing is usually "X cents per million input tokens" plus "Y cents per million output tokens." Long context plus long answers compound.
- Memory. Models have a fixed context window measured in tokens, not pages. You will meet this in the next lesson.
- Languages and code. English is dense because most vocabularies are trained on English-heavy text. Portuguese, German, Japanese, and source code often need more tokens for the same meaning, which means more cost and faster exhaustion of the window.
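To feel the cost effect concretely, here is a back-of-the-envelope calculator for the "X cents per million input tokens plus Y cents per million output tokens" pricing shape. The two rates below are hypothetical placeholders, not any provider's real prices; substitute your own.

```python
# Hypothetical per-million-token rates (assumed for illustration).
INPUT_PRICE_PER_M = 3.00    # dollars per million input tokens
OUTPUT_PRICE_PER_M = 15.00  # dollars per million output tokens

def request_cost(input_tokens, output_tokens):
    """Cost of one request: input and output are billed at different rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Long context plus long answers compound:
lean = request_cost(2_000, 500)        # tight prompt, focused answer
bloated = request_cost(50_000, 4_000)  # sprawling context, verbose answer
print(f"lean: ${lean:.4f}  bloated: ${bloated:.4f}  ratio: {bloated / lean:.0f}x")
```

At these assumed rates the bloated request costs roughly fifteen times the lean one, and every turn of a conversation pays for the whole accumulated context again.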
The skill is to feel tokens, not count them: tighter prompts, focused material, output formats that don't waste space.