Token: the unit a model reads and writes

A token is the unit of text a language model reads and writes: a short chunk that is often a whole word, sometimes a piece of one, sometimes a single character or punctuation mark. The model never sees letters or words directly; it sees a sequence of tokens, and everything it produces comes out one token at a time.

At a glance

What it is: The chunk of text a model processes, roughly a word or word-piece.

Rough rule of thumb: In English, a token is a little under one word on average.

Why it matters: Context length, speed, and cost are all counted in tokens, not words.

How output is made: The model emits one token at a time, each based on all the tokens so far.

What is a token?

A language model does not read letters or words. Before it sees anything, your text is run through a tokenizer that splits it into tokens: short chunks that are often a whole common word, sometimes a piece of a longer or rarer word, and sometimes a single character or a punctuation mark. The model works on that sequence of tokens, and when it answers, it produces tokens, which are turned back into text for you to read.
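The round trip from text to tokens and back can be sketched with a toy tokenizer. This is a minimal illustration, not how any particular model does it: the vocabulary below is hypothetical and hand-picked, whereas a real tokenizer learns a far larger vocabulary from data and will split text differently. The names `MULTI_CHAR_PIECES`, `encode`, and `decode` are made up for this example.

```python
import string

# Hypothetical multi-character pieces, hand-picked for illustration only.
MULTI_CHAR_PIECES = ["The", " model", " reads", " token", "ization"]
# Single characters act as a fallback so short or rare strings can still be encoded.
SINGLE_CHARS = list(string.ascii_letters + string.digits + " .,!?'")

# dict.fromkeys removes duplicates while keeping order; a piece's ID is its index.
PIECES = list(dict.fromkeys(MULTI_CHAR_PIECES + SINGLE_CHARS))
PIECE_TO_ID = {piece: i for i, piece in enumerate(PIECES)}
MAX_PIECE_LEN = max(len(piece) for piece in PIECES)


def encode(text):
    """Split text into token IDs by greedy longest match against the toy vocabulary."""
    ids = []
    pos = 0
    while pos < len(text):
        longest = min(MAX_PIECE_LEN, len(text) - pos)
        # Try the longest candidate first, then shrink down to one character.
        for size in range(longest, 0, -1):
            piece = text[pos:pos + size]
            if piece in PIECE_TO_ID:
                ids.append(PIECE_TO_ID[piece])
                pos += size
                break
        else:
            raise ValueError(f"character not in toy vocabulary: {text[pos]!r}")
    return ids


def decode(ids):
    """Turn token IDs back into text by concatenating their pieces."""
    return "".join(PIECES[i] for i in ids)


text = "The model reads tokenization."
ids = encode(text)
pieces = [PIECES[i] for i in ids]
print(pieces)
print(len(text.split()), "words ->", len(ids), "tokens")
```

With this toy vocabulary, the four-word sentence becomes six tokens: three whole words, one long word split into two pieces, and a punctuation mark. Decoding the IDs gives back the original text exactly.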

A loose rule of thumb for English is that a token is a little under one word on average. Code, other languages, and unusual words break that rule, so never treat token counts and word counts as the same number.
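When no tokenizer is at hand, the rule of thumb can be turned into a rough estimate. The sketch below assumes about 0.75 English words per token; that figure is an assumption chosen to match "a little under one word", not a number from this article, and the function name `estimate_tokens` is hypothetical. Treat the result as a ballpark for English prose only.

```python
import math

# Assumed ratio, not from the article: about 0.75 English words per token.
ASSUMED_WORDS_PER_TOKEN = 0.75


def estimate_tokens(text, words_per_token=ASSUMED_WORDS_PER_TOKEN):
    """Rough token estimate from a whitespace word count, rounded up."""
    if words_per_token <= 0:
        raise ValueError("words_per_token must be positive")
    word_count = len(text.split())
    return math.ceil(word_count / words_per_token)


sample = "a token is a little under one word"
print(len(sample.split()), "words, roughly", estimate_tokens(sample), "tokens")
```

For code, other languages, or unusual words, the only reliable number comes from running the model's own tokenizer over the text.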

Why count tokens instead of words?

Almost everything you care about when running a model is measured in tokens. The context length, the maximum amount of text a model can hold at once, is a token count. Speed is reported as tokens per second. The key-value (KV) cache, the working memory of a request, grows with each token in the context. And hosted models bill per million tokens.
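The arithmetic behind those four measures can be sketched in a few lines. Every number and name here is hypothetical: the context length, speed, prices, and bytes of cache per token are placeholders, separate input and output prices are an assumption about how a provider might bill, and the time estimate covers only generated tokens, ignoring the time spent processing the prompt.

```python
from dataclasses import dataclass


@dataclass
class RequestBudget:
    """Token counts for one request (hypothetical helper for illustration)."""
    prompt_tokens: int
    output_tokens: int

    @property
    def total_tokens(self):
        return self.prompt_tokens + self.output_tokens


def fits_in_context(budget, context_length):
    """Prompt and output together must stay within the context length."""
    return budget.total_tokens <= context_length


def generation_seconds(budget, tokens_per_second):
    """Time to emit the output at a given decoding speed (prompt processing ignored)."""
    return budget.output_tokens / tokens_per_second


def kv_cache_bytes(budget, bytes_per_token):
    """The KV cache grows with every token in the context."""
    return budget.total_tokens * bytes_per_token


def cost_usd(budget, input_price_per_million, output_price_per_million):
    """Per-million-token billing, assuming separate input and output prices."""
    return (
        budget.prompt_tokens * input_price_per_million
        + budget.output_tokens * output_price_per_million
    ) / 1_000_000


# All values below are placeholders, not figures for any real model or provider.
budget = RequestBudget(prompt_tokens=6000, output_tokens=1000)
print("fits:", fits_in_context(budget, context_length=8192))
print("seconds:", generation_seconds(budget, tokens_per_second=50.0))
print("cache bytes:", kv_cache_bytes(budget, bytes_per_token=100_000))
print("cost (USD):", cost_usd(budget, 2.0, 10.0))
```

None of these functions takes a word count as input, which is the point: every limit and every charge is expressed in tokens.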

So when a prompt feels expensive, slow, or too long to fit, the honest unit to think in is tokens. Word count is a polite approximation; token count is what the machine actually processes and what you actually pay for.
