Parameter: a single learned number in a model : Learn

A parameter is one learned number, a weight, inside a model. Training is the process of setting billions of these numbers so the model produces useful output. The headline size of a model, like 7B, is its parameter count: 7B means seven billion parameters. More parameters can mean more capability, and always means more numbers to store and stream, so it drives how much memory the model needs.

What is a parameter?

A parameter is a single number inside the model: one weight. A model is, at heart, an enormous pile of these numbers arranged in layers. Training is the slow process of nudging every one of them until the whole pile produces useful output. Once training is done the numbers are frozen. Running the model, inference, just reads and multiplies them; it does not change them.

When you see a model named 7B, 35B, or 120B, that letter B is billion and the number is the parameter count. A 7B model carries seven billion learned numbers. The count is the single most quoted fact about a model because it is a quick proxy for two things at once: roughly how much the model could have learned, and roughly how much memory it will demand.

Why does the count drive your hardware?

Every parameter is a number that has to be stored, and during generation streamed through the chip for each token. More parameters means more memory to hold the weights and more data to move per token. That is why parameter count, together with the precision each number is stored at, sets the floor on the memory a model needs, and why quantization, storing each number in fewer bits, is the usual lever to make a big model fit.

One caveat keeps the count honest. For a mixture-of-experts model, only a fraction of the parameters are active on any given token, so a large total can run far cheaper per token than a dense model of the same headline size. The count tells you the storage bill. It does not, on its own, tell you the speed.

Parameter: a single learned number in a model

At a glance

What is a parameter?

Why does the count drive your hardware?

What the parameter count tells you

What it does not tell you

Related terms

At a glance

What is a parameter?

Why does the count drive your hardware?

What the parameter count tells you

What it does not tell you

Related terms

Go deeper