Mixed precision: more than one number format in one run

Mixed precision is the practice of using more than one numeric format inside a single model or run: most of the work happens in a small, fast format, while the parts that need accuracy stay in a larger one. The result is faster, lighter computation that keeps the precision where it actually matters.

At a glance

What it is: Using more than one number format in one model or run

Why do it: Speed and memory savings without losing accuracy where it counts

The trade: Small format for the bulk, larger format for the sensitive parts

Related but distinct: Quantization shrinks stored weights; mixed precision assigns different formats to different parts of the same run

What does mixed precision mean?

A model does an enormous amount of arithmetic, and not every part of it needs the same accuracy. Mixed precision uses that fact. Most of the work runs in a small numeric format, which is faster to move through memory and quicker to multiply. The parts that are sensitive to rounding, where small errors would pile up and push the model off course, stay in a larger, more accurate format.
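To make "small errors would pile up" concrete, here is a minimal sketch, not taken from the original article, that adds the same small value many times with two different accumulators. It uses only Python's standard `struct` module to round to 16-bit and 32-bit IEEE 754 floats. The helper names `to_fp16`, `to_fp32`, and `accumulate` are illustrative assumptions, and the loop stands in for the kind of long running sum found inside a model.

```python
import struct

def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (32-bit) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def accumulate(value: float, count: int, round_sum) -> float:
    """Add `value` to a running sum `count` times, rounding the sum after each add."""
    total = 0.0
    for _ in range(count):
        total = round_sum(total + value)
    return total

step = to_fp16(0.01)  # the small values themselves live in the small format
n = 10_000            # the exact sum would be close to 100

sum_fp16 = accumulate(step, n, to_fp16)  # small format everywhere
sum_fp32 = accumulate(step, n, to_fp32)  # mixed: small values, larger accumulator

print("fp16 accumulator:", sum_fp16)
print("fp32 accumulator:", sum_fp32)
```

With a 16-bit accumulator the sum stops growing once the gap between neighbouring representable values becomes larger than the step being added, so it ends far short of 100. The 32-bit accumulator stays close to the expected total, which is why running sums are a typical candidate for the larger format.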

The word precision here means how many bits a number gets, and so how finely it can represent a value. A smaller format saves memory and time but rounds more coarsely. Mixed precision is simply the decision not to use one format for everything, and instead to spend the accuracy where it earns its keep.
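As a small illustration of how bit count limits how finely a value can be represented, the sketch below stores one number in three IEEE 754 widths. It relies only on Python's standard `struct` module. The helper `round_to`, the `FORMATS` table, and the sample value 1.0004 are assumptions chosen for the example.

```python
import struct

def round_to(fmt: str, x: float) -> float:
    """Round x to the nearest value representable in the given struct format."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]

# Format name -> (struct code, total bits)
FORMATS = {
    "fp16": ("<e", 16),
    "fp32": ("<f", 32),
    "fp64": ("<d", 64),
}

value = 1.0004
stored = {}
for name, (fmt, bits) in FORMATS.items():
    stored[name] = round_to(fmt, value)
    print(f"{name} ({bits} bits): {stored[name]!r}")

# fp16 cannot tell 1.0004 apart from 1.0; the wider formats can.
print("fp16 collapses to 1.0:", stored["fp16"] == 1.0)
```

Just above 1.0, neighbouring 16-bit values are roughly 0.001 apart, so 1.0004 rounds down to exactly 1.0. The 32-bit format keeps it to about seven significant digits.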

How is it different from quantization?

The two get confused because both involve smaller number formats. Quantization shrinks a model’s stored weights into a compact, low-precision encoding so the whole thing takes less space. Mixed precision is about the running computation: different parts of the same run use different formats at the same time.

You can use both. A quantized model can still run with mixed precision during inference. The practical rule is the same either way: smaller formats are faster and lighter, but they round more, so you keep the larger format where the model is fragile and measure the output rather than assume it held up.
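The combination described above can be sketched as follows: weights are stored as 8-bit integer codes (quantization), then a dot product runs with 16-bit multiplies and a 32-bit running sum (mixed precision), and the result is measured against a 64-bit baseline. This is a hedged illustration only. Symmetric per-tensor int8 quantization is one common scheme picked for the example, not something the text specifies, and every name and sample value here (`quantize_int8`, `mixed_dot`, `weights`, `inputs`) is hypothetical.

```python
import struct

def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision (16-bit) value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fp32(x: float) -> float:
    """Round x to the nearest IEEE 754 single-precision (32-bit) value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def quantize_int8(weights):
    """Assumed scheme: symmetric per-tensor quantization to int8 codes plus one scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale

def mixed_dot(codes, scale, inputs):
    """Dequantize to fp16, multiply in fp16, accumulate in fp32."""
    total = 0.0
    for code, x in zip(codes, inputs):
        w = to_fp16(code * scale)            # stored compactly, expanded for compute
        product = to_fp16(w * to_fp16(x))    # bulk work in the small format
        total = to_fp32(total + product)     # sensitive running sum in the larger one
    return total

weights = [0.12, -0.53, 0.98, 0.004, -0.27, 0.61]
inputs = [1.5, 0.25, -0.75, 3.0, 2.0, -1.25]

reference = sum(w * x for w, x in zip(weights, inputs))  # float64 baseline
codes, scale = quantize_int8(weights)
approx = mixed_dot(codes, scale, inputs)
rel_error = abs(approx - reference) / abs(reference)

print("int8 codes:", codes)
print("reference:", reference, "approx:", approx)
print("relative error:", rel_error)
```

The last three lines are the "measure the output" step: instead of assuming the smaller formats held up, the sketch compares them against a full-precision reference and reports the relative error.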
