Learn

FP16: the older half-precision float

FP16 (16-bit Floating Point), also called half precision, is a number format that stores each value in 16 bits, spending more of them on precision and fewer on range than a full 32-bit float. It halves a model's memory against 32-bit and is a common native format on consumer accelerators, though its narrow range makes it prone to overflow on extreme values.

At a glance

What it is
A 16-bit floating-point format, 16-bit Floating Point
Also called
Half precision
Its trade
More precision than BF16, but a narrower value range
Its risk
Overflow or underflow on values outside its range

What is FP16?

FP16, the 16-bit Floating Point format, also called half precision, stores a number in 16 bits. Like any floating-point format it divides those bits between range and precision, and FP16 leans toward precision: within the values it can represent, it pins them down quite finely. The catch is the range. FP16 cannot hold values as large or as small as a full 32-bit float, so a calculation that strays past its limits overflows to infinity or collapses to zero.

The payoff is the same as any half-size format: a model in FP16 takes half the memory of the 32-bit version, which means more model and more context fit in the same space. On a lot of consumer hardware FP16 is a long-standing native format, so it is the half-precision you are most likely to meet on a desktop card.

When does the range problem bite?

It bites when values get extreme, which happens more in training than in plain inference, but it can surface anywhere a sum grows large or a gradient grows tiny. That weakness is exactly why BF16, the 16-bit Brain Floating Point format, was designed: it keeps the wide range of a 32-bit float and gives up some precision to fit in 16 bits, the mirror image of FP16’s choice. For serving large models BF16’s range usually wins, so newer stacks default to it. FP16 is still widely supported and perfectly usable, especially on hardware where it is the native path, but if you see overflow warnings on a numerically rough workload, the format’s narrow range is the first suspect.

FP16

  • More fraction bits, so more precision in its range
  • Narrower range, can overflow on large or tiny values
  • Long-standing native format on consumer accelerators
  • The older of the two 16-bit floats

BF16

  • Fewer fraction bits, so less precision per value
  • Wide range, the same span as a 32-bit float
  • Common default on modern data-center accelerators
  • Designed later, to dodge FP16's range problem

Related terms

← All terms Reviewed: June 2026