MXFP4: four-bit weights with a shared scale

MXFP4 (microscaling 4-bit floating point) stores each model weight in four bits, with every small block of weights sharing one separately stored scale factor. The shared scale lets each block stretch to fit its own range, which recovers most of the accuracy a flat four-bit format throws away. It is an open industry standard, not a single vendor's format.

At a glance

What it is: A four-bit floating-point weight format with a per-block shared scale

Why the scale: Each block stretches to its own range, keeping more accuracy

Standard, not proprietary: An open format backed by several hardware vendors

The catch: Full speed needs kernels built for the exact GPU architecture

What is MXFP4?

MXFP4 stands for microscaling 4-bit floating point. Each weight is stored in just four bits, but the format adds one trick: every small block of weights, thirty-two in the standard, shares a single scale factor stored alongside it. That scale lets each block stretch to fit its own range of values. A flat four-bit format has to cover every weight with one range, so it either clips the outliers or rounds the small values away; the per-block scale buys most of that lost accuracy back. The “MX” is the microscaling; the “FP4” is the four-bit float.
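The difference between one flat range and a per-block scale can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not a reference implementation: the four-bit values are taken to be the E2M1 set (magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6), the shared scale is assumed to be a power of two chosen from the block's largest magnitude, and rounding is simple nearest-value. All function names are hypothetical.

```python
import math

# Magnitudes a 4-bit E2M1 float can represent (1 sign, 2 exponent, 1 mantissa bit).
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_MAX_EXP = 2  # the largest magnitude is 6.0 = 1.5 * 2**2


def round_to_fp4(x):
    """Round x to the nearest FP4 value, saturating at +/-6.
    Ties go to the smaller magnitude (a simplification)."""
    mag = min(FP4_MAGNITUDES, key=lambda m: abs(abs(x) - m))
    return math.copysign(mag, x)


def power_of_two_scale(values):
    """Assumed rule: pick a power of two that puts the largest
    magnitude in FP4's top exponent range."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 1.0
    return 2.0 ** (math.floor(math.log2(max_abs)) - FP4_MAX_EXP)


def quantize_with_scale(values, scale):
    """Quantize to FP4 under one scale, then map back to real values."""
    return [round_to_fp4(v / scale) * scale for v in values]


def quantize_flat(weights):
    """One scale for every weight: the flat four-bit case."""
    return quantize_with_scale(weights, power_of_two_scale(weights))


def quantize_blockwise(weights, block_size=32):
    """One scale per block of block_size weights: the micro-scaled case."""
    out = []
    for i in range(0, len(weights), block_size):
        block = weights[i:i + block_size]
        out.extend(quantize_with_scale(block, power_of_two_scale(block)))
    return out


def mean_abs_error(original, quantized):
    return sum(abs(a - b) for a, b in zip(original, quantized)) / len(original)


# Made-up data: one block of small weights next to one block of large ones.
small = [0.02 * ((i % 8) - 4) for i in range(32)]
large = [4.0 * ((i % 8) - 4) for i in range(32)]
weights = small + large

print("flat error:     ", mean_abs_error(weights, quantize_flat(weights)))
print("blockwise error:", mean_abs_error(weights, quantize_blockwise(weights)))
```

With this data the flat scale is set by the large block, so every weight in the small block rounds to zero. The block-wise version gives the small block its own scale and keeps its values distinct.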

Why does MXFP4 matter on a small box?

Four-bit weights are how a large model fits into a modest memory budget, and MXFP4 makes those four bits accurate enough to be worth using. It is an open industry standard backed by several hardware makers rather than one company’s invention, and models can be trained in it directly instead of being squeezed down afterward. The honest caveat is speed: silicon that supports the format is not the same as fast code compiled for your exact GPU. A model in MXFP4 can still crawl on a brand-new chip until the right kernels ship, which is a kernel problem, not a format problem.
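The memory argument comes down to simple arithmetic, sketched below. It assumes blocks of thirty-two weights and an eight-bit shared scale per block, so the scale adds a quarter of a bit to each four-bit weight. The function names and the 20-billion-weight model are hypothetical, and the figure covers weights only, with every tensor assumed to be stored in the format.

```python
def mxfp4_bits_per_weight(block_size=32, element_bits=4, scale_bits=8):
    """Average storage per weight: the element bits plus the
    shared scale's bits spread across the block."""
    return element_bits + scale_bits / block_size


def weight_memory_gb(num_weights, bits_per_weight):
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_weights * bits_per_weight / 8 / 1e9


num_weights = 20_000_000_000  # hypothetical model size, for illustration only

print("bits per weight:", mxfp4_bits_per_weight())
print("MXFP4 weights (GB): ", weight_memory_gb(num_weights, mxfp4_bits_per_weight()))
print("16-bit weights (GB):", weight_memory_gb(num_weights, 16))
```

Under these assumptions the cost is 4.25 bits per weight, a little over a quarter of what 16-bit weights need.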
