
NVLink: the fast link between GPUs

NVLink is NVIDIA's high-speed interconnect between graphics processors (GPUs). It lets cards exchange data far faster than the standard system bus (PCIe), which matters when a model is split across several GPUs. A DGX Spark is a single chip with one shared memory pool, so there is no card-to-card NVLink inside it.

At a glance

What it is: A direct high-speed link between two or more NVIDIA GPUs
Why it matters: Splitting one model across cards needs fast card-to-card data
On a DGX Spark: Not present; it is one chip with one shared memory pool
Where you meet it: Multi-GPU boxes, like a dual-card desktop build
Comparison

Where NVLink does and does not apply

                          | Multi-GPU box                     | DGX Spark (single chip)
How many GPUs             | Two or more separate cards        | One processor
Card-to-card link         | NVLink carries data between cards | Nothing to link; one shared pool
Where a split model lives | Across cards, joined by NVLink    | All in the one unified pool

NVLink is a direct, high-speed connection between NVIDIA graphics processors (GPUs). By default, two cards in one machine talk over the system bus (PCIe), which is comparatively slow. NVLink is a dedicated bridge that lets them exchange data far faster. That matters only when you have a reason to make two cards work as one, usually because a model is too big for a single card: you split it across two, and the pieces have to keep exchanging data for every token generated.
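
To get a feel for why link speed matters, here is a rough back-of-envelope sketch in Python. Every number in it is an assumption chosen for illustration (the two bandwidth figures, the model shape, and the number of exchanges per layer), not a measured or official value. It also ignores latency and protocol overhead, which can dominate small transfers in practice.

```python
def transfer_seconds(payload_bytes: float, bandwidth_gb_per_s: float) -> float:
    """Time to move a payload over a link, counting bandwidth only."""
    return payload_bytes / (bandwidth_gb_per_s * 1e9)


# Hypothetical link speeds in GB/s; look up the real figures for your hardware.
SYSTEM_BUS_GB_S = 32.0
NVLINK_GB_S = 100.0

# Hypothetical model shape: how much data crosses the link per generated token.
hidden_size = 8192          # values in one activation vector
bytes_per_value = 2         # 16-bit activations
layers = 80
exchanges_per_layer = 2     # assumed number of card-to-card exchanges per layer

payload = hidden_size * bytes_per_value * layers * exchanges_per_layer

bus_time = transfer_seconds(payload, SYSTEM_BUS_GB_S)
nvlink_time = transfer_seconds(payload, NVLINK_GB_S)

print(f"Bytes exchanged per token: {payload}")
print(f"System bus: {bus_time * 1e6:.1f} microseconds per token")
print(f"NVLink:     {nvlink_time * 1e6:.1f} microseconds per token")
```

Under these assumptions the gap between the two links is simply the ratio of their bandwidths. The point of the sketch is that the cost is paid again for every token, so a slow link adds up quickly.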

Why does it barely apply to a DGX Spark?

A DGX Spark is a single chip. The GPU and the CPU share one pool of memory, so there is nothing to bridge: no second card, no card-to-card link, no NVLink between cards. The whole reason card-to-card NVLink exists is to paper over the gap between separate cards, and a Spark has no gap to paper over.

So if you read a multi-GPU guide that leans on NVLink and try to map it onto a Spark, the advice will not land. The concept you want there is not NVLink but the shared unified pool. NVLink becomes relevant the moment you step up to a box with two or more discrete cards, and not a moment before.

NVLink helps with

  • Splitting one model across two or more cards (tensor parallelism)
  • Moving activations between cards without crawling over the system bus
  • Letting a model that does not fit one card fit across several
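
The first item above, tensor parallelism, can be sketched with a toy example in plain Python. This is a minimal illustration, not how any real framework implements it: the two "cards" are just Python lists, and the names `card0_cols` and `card1_cols` are made up for this sketch. Each card holds half of a layer's weights, computes its half of the output, and the halves are then joined. That joining step is the traffic NVLink would carry on real hardware.

```python
def matvec(weight_cols, x):
    """Multiply input vector x by a matrix stored as a list of columns."""
    return [sum(xi * wi for xi, wi in zip(x, col)) for col in weight_cols]


# A tiny layer with 3 inputs and 4 outputs, stored column by column.
weight_cols = [
    [1, 0, 2],
    [0, 1, 1],
    [3, 1, 0],
    [2, 2, 1],
]
x = [1, 2, 3]

# Split the weights: each hypothetical card holds half of the columns.
card0_cols = weight_cols[:2]
card1_cols = weight_cols[2:]

# Each card computes its own slice of the output from the same input.
partial0 = matvec(card0_cols, x)
partial1 = matvec(card1_cols, x)

# Gather step: the partial results must travel between cards to be joined.
combined = partial0 + partial1

# The split computation matches the single-card result.
assert combined == matvec(weight_cols, x)
print(combined)
```

Neither card ever holds the full weight matrix, which is how a model too large for one card can still run. The price is that the partial results have to cross the link at every layer.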

NVLink will not

  • Do anything inside a single-chip DGX Spark, which has no card-to-card link
  • Add memory capacity; it moves data, it does not store it
  • Make a model run if it does not fit across the cards combined
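
The last two points come down to simple arithmetic, sketched below. The function name `fits_across_cards` and the per-card overhead parameter are hypothetical, and the memory sizes in the usage lines are example figures only. Note that the link does not appear anywhere in the calculation: capacity is the sum of what the cards hold, with or without NVLink.

```python
def fits_across_cards(model_gb, card_memory_gb, overhead_gb_per_card=0.0):
    """Return True if the model fits in the combined usable memory of the cards.

    card_memory_gb is a list with one entry per card. overhead_gb_per_card is
    an assumed allowance for memory each card needs for other purposes.
    """
    usable = sum(max(mem - overhead_gb_per_card, 0.0) for mem in card_memory_gb)
    return model_gb <= usable


# Example figures only: two cards of 24 GB each.
cards = [24, 24]

print(fits_across_cards(40, cards))   # fits in the combined 48 GB
print(fits_across_cards(60, cards))   # too big; no link changes that
print(fits_across_cards(46, cards, overhead_gb_per_card=2))  # 44 GB usable
```

Real deployments also need room for activations and caches, so treat a check like this as a first filter rather than a guarantee.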

Reviewed: June 2026