#comparison

11 articles

All articles tagged "comparison": self-hosted AI fixes, setups, and architecture notes.

Cloud vs Local AI: Where Each Actually Wins in 2026

An honest capability matrix comparing cloud Claude with a self-hosted GB10 stack across 13 tasks, plus entry points into the deeper-dive articles. Claude still leads on multi-step reasoning; the local stack now covers two things Claude cannot do at all.

funnel dgx-spark

Self-Hosted AI vs Cloud APIs: The Real Total Cost

Amortized hardware, power cost by jurisdiction, opportunity cost, and the value of privacy, modelled at 10, 100, 1,000, and 10,000 calls per day. Break-even sits between 700 and 1,200 calls per day, depending on the cloud tier you actually need, but the inputs that move the line are not the ones the listicles emphasize.

hardware dgx-spark services budget-build

What I'd Buy in 2026 for €15,000: A Pro-Studio Sovereign AI Build

Three honest paths at €15k for the one-person consultancy or small studio that has outgrown a single box: dual RTX 5090 on a Threadripper Pro workstation, DGX Spark plus a dedicated inference second box, or a refurbished pro-workstation route. Current Geizhals prices, UPS sizing, and the cases where this tier is genuinely the floor.

affiliate hardware budget-build

What I'd Buy in 2026 for €2,000: A Beginner Sovereign AI Build

A used RTX 3090 plus a current AM5 platform gets you a real local-inference box for under €2k in 2026. Component picks with current Geizhals prices, honest power-cost math for Germany, the US, and India, and a list of models this build runs well and the ones it does not.

affiliate hardware budget-build

What I'd Buy in 2026 for €4,000: A Mid-Tier Sovereign AI Build

Two honest €4k paths: a new RTX 4090 24 GB on AM5, or a used RTX A6000 48 GB on a Threadripper-class platform. Component picks with current Geizhals prices, the workload that breaks each path, and a side-by-side with DGX Spark at the same money.

affiliate hardware dgx-spark budget-build

What I'd Buy in 2026 for €8,000: A Premium Sovereign AI Build

At €8k the binding question stops being VRAM ceiling and becomes architecture choice. A DGX Spark plus accessories on one side, an RTX 5090 32 GB workstation on the other. I run the Spark; here is the comparison from the inside, with current Geizhals prices captured 2026-05-22.

dgx-spark hardware

DGX Spark vs Apple Mac Studio: Which Wins for Local LLMs?

The Spark wins on MoE-class language models and the developer-tooling pipeline. The Mac Studio wins on silence, daily-driver ergonomics, and memory ceiling (up to 512 GB on M3 Ultra). The choice depends on which column is binding for your workload.

qwen mistral dgx-spark

Mistral Small 4 vs Qwen 3.6 vs GLM-5.1 on a Single DGX Spark

Three production-class open-weights models, all weighed against one Spark. Qwen wins on coding throughput and now sustains 57 to 62 tok/s under DFlash. Mistral holds the creative-prose and verified-vision slot as a safer fallback. GLM-5.1 at 754B does not fit, and the reason it does not fit is the most useful lesson in this comparison.

hardware authority

NVIDIA Playbooks: Where They Help and Where They Don't

NVIDIA's published reference playbooks are excellent for the workflows they cover and quietly misleading for the workflows they do not. Three categories of help, three categories of trap, and the rule for telling them apart before you copy a configuration into production.

opencode openclaw

Coding Assistants on a Sovereign Stack: Claude Code, opencode, Aider, OpenClaw (and why Vibe got retired)

Four assistants still on the table in 2026, plus one I uninstalled. Claude Code wins on raw capability, Aider wins on git discipline, opencode is now the local primary against Qwen 3.6, and OpenClaw stays as the Mistral specialty. Vibe is in the postmortem column.

ops

Tailscale vs Headscale for Multi-Box Sovereign Stacks

Tailscale is the right pick if your sovereignty budget is finite and the rented coordination server is an acceptable trade. Headscale is the right pick if the coordination server's vendor risk is the dimension you cannot accept. Both ship the same WireGuard underneath.