What I'd Buy in 2026 for €8,000: A Premium Sovereign AI Build

May 24, 2026

Update (2026-06-19). The “Qwen 3.6 PrismaQuant” references here predate the 2026-06-11 production switch to AutoRound int4-mixed (69.2 tok/s, 12.7 percent better on the coding gate, vision retained, PrismaQuant retired). The figures are kept as the engineering-log record; the live stack is on /stack/ and the switch is measured in AutoRound int4 vs PrismaQuant.

Here is what I would buy at €8,000 today, knowing what I know in 2026-05. The honest answer at this tier splits in two: a DGX Spark plus the supporting hardware to operate it well, or an RTX 5090 32 GB workstation on a Threadripper-class platform. I run the Spark. I have been running it since early April 2026, I have crashed it and recovered it and shipped a one-person business off it, and I have published the receipts. So this article speaks from direct measurement on the Spark path and from spec-sheet analysis on the 5090 path. I will not present the 5090 numbers as if I had benchmarked them; I have not.

The €8k tier is where the architecture question stops being VRAM ceiling and becomes serving-stack choice. The Spark is the architecture-correct answer for MoE language models at the 100B+ class. The 5090 workstation is the architecture-correct answer for dense models that fit in 32 GB, plus image and video generation, plus the workloads where Blackwell consumer-card features (NVFP4 quantization, DLSS, raw FLOPS for diffusion) matter more than unified memory.

Path A: DGX Spark plus accessories

Component	Pick	Price	Source
Main machine	NVIDIA DGX Spark Founders Edition (128 GB unified, 4 TB SSD, GB10)	€4,769.00	geizhals.de DGX Spark Founders
UPS	APC Back-UPS Pro 1500 VA class	~€350 (estimate, verify before buying)	manufacturer direct, verify on apc.com
External backup NAS or disk	4 TB external NVMe enclosure plus drive	~€450 (estimate, verify before buying)	Geizhals 4 TB NVMe range
Daily-driver desktop or laptop	existing or used Linux ThinkPad	€0 to €800	secondhand market
Network gear	managed switch, decent router	~€300 (estimate, verify before buying)	Geizhals network category
Accessories	KVM, cables, rack shelf	~€200	various
Path A total (with €800 secondhand driver)		€6,869	under budget
Path A total (with €1,500 new desktop)		€7,569	under budget

The Spark is the centerline at €4,769. The rest of the budget at this tier goes into the operational ecosystem around it: a UPS sized for the 30-minute graceful-shutdown procedure described in Power Failure Recovery: DGX Spark in 30 Minutes, an external backup for the 119B model weights (the math for which lives in Backing Up 119B Parameters Without Bankruptcy), and a daily-driver machine, because the Spark is a Linux server you operate over SSH rather than a desktop. The Spark is not the box you sit in front of for eight hours a day; that is the most-missed implication of the architecture.
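
The Path A totals can be re-checked when prices drift. The sketch below sums the table's line items in Python; the dictionary keys are shorthand labels of mine, and the UPS, backup, network and accessory figures are the estimates from the table, not verified prices.

```python
# Path A line items in euros, copied from the table above.
# The UPS, backup, network and accessory rows are estimates.
PATH_A_FIXED = {
    "DGX Spark Founders Edition": 4769.00,
    "UPS": 350.00,
    "External backup": 450.00,
    "Network gear": 300.00,
    "KVM, cables, rack shelf": 200.00,
}
BUDGET = 8000.00


def path_a_total(daily_driver_eur: float) -> float:
    """Sum the fixed items plus whatever the daily-driver machine costs."""
    return sum(PATH_A_FIXED.values()) + daily_driver_eur


for label, driver in [("secondhand ThinkPad", 800.0), ("new desktop", 1500.0)]:
    total = path_a_total(driver)
    print(f"{label}: total €{total:,.0f}, headroom €{BUDGET - total:,.0f}")
```

Swapping in current prices for any row shows immediately how much headroom is left under the €8,000 line.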

Path B: RTX 5090 32 GB workstation

Component	Pick	Price	Source
GPU	Zotac GeForce RTX 5090 32 GB	€3,469.00	geizhals.eu Zotac RTX 5090
CPU	AMD Ryzen Threadripper PRO 7955WX 16C/32T	€1,633.47	geizhals.eu Threadripper PRO 7955WX
Mainboard	ASRock WRX90 WS EVO	€807.90	geizhals.de ASRock WRX90 WS EVO
RAM	4× Crucial Pro 64 GB DDR5-5600 (256 GB total, non-ECC)	€2,520.80	geizhals.de Crucial Pro 64GB Kit
NVMe	Samsung 9100 PRO 4 TB PCIe Gen5	€619.00	geizhals.de Samsung 9100 PRO 4TB
PSU	be quiet! Pure Power 12 M 850 W (or step to 1000 W)	€151.67 to €200	geizhals.de Pure Power 12 M 850W
Case	Fractal Design Meshify 2 XL	€183.03	geizhals.de Meshify 2 XL
Path B subtotal		€9,384.87	over budget

Path B is over the €8k envelope by roughly €1,385 if you use ECC-capable Threadripper Pro components. To land near €8k, downgrade to a non-Pro Threadripper 7000 series (saves roughly €700, loses ECC and 8-channel RAM) or step the RAM to 128 GB (saves €1,260). I recommend the 128 GB RAM option because the 5090’s 32 GB VRAM is the binding constraint anyway; system RAM beyond 128 GB is not the limiting factor for the model classes this build runs. With 128 GB RAM the total lands at €8,124, about €124 over the line and close enough to call on budget.
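
The same arithmetic for Path B, as a sketch. It uses the low end of the PSU price range and assumes, as the paragraph above does, that 128 GB of RAM costs half of the 256 GB configuration; the key names are my own shorthand.

```python
# Path B line items in euros, copied from the table above.
PATH_B = {
    "GPU": 3469.00,
    "CPU": 1633.47,
    "Mainboard": 807.90,
    "RAM 256 GB": 2520.80,
    "NVMe": 619.00,
    "PSU 850 W": 151.67,  # low end of the quoted range
    "Case": 183.03,
}
BUDGET = 8000.00

subtotal = round(sum(PATH_B.values()), 2)
# Assumption: halving the RAM halves its price.
ram_saving = round(PATH_B["RAM 256 GB"] / 2, 2)
with_128gb = round(subtotal - ram_saving, 2)

print(f"Subtotal: €{subtotal:,.2f} (over budget by €{subtotal - BUDGET:,.2f})")
print(f"With 128 GB RAM: €{with_128gb:,.2f} (over budget by €{with_128gb - BUDGET:,.2f})")
```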

Prices captured 2026-05-22 from Geizhals.de and Geizhals.eu. They will drift. Re-verify before you buy.

Why the Spark path, from direct experience

I have written the long-form version of this argument in Should You Buy a DGX Spark in 2026?; short version below for the buyers at the €8k tier specifically.

The Spark wins for me because my workload is MoE language models in the 100B+ parameter range, served via vLLM and SGLang to a small consulting practice and a public MCP server. Specifically, I run Qwen 3.6 PrismaQuant at 4.75 bit and measure ~57-62 tokens per second sustained interactive throughput (with DFlash speculative decoding, verified 2026-05-22), documented in Spark Arena Rank 4 Made Me Add Qwen3.6. The 119B-parameter total / 17B-active model fits in 128 GB of unified memory with comfortable context headroom. A 5090’s 32 GB does not fit the same model class without aggressive quantization that hurts output quality.
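
A back-of-envelope check of the fit claim, as a sketch. It assumes the 4.75-bit figure applies as an average across all 119B parameters, counts weights only (no KV cache, activations or OS overhead), and uses 1 GB = 1e9 bytes; the function name is mine.

```python
def weight_gb(total_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes)."""
    return total_params_billion * bits_per_weight / 8


weights = weight_gb(119, 4.75)

for name, capacity_gb in [("DGX Spark unified memory", 128), ("RTX 5090 VRAM", 32)]:
    headroom = capacity_gb - weights
    verdict = "fits" if headroom > 0 else "does not fit"
    print(f"{name}: weights {weights:.1f} GB, headroom {headroom:+.1f} GB ({verdict})")
```

On these assumptions the weights alone come to roughly 71 GB, which leaves room for context on 128 GB and is more than double what a 32 GB card holds.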

The Spark also gives you the production-inference stack on the architecture NVIDIA’s vLLM and SGLang teams target first. New model releases on Hugging Face are working endpoints on the Spark within days. The same model on Apple Silicon waits for MLX to catch up, often weeks. For the comparison at the same price point against a Mac Studio, see DGX Spark vs Mac Studio for Local LLMs. For the operational quirks that decide first-month usability (the VLLM_FLASHINFER_MOE_BACKEND=latency flag and the drop_caches=3 discipline), see Five DGX Spark Disasters I Survived.

The Spark loses on three specifics. It is acoustically louder than a Mac Studio under sustained load. It is not a daily-driver workstation; you operate it over SSH. Its resale market in 2026 is genuinely unknown because the platform is too new to have a depreciation floor. If any of those three are binding constraints, Path B (5090 workstation) is the architecture-correct answer.

Why the 5090 path, from spec-sheet analysis

I have not tested a 5090. I would treat the numbers below as the spec-sheet expectation, not the bench result.

The 5090 is the right card for buyers whose workload is dense models that fit in 32 GB plus generative-image and generative-video pipelines plus the NVFP4 quantization path. The 32 GB VRAM is enough for Llama 3.1 70B at Q3 or aggressive Q4, Mistral Large dense at Q4, and most of the dense-model space below 100B. The card’s headline FLOPS and memory bandwidth advantage over the 4090 matters most on the FLOPS-bound diffusion workloads where the Spark is weakest. The Blackwell architecture also unlocks the NVFP4 format described in NVFP4 Quantization Explained; on the Spark that path is GB10-native, on a 5090 desktop it is the consumer-tier equivalent.

The 5090 loses on the workload that matters most to me: 100B+ MoE models do not fit in 32 GB without spilling into system RAM, and the per-token cost of that spill is high enough that you do not want to live there as a daily pattern. Buyers whose roadmap is dense will be fine. Buyers whose roadmap is MoE will be frustrated within three months. Be honest about the roadmap before you buy.

Side-by-side at €8k

Dimension	Spark path	5090 path
VRAM (or equivalent)	128 GB unified	32 GB GDDR7
Memory bandwidth	~273 GB/s	~1.8 TB/s
Best workload	MoE 100B+ language	dense ≤ 70B, image/video
119B MoE fit	native, ~57-62 tok/s receipt	spills system RAM
Image/video generation	weak	strong
Daily-driver desktop	no (Linux server)	yes
Quietness under load	moderate ramp	depends on case
Production inference stack	vLLM, SGLang, TensorRT-LLM first-class	vLLM, SGLang first-class
NVFP4 quantization	yes (GB10)	yes (Blackwell consumer)
Resale (24 months)	unknown	strong
Power draw under load	~150 to 200 W	up to 600 W (GPU alone)

The “memory bandwidth” row is the most-misread cell in this comparison. The 5090’s bandwidth is roughly 6.6× the Spark’s (~1.8 TB/s against ~273 GB/s). On dense models, where every weight is read for every token, that bandwidth wins. On MoE models with sparse expert activation, only the active experts are read per token, so the per-token movement is small enough that the Spark’s unified architecture, which holds the whole model in memory, wins. Whether you are dense or MoE is the decision; the rest of the table follows from it.
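
To put rough numbers on the dense-versus-MoE point, the sketch below compares the spec-sheet bandwidths and estimates how many gigabytes of weights are touched per decoded token. It assumes every active parameter is read once per token, treats the 17B-active figure as belonging to the 4.75-bit model discussed earlier, and picks a flat 4 bits for a dense 70B model purely for illustration.

```python
# Spec-sheet bandwidth figures from the comparison table, in GB/s.
SPARK_BW_GBS = 273.0
RTX_5090_BW_GBS = 1800.0


def gb_read_per_token(active_params_billion: float, bits_per_weight: float) -> float:
    """Rough weight bytes touched per decoded token, in GB.

    Assumes every active parameter is read once per token and ignores
    KV-cache traffic, batching and speculative decoding.
    """
    return active_params_billion * bits_per_weight / 8


bandwidth_ratio = RTX_5090_BW_GBS / SPARK_BW_GBS
moe_gb = gb_read_per_token(17, 4.75)   # 17B active parameters
dense_gb = gb_read_per_token(70, 4.0)  # assumption: dense 70B at a flat 4 bits

print(f"5090 / Spark bandwidth: {bandwidth_ratio:.1f}x")
print(f"MoE, 17B active at 4.75 bit: {moe_gb:.1f} GB per token")
print(f"Dense 70B at 4 bit: {dense_gb:.1f} GB per token")
```

This deliberately stops short of a tokens-per-second figure: batching, speculative decoding and kernel efficiency all move that number, so the measured receipts remain the reference.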

What this runs, what it does not

Spark path runs well: Qwen 3.6 PrismaQuant at 4.75 bit (~57-62 tok/s with DFlash, measured 2026-05-22), Mistral 24B safer config at ~29 tok/s decode (measured, see Mistral Safer Launch notes), GLM-class MoE at usable interactive throughput, dense 70B at moderate throughput (10 to 20 tok/s range, bandwidth-bound). Does not run well: image generation at production resolution, video diffusion, anything that wants Blackwell consumer-card RT cores.

5090 path runs well: dense models that fit in 32 GB at Q4 or better, NVFP4 quantization on the supported model family, image generation at production resolution including Flux Dev and SDXL pipelines, real-time video generation experiments. Does not run well: 119B MoE at full quality, sustained inference serving without aggressive thermal management of the GPU, anything that wants the unified-memory architecture for routing-table residence.

Monthly power cost, three jurisdictions

The Spark idles around 40 W and pulls 150 W to 200 W under sustained inference. A realistic mixed-use profile averages around 100 W, which is 73 kWh per month. The 5090 path idles at 30 to 50 W and can pull 500 W or more under load; a mixed-use average lands around 220 W (160 kWh per month).

Jurisdiction	€/kWh	Spark (73 kWh)	5090 (160 kWh)
Germany	€0.34	€25	€54
United States (national avg)	€0.16	€12	€26
India	€0.07	€5	€11
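
The table can be regenerated for any tariff. A minimal sketch, assuming a 730-hour month (8,760 hours per year divided by 12) and the rounded kWh figures from the table header; the variable names are mine.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12


def monthly_kwh(avg_watts: float) -> float:
    """Convert an average draw in watts to kWh per month."""
    return avg_watts * HOURS_PER_MONTH / 1000


# kWh figures as used in the table header (220 W is 160.6 kWh, rounded to 160).
KWH = {"Spark": 73, "5090": 160}
RATES_EUR_PER_KWH = {"Germany": 0.34, "United States": 0.16, "India": 0.07}

costs = {
    (place, build): round(rate * kwh)
    for place, rate in RATES_EUR_PER_KWH.items()
    for build, kwh in KWH.items()
}

print(f"100 W -> {monthly_kwh(100):.1f} kWh, 220 W -> {monthly_kwh(220):.1f} kWh")
for (place, build), eur in costs.items():
    print(f"{place:14s} {build:6s} €{eur}")
```

To project your own jurisdiction, add an entry to `RATES_EUR_PER_KWH` with your tariff in euros per kWh.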

Hardware amortization over three years is €191 per month (Spark path at ~€6,869) to €226 per month (5090 path at €8,124). Power adds €5 to €54 per month. Total cost of operation is roughly €196 to €280 per month, which puts the break-even versus cloud-API solidly above the 1,000-calls-per-day threshold the cost-comparison article lays out. At this tier you should not be self-hosting unless you have either a real privacy requirement or sustained sub-€0.0003-per-token economics.
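
The monthly range follows from straight-line amortization over 36 months plus the power figures from the table. A sketch of that sum, pairing the cheapest case (Spark path on the India tariff) with the most expensive (5090 path in Germany); resale value and financing costs are not modeled.

```python
AMORTIZATION_MONTHS = 36  # three years, straight line, no resale value


def monthly_amortization(hardware_eur: float) -> float:
    """Spread the hardware price evenly over the amortization period."""
    return hardware_eur / AMORTIZATION_MONTHS


spark_hw = monthly_amortization(6869)
rtx_hw = monthly_amortization(8124)

# Cheapest case: Spark path on the India tariff (€5 per month of power).
low = spark_hw + 5
# Most expensive case: 5090 path on the German tariff (€54 per month of power).
high = rtx_hw + 54

print(f"Hardware: €{spark_hw:.0f} to €{rtx_hw:.0f} per month")
print(f"Total:    €{low:.0f} to €{high:.0f} per month")
```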

Compare to the other tiers

Below this tier, the €4k mid-tier build gives you a 4090 or used A6000 path for workloads that fit in 24 to 48 GB. The €2k beginner build is the entry point for buyers who have not yet measured their workload. Above this tier, the €15k pro-studio build is the floor for two-card parallel jobs and serious fine-tuning at the consultancy-firm scale.

The case for the €8k tier specifically is: you are running production inference for a customer who pays you for the sovereignty, you are at or above 1,000 calls per day sustained, and your workload roadmap is either MoE language (Spark) or dense plus image (5090). If any of those three statements is false, the €4k tier is probably the architecture-correct answer for less money.

If I had it to do again

I bought the Spark in early April 2026 and the operational learning curve was steeper than the marketing implied. The two specific quirks that bit me first were the VLLM_FLASHINFER_MOE_BACKEND=latency requirement (the throughput backend froze my SM121A desktop until I switched) and the page-cache hijack on model swaps (the kernel keeps stale weights around and the next launch OOMs at 95 GB). Both are documented now in Five DGX Spark Disasters I Survived and the Power Failure Recovery procedure. If I were doing it again I would read those two articles before unboxing the Spark, and I would set up the systemd-managed launch path on day one rather than discovering the need for it after the third crash.
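
A day-one launch path could start as small as the sketch below. It is an illustration under assumptions, not my production unit: the helper names and the model identifier are hypothetical, only the environment variable and the drop_caches value come from the text above, and writing to the real /proc file needs root.

```python
import os
from pathlib import Path

DROP_CACHES = Path("/proc/sys/vm/drop_caches")


def drop_page_cache(control_file: Path = DROP_CACHES) -> None:
    """Flush dirty pages, then ask the kernel to drop page cache, dentries and inodes."""
    os.sync()
    control_file.write_text("3\n")  # needs root when pointed at the real /proc file


def build_launch(model: str, base_env: dict[str, str]) -> tuple[list[str], dict[str, str]]:
    """Return the argv and environment for a vLLM server with the latency MoE backend."""
    env = dict(base_env)  # copy so the caller's environment is left untouched
    env["VLLM_FLASHINFER_MOE_BACKEND"] = "latency"
    return ["vllm", "serve", model], env


# Example wiring (hypothetical model id), e.g. from a systemd-managed script:
#   drop_page_cache()
#   argv, env = build_launch("your-org/your-model", dict(os.environ))
#   subprocess.run(argv, env=env, check=True)
```

Keeping the cache drop and the environment in one place means a model swap always goes through the same two steps instead of relying on memory after a crash.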

The other discipline I would impose is to measure my actual call volume for a month before justifying the €8k outlay. The self-hosted-vs-cloud cost model makes the break-even visible. Below 1,000 calls per day, you are paying for sovereignty rather than for unit economics. That is a defensible reason to buy, but you should be making it consciously rather than because the spec sheet looked appealing.
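
Once you have a month of measured call volume, the break-even is a one-line division. The sketch below is not the cost model from the linked article; it assumes a 30-day month and takes the cloud price per call as an input you supply, and both function names are mine.

```python
DAYS_PER_MONTH = 30  # assumption: flat 30-day month


def break_even_calls_per_day(monthly_cost_eur: float, cloud_eur_per_call: float) -> float:
    """Daily call volume at which self-hosting costs the same as the cloud API."""
    return monthly_cost_eur / (cloud_eur_per_call * DAYS_PER_MONTH)


def implied_price_per_call(monthly_cost_eur: float, calls_per_day: float) -> float:
    """Cloud price per call at which the given volume exactly covers the monthly cost."""
    return monthly_cost_eur / (calls_per_day * DAYS_PER_MONTH)


# Monthly cost range from the power and amortization section above.
for monthly_cost in (196, 280):
    price = implied_price_per_call(monthly_cost, 1000)
    print(f"€{monthly_cost}/month at 1,000 calls/day -> €{price:.4f} per call")
```

If your actual cloud bill per call sits well below the printed figure, you are in the paying-for-sovereignty case described above.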

For the strategic framing of why an operator at this tier is buying the Spark at all, see The Quiet Pattern Among Sovereign Engineers. The pattern is repeatable and the financial case is honest.

Book a Stack Audit

If you want a second pair of eyes on whether the Spark path or the 5090 path matches your actual workload, the Stack Audit is two hours, fixed-fee, ends with a configuration recommendation and a power-cost projection for your jurisdiction. About a third of audits at this tier end with “buy the €4k build instead, you do not need this.” The honesty is the product.

Contact via the footer (Nostr or email). Or read the €15k version next if you are sizing for parallel jobs and real fine-tuning.
