What I'd Buy in 2026 for €2,000: A Beginner Sovereign AI Build

May 24, 2026

Update (2026-06-19). The “Qwen 3.6 PrismaQuant” references here predate the 2026-06-11 production switch to AutoRound int4-mixed (69.2 tok/s, 12.7 percent better on the coding gate, vision retained, PrismaQuant retired). The figures are kept as the engineering-log record; the live stack is on /stack/ and the switch is measured in AutoRound int4 vs PrismaQuant.

Here is what I would buy at €2,000 today, knowing what I know in 2026-05. A used RTX 3090 with 24 GB of VRAM, a current AM5 board, a Ryzen 7 7700, 64 GB of DDR5, a 1 TB NVMe drive, an 850 W Gold power supply, and a mid-tower with a mesh front. The total lands at roughly €1,900 to €2,150 depending on how patient you are with the used card market on Kleinanzeigen.

This is the entry build for someone who wants a real local-inference box and is not yet sure whether the work justifies the price of a DGX Spark or a Mac Studio. It runs Llama 3.1 70B at Q4, Mistral Small Q5, Qwen 30B class models at usable interactive speed, and it is also a perfectly good desktop PC. I have not personally tested this exact combination, so I will treat the throughput numbers below as the spec-sheet expectation rather than the bench result. The component picks are conservative on purpose.

The build at a glance

Component	Pick	Price	Source
GPU	Used RTX 3090 24 GB	€600 to €850	kleinanzeigen.de RTX 3090 listings
CPU	AMD Ryzen 7 7700 (boxed)	€239.89	geizhals.de Ryzen 7 7700 boxed
Mainboard	ASUS TUF Gaming B650-Plus WIFI	€146.73	geizhals.de ASUS TUF B650-Plus
RAM	Crucial Pro 64 GB DDR5-5600 Kit	€630.20	geizhals.de Crucial Pro 64GB Kit
NVMe	Kingston NV3 1 TB PCIe 4.0	€132.90	geizhals.de Kingston NV3 1TB
PSU	MSI MAG A850GL 850 W ATX 3.1 Gold	€84.99	geizhals.de MSI MAG A850GL
Case	Fractal Design Meshify C Dark	€74.67	geizhals.de Meshify C Dark
Total (low end of GPU range)		€1,909.38
Total (high end of GPU range)		€2,159.38

Prices captured 2026-05-22 from Geizhals.de and Kleinanzeigen.de. They will drift. Re-verify before you buy.
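The totals are simple addition, so it is worth re-running them with your own prices when the listings drift. A minimal sketch using the line items from the table; the dictionary keys and function name are illustrative, not part of any tool.

```python
# Line items from the table above (EUR, captured 2026-05-22).
FIXED_PARTS = {
    "cpu": 239.89,
    "mainboard": 146.73,
    "ram": 630.20,
    "nvme": 132.90,
    "psu": 84.99,
    "case": 74.67,
}
GPU_RANGE = (600.00, 850.00)  # used RTX 3090, low and high end of the range


def build_total(gpu_price: float) -> float:
    """Sum the fixed parts plus whatever the used GPU ends up costing."""
    return round(sum(FIXED_PARTS.values()) + gpu_price, 2)


platform = round(sum(FIXED_PARTS.values()), 2)
low, high = (build_total(price) for price in GPU_RANGE)

print(f"platform without GPU: EUR {platform:.2f}")
print(f"build total: EUR {low:.2f} to EUR {high:.2f}")
```

Swap in the price of the listing you are looking at and `build_total` tells you where you land against the €2,000 target.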

The used GPU is the price-mover. Patient buyers find clean 3090s on Kleinanzeigen for €600 to €700; impatient buyers pay €800 to €850 and skip the train rides. Watch for cards that come with the original box and receipt; that is the cleanest case for the warranty conversation if a fan dies in month four.

Why each pick

Used RTX 3090, 24 GB. The 3090 is the architecture-correct beginner card for local inference in 2026 because nothing newer at this price band gives you 24 GB of usable VRAM. The 4070 Ti Super is faster on dense workloads but caps at 16 GB. The 4080 Super is faster still but also at 16 GB. A 70B model at Q4 quantization is roughly 40 GB of weights, so no single card at this price holds all of it; what matters is how much stays in VRAM, and 24 GB keeps more than half of the model on the GPU where a 16 GB card keeps well under half. The 3090 is the cheapest card that gets you there. Its weakness is power draw under load, which the 850 W PSU is sized for. It does not run NVFP4 quantization; the NVFP4 path requires Blackwell.
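A rough way to check the VRAM argument is to estimate the size of the quantized weights and see how much of that a given card can hold. The sketch below assumes about 4.5 bits per weight for a Q4 file and 2 GB of VRAM set aside for the KV cache and runtime. Both figures are assumptions for illustration, not measurements, and the function names are mine.

```python
def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def share_in_vram(weights: float, vram_gb: float, reserve_gb: float = 2.0) -> float:
    """Fraction of the weights that fits on the card.

    reserve_gb is an ASSUMED allowance for KV cache and runtime overhead.
    """
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(usable / weights, 1.0)


# ASSUMPTION: roughly 4.5 bits per weight for a Q4 quantization.
llama_70b_q4 = weights_gb(70, 4.5)

for vram in (16, 24):
    share = share_in_vram(llama_70b_q4, vram)
    print(f"{vram} GB card: {share:.0%} of {llama_70b_q4:.1f} GB of weights in VRAM")
```

Whatever does not fit on the card lands in system RAM, which is why the RAM line further down matters as much as it does.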

Ryzen 7 7700. Eight cores, 16 threads, AM5 socket, 65 W TDP, boxed cooler included. The 7700 is the value pick over the 7800X3D for an inference workstation because the 3D V-cache that makes the X3D faster in games does not help inference, where the GPU does all the heavy lifting. You save €100 to €240 and lose roughly zero tokens per second. Spend the saved money on RAM or a better GPU.

ASUS TUF Gaming B650-Plus WIFI. A B650 board at the €120 to €160 tier is the sweet spot for this build. You get a PCIe 4.0 x16 GPU slot, which is all a PCIe 4.0 card like the 3090 can use, two M.2 slots, decent VRMs for the 7700 plus future AM5 upgrade path, and integrated WiFi 6E. The TUF line has been one of the lower-RMA-rate budget boards over the last two years. Skip X670E for this tier; the cost delta does not buy anything you will use.

64 GB DDR5-5600. Local inference does not need extreme RAM speed because the GPU’s VRAM is the relevant bandwidth. What you do need is enough system RAM to hold the model’s CPU-offloaded layers, a generous Linux page cache, and your editor plus browser. 64 GB is the threshold below which you will hit swap during multi-model workflows; 32 GB will technically work but you will resent it within six weeks. The Crucial Pro kit is a CAS 46 part at €630, which is the cheapest 64 GB DDR5 currently listed on Geizhals from a major brand. RAM is one of the two most expensive lines on this build (at the low end of the GPU range it costs more than the card itself), and it is the line readers most often underspec.

Kingston NV3 1 TB PCIe 4.0. Models live on disk. A single Llama 3.1 70B Q4 GGUF is roughly 40 GB. A Qwen 30B Q5 is another 20 GB. The OS and the development environment claim the first 60 GB. You will fill a 1 TB drive faster than you expect, and the build supports adding a second drive when that day arrives. The NV3 is the value pick at €133; it is not the fastest 1 TB drive on the market, but the disk is not the inference bottleneck. The bottleneck is the VRAM bandwidth.
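The disk budget from the paragraph above, as arithmetic. This is a sketch with the sizes quoted in the text; it ignores filesystem overhead and the gap between a marketed 1 TB and the formatted capacity, so treat the free-space figure as optimistic.

```python
DRIVE_GB = 1000        # 1 TB NVMe in decimal gigabytes
OS_AND_DEV_GB = 60     # OS plus development environment, from the text
MODELS_GB = {
    "llama-3.1-70b-q4": 40,  # approximate GGUF size from the text
    "qwen-30b-q5": 20,       # approximate size from the text
}

used = OS_AND_DEV_GB + sum(MODELS_GB.values())
free = DRIVE_GB - used
more_70b_files = free // 40  # additional 40 GB files that would still fit

print(f"used: {used} GB, free: {free} GB")
print(f"room for about {more_70b_files} more 40 GB model files")
```

The number of extra files looks generous until you start keeping several quants of the same model side by side, which is how the drive fills faster than expected.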

MSI MAG A850GL, 850 W, 80+ Gold. A 3090 alone pulls up to 350 W under load. The 7700 plus board plus drives add roughly 120 W on top. The 850 W rating leaves enough headroom to survive a transient spike during a model load without the PSU dropping the system. ATX 3.1 means a native 12V-2x6 connector is there for a later GPU upgrade; most 3090s still take two or three 8-pin PCIe cables, so check that the unit’s 8-pin cable count matches the card you find. €85 is the floor for this class.
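The PSU sizing can be sanity-checked with the two wattage figures from the text. The transient multiplier below is an assumption chosen for illustration, not a measured value for the 3090, and `psu_margin` is a hypothetical helper.

```python
PSU_WATTS = 850
GPU_PEAK_WATTS = 350         # RTX 3090 under load, from the text
REST_OF_SYSTEM_WATTS = 120   # CPU, board, and drives, from the text
TRANSIENT_FACTOR = 1.5       # ASSUMPTION: illustrative short-spike multiplier on the GPU


def psu_margin(psu: float, gpu: float, rest: float, spike: float = 1.0) -> float:
    """Watts left on the PSU when the GPU draws gpu * spike."""
    return psu - (gpu * spike + rest)


sustained = psu_margin(PSU_WATTS, GPU_PEAK_WATTS, REST_OF_SYSTEM_WATTS)
transient = psu_margin(PSU_WATTS, GPU_PEAK_WATTS, REST_OF_SYSTEM_WATTS, TRANSIENT_FACTOR)

print(f"sustained load margin: {sustained:.0f} W")
print(f"margin during an assumed {TRANSIENT_FACTOR}x GPU spike: {transient:.0f} W")
```

A positive margin in both cases is the point of buying 850 W rather than 650 W; if you change the spike assumption, re-run the second number before trusting it.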

Fractal Design Meshify C Dark. Airflow is the entire reason this case exists. A 3090 dumps a lot of heat into a chassis and a mesh front lets the front fans actually pull air. The Meshify C fits the ATX motherboard, a triple-slot GPU, and three or four intake fans. The Dark variant is €4 cheaper than the standard model and aesthetically more honest about being an engineering tool rather than a showpiece.

What this runs, and what it does not

Runs well at interactive throughput: Llama 3.1 70B at Q4 quantization, Mistral Small 3.x at Q5, Qwen 3 30B-class models at Q6, GLM-class 32B at Q6, and most of the 7B to 13B model space at FP16 (the top of that range is the exception: 13B at FP16 is about 26 GB of weights, so it needs a lighter quant or some offload). For the model-class trade-offs at this size, see Mistral Small 4 vs Qwen 3.6 vs GLM 5 on DGX Spark; the relative rankings translate, the absolute throughput does not.

Does not run well: Qwen 3.6 PrismaQuant at the 119B-parameter MoE class (the active-parameter footprint plus the routing table do not fit cleanly in 24 GB), Mistral Large dense at any usable quant, or anything labelled 100B+ dense. The Spark is the right machine for those workloads. The 2k build is the right machine for the model classes that fit in 24 GB. Honesty about the ceiling is the whole point of this article.

Runs, but slowly: image generation. SDXL and Flux Schnell work. Flux Dev at full resolution will be slow because the model spills into shared memory. Diffusion is bandwidth-bound and the 3090’s GDDR6X is competitive for its price tier but well below current-gen workstation cards. If image generation is your main workload, this is not the right build; a dual-3090 NVLink setup at the €4k tier is a better fit.

Monthly power cost, three jurisdictions

The 3090 inference-idle draw is roughly 50 W. Under continuous inference load it pulls 280 to 330 W. A realistic mixed-use profile (eight hours of active inference per day, sixteen hours of light idle) averages around 130 W to 160 W. I will use 150 W average as the centerline, which works out to 109 kWh per month.
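The duty-cycle arithmetic behind that centerline, as a sketch. The GPU-only averages come out a little below the 130 W to 160 W range in the text, presumably because the rest of the system also draws power; that reading is my assumption. The month length is taken as 8,760 hours divided by 12.

```python
HOURS_PER_MONTH = 8760 / 12  # 730 hours in an average month


def average_watts(active_w: float, idle_w: float, active_hours: float) -> float:
    """Time-weighted average draw over a 24-hour day."""
    idle_hours = 24 - active_hours
    return (active_w * active_hours + idle_w * idle_hours) / 24


def monthly_kwh(avg_watts: float) -> float:
    """Energy used in an average month at a constant average draw."""
    return avg_watts * HOURS_PER_MONTH / 1000


gpu_low = average_watts(280, 50, 8)   # GPU only, low end of the load range
gpu_high = average_watts(330, 50, 8)  # GPU only, high end of the load range

print(f"GPU-only average: {gpu_low:.0f} W to {gpu_high:.0f} W")
print(f"150 W centerline: {monthly_kwh(150):.1f} kWh per month")
```

If your own profile is closer to two hours of inference a day than eight, change `active_hours` and the monthly figure drops accordingly.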

Jurisdiction	€/kWh	Monthly cost at 109 kWh
Germany (household tariff)	€0.34	€37
United States (national avg)	€0.16 (≈$0.18)	€17
India (residential avg)	€0.07 (≈₹6.50)	€8

Germany numbers reference the Statista 2026 household composition and Verivox January-2026 new-customer averages. US numbers reference the EIA Electric Power Monthly (May 2026 national residential average). India numbers reference Desi Utility’s 2026 tariff comparison and vary hugely by state slab. Re-verify your own rate; the spread between Bavarian and Brandenburg tariffs alone is wider than the spread between two model quants.
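To redo the table with your own tariff, multiply the rate by the monthly energy figure. A minimal sketch using the three rates above; the dictionary and function name are illustrative.

```python
MONTHLY_KWH = 109  # from the 150 W centerline above

# EUR per kWh, as listed in the table.
RATES_EUR_PER_KWH = {
    "Germany": 0.34,
    "United States": 0.16,
    "India": 0.07,
}


def monthly_cost(rate_eur_per_kwh: float, kwh: float = MONTHLY_KWH) -> float:
    """Monthly electricity cost in EUR for a given tariff."""
    return rate_eur_per_kwh * kwh


costs = {place: round(monthly_cost(rate)) for place, rate in RATES_EUR_PER_KWH.items()}

for place, cost in costs.items():
    print(f"{place}: EUR {cost} per month")
```

Add your own rate as a fourth entry rather than trusting the national average.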

The amortized hardware cost over three years is €53 to €60 per month at the build-total range. Power adds another €8 to €37. Total cost of operation is roughly €60 to €100 per month, dramatically below cloud-API pricing for the workloads that fit in 24 GB. The full cost model lives in Self-Hosted AI vs Cloud APIs: The Real Total Cost. At this scale the break-even versus a metered API such as Anthropic’s Claude is closer to a few hundred calls per day than to the thousand-per-day threshold the Spark requires.
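The amortization is straight-line over 36 months with no resale value, which is the simplest possible assumption and lands within about a euro of the figures in the text. A sketch with the build totals and power range from this article; the names are illustrative.

```python
MONTHS = 36                          # three-year amortization
BUILD_TOTALS = (1909.38, 2159.38)    # low and high build totals, EUR
POWER_PER_MONTH = (8, 37)            # cheapest and most expensive tariff, EUR


def amortized(total_eur: float, months: int = MONTHS) -> float:
    """Straight-line monthly hardware cost, ASSUMING zero resale value."""
    return total_eur / months


hw_low, hw_high = (amortized(total) for total in BUILD_TOTALS)
tco_low = hw_low + POWER_PER_MONTH[0]
tco_high = hw_high + POWER_PER_MONTH[1]

print(f"hardware: EUR {hw_low:.2f} to EUR {hw_high:.2f} per month")
print(f"hardware plus power: EUR {tco_low:.2f} to EUR {tco_high:.2f} per month")
```

A used 3090 will likely still sell for something after three years, so the real monthly cost is probably lower than this sketch says.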

Compare to the other tiers

The €2k build is the entry point. If your work needs more than 24 GB of VRAM in one card or you want a second card for tensor parallelism, jump to the €4k mid-tier build. If you are running MoE language models in the 100B+ class as the daily workload, jump to the €8k premium build. If you are starting a one-person consulting practice and need parallel jobs plus real fine-tuning headroom, the €15k pro-studio build is the floor for that workload class.

There is also a case for buying nothing and renting cloud GPU time for six months while you measure your actual workload. That case is real and I make it to about a third of the prospective buyers who write to me. The math is in the cost-comparison article and the buyer-profile filter is in the Spark decision tree.

If I had it to do again

The two regrets in this class of build are buying the GPU first, when the GPU is the part with the most stable resale market, and underspecing the RAM. If I were assembling this fresh in 2026-05, I would buy the platform first (case, PSU, board, CPU, RAM, NVMe), spend a month running the system as a regular workstation with integrated graphics or whatever GPU I have lying around, and only then chase the used 3090 listings with a clean baseline to compare against. The 3090 market is a patient buyer’s market in 2026; the platform parts are the urgent ones.

The other discipline I would impose is a written log of “what models did I actually run this week” for the first eight weeks. About half of the readers who write to me about this tier of build discover that their real workload is two or three specific models at one quantization, and the build choices collapse to that workload. The decision tree is shorter than the marketing implies.
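The log does not need tooling; a text file works. If you would rather tally it, here is a small sketch. The log format (date, model, quantization) and the sample entries are hypothetical, invented only to show the shape of the exercise.

```python
from collections import Counter

# HYPOTHETICAL entries: (ISO date, model, quantization), one per working session.
LOG = [
    ("2026-06-01", "llama-3.1-70b", "Q4"),
    ("2026-06-01", "mistral-small", "Q5"),
    ("2026-06-02", "llama-3.1-70b", "Q4"),
    ("2026-06-03", "qwen-30b", "Q6"),
    ("2026-06-04", "llama-3.1-70b", "Q4"),
    ("2026-06-05", "mistral-small", "Q5"),
]


def tally(entries):
    """Count sessions per (model, quantization) pair."""
    return Counter((model, quant) for _date, model, quant in entries)


def top_share(entries, n: int = 3) -> float:
    """Fraction of all sessions covered by the n most-used pairs."""
    counts = tally(entries)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(count for _pair, count in counts.most_common(n)) / total


for (model, quant), count in tally(LOG).most_common():
    print(f"{model} {quant}: {count} sessions")
print(f"top 2 pairs cover {top_share(LOG, 2):.0%} of sessions")
```

If two or three pairs cover nearly everything after eight weeks, size the build for those and stop worrying about the rest.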

Stack Audit

If you want a second pair of eyes on whether this build matches your actual workload, the Stack Audit is two hours, fixed-fee, and ends with either a build recommendation or a “buy nothing yet, rent cloud for six months, here is what to measure” verdict. About a third of the audits at this tier end with the rent-first recommendation, which is the honest answer for buyers who have not yet measured their workload.

Contact via the footer. Or read the €4k version next if your workload is already past the 24 GB ceiling.
