What a DGX Spark actually draws from the wall, what that costs in Germany versus the US, how it compares to a lightbulb and a Bitcoin miner, and how many solar panels would offset it. With sources.

How Much Electricity Does Self-Hosted AI Actually Use? Lightbulbs, Bitcoin Miners, and Solar Panels

New to self-hosting AI? The Self-Hosted AI: Start Here hub walks the hardware-decision tree, inference-engine choice, and the operational gotchas that bite hardest in the first three months. Read it before or after this one, whichever fits your stage.

There is a recurring question from people who hear about self-hosted AI for the first time: “doesn’t running an AI model 24/7 burn through electricity?” The answer is more interesting than yes-or-no. The same machine costs about $5 a month in some US states and €25 a month in Germany, draws less than an old-style 60-watt bulb when idle, and would need fewer solar panels to offset than most people guess. Less than a Bitcoin miner by a factor of 22. More than a Raspberry Pi by a factor of 30. Worth understanding before deciding whether the stack makes sense for you.

This post is the noob-friendly excursion into the electricity side of self-hosted AI. Real measurements, real prices, real solar math.

What a DGX Spark actually draws

NVIDIA’s published numbers for the DGX Spark and Tom’s Hardware’s review measurements:

| State | Wall draw | What it means |
| --- | --- | --- |
| Idle, headless | ~22 W | After the post-launch software update, the machine sits quietly. Less than a single old incandescent lightbulb (60 W). |
| Idle with 4K display | ~25-35 W | A connected monitor adds 3-13 W depending on resolution and refresh rate. |
| Active inference | ~160 W | When the GPU is actually working through tokens. A Mistral Small 4 generation request hits this range while it’s running. |
| Peak system (PSU rating) | 240 W | The rated maximum. Inference workloads on this machine don’t sustain peak; they spike to ~160 W during generation, then drop back. |

The interesting number for monthly cost is not the peak. It’s the average over a month, which depends entirely on how often the GPU is actually working versus sitting idle. That’s what the duty-cycle math below sorts out.
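That duty-cycle arithmetic fits in a few lines. A minimal sketch, using the ~160 W active and ~25 W idle figures from the table above and ~720 hours per month:

```python
def monthly_kwh(duty_cycle, active_w=160, idle_w=25, hours=720):
    """Average monthly energy for a machine that runs inference a
    fraction of the time (duty_cycle) and idles the rest."""
    avg_w = duty_cycle * active_w + (1 - duty_cycle) * idle_w
    return avg_w * hours / 1000  # Wh -> kWh

print(round(monthly_kwh(0.10)))  # hobbyist:      ~28 kWh/month
print(round(monthly_kwh(0.50)))  # daily driver:  ~67 kWh/month
print(round(monthly_kwh(1.00)))  # production:   ~115 kWh/month
```

These are the consumption figures the scenarios and cost tables below are built on.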

How that compares to things you already know

The most relatable comparisons, all rounded:

  - Raspberry Pi: ~5 W
  - DGX Spark at idle: ~22-25 W
  - Old incandescent lightbulb: 60 W
  - Refrigerator while its compressor runs: ~100-200 W
  - DGX Spark under inference load: ~160 W
  - Desktop gaming PC under load: ~300-500 W
  - Bitmain Antminer S21: 3,500 W

The mental model that helps: a self-hosted AI box at idle is one LED ceiling light. Under load, it’s a fridge running its compressor. Neither of those is a scary number.

Three duty-cycle scenarios

The honest cost number depends on how often you’re actually inferring versus the machine sitting at idle. Three reference scenarios:

Scenario 1: Hobbyist (~10% duty cycle)

You run a few queries a day, maybe a longer agent session in the evening. The machine sits idle most of the time but stays on for instant access.

Scenario 2: Daily driver (~50% duty cycle)

You use the machine for actual work most of your waking hours: agent loops running, document analysis, code generation across multiple sessions.

Scenario 3: Production agent fleet (100% duty cycle)

The machine is hosting MCP tools or an agent backend that gets called continuously. Inference is essentially always running.

For most readers, scenario 1 or 2 is the realistic one. Scenario 3 only makes sense once the machine is monetized through MCP calls or an agent-as-a-service offering, which is a whole other discussion.

What that costs in Germany vs the US

German residential electricity in 2026 averages 32-37 ct/kWh across new and existing contracts, per BDEW and Verivox data. US residential averages around 17.65 ¢/kWh in 2026 per EIA, but with massive regional spread: North Dakota at 11.64 ¢, Hawaii at 43 ¢, Massachusetts at 31.51 ¢ (state-by-state ranking).

| Scenario | DE (35 ct/kWh) | US average (17.65 ¢) | US cheapest (ND, 11.64 ¢) | US most expensive (HI, 43 ¢) |
| --- | --- | --- | --- | --- |
| Hobbyist (28 kWh/mo) | €9.80 | $4.94 | $3.26 | $12.04 |
| Daily driver (67 kWh/mo) | €23.45 | $11.83 | $7.80 | $28.81 |
| Production (115 kWh/mo) | €40.25 | $20.30 | $13.39 | $49.45 |

Same machine, same Mistral Small 4 deployment, same engineering choices. Cost varies by roughly 4x between the cheapest and most expensive geography. That is the geography multiplier nobody tells you about when they pitch self-hosted AI as economically obvious.
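The whole cost table reduces to kWh times tariff. A quick sketch, with the tariffs quoted above:

```python
# Tariffs per kWh, as quoted in the text (EUR for Germany, USD for the US).
TARIFFS = {
    "DE":          0.35,
    "US average":  0.1765,
    "US cheapest": 0.1164,  # North Dakota
    "US priciest": 0.43,    # Hawaii
}

def monthly_cost(kwh, tariff):
    """Monthly bill for a given consumption at a flat tariff."""
    return round(kwh * tariff, 2)

for region, tariff in TARIFFS.items():
    print(region, monthly_cost(67, tariff))  # daily-driver scenario, 67 kWh/mo

# The spread between the priciest and cheapest tariffs:
print(round(TARIFFS["US priciest"] / TARIFFS["US cheapest"], 1))  # ~3.7x
```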

How this compares to Bitcoin mining

For readers coming from a Bitcoin self-custody background, the comparison that lands hardest: a single Bitmain Antminer S21 draws 3,500 W continuously. That is roughly 22 DGX Sparks running at full inference simultaneously, or 140 DGX Sparks at idle. The Antminer’s monthly draw at 100% duty cycle is ~2,520 kWh, versus the DGX Spark’s 115 kWh at the same duty cycle.

In German residential terms, one Antminer S21 costs around €882 per month in electricity. In the cheapest US states, around $293/month. Bitcoin mining at home stopped making economic sense for most people years ago because the electricity dwarfs everything else. Self-hosted AI on the kind of hardware this blog discusses is not in that league. The DGX Spark draw is closer to a desktop gaming PC than to mining hardware.
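The miner comparison is the same arithmetic at a continuous 100% duty cycle:

```python
def monthly_kwh_continuous(watts, hours=720):
    """kWh per month for a device drawing a constant wattage."""
    return watts * hours / 1000

antminer = monthly_kwh_continuous(3500)  # Antminer S21 at the wall
spark    = monthly_kwh_continuous(160)   # DGX Spark at full inference

print(round(antminer))          # ~2,520 kWh/month
print(round(antminer * 0.35))   # ~EUR 882/month at German residential prices
print(round(antminer / spark))  # ~22x the Spark's full-load consumption
```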

That mental reframe is worth keeping: when someone says “self-hosted AI burns through electricity”, they are usually thinking of mining-scale loads. The actual number is much smaller, and on a duty-cycled stack it is comfortably household-appliance scale, not industrial scale.

How many solar panels would it take?

This is the question that surprised me when I worked it out. A modern 400 W rooftop solar panel produces roughly:

  - Germany: ~400-460 kWh per year
  - US average: ~550-700 kWh per year
  - US Southwest: ~700-750 kWh per year

For each duty-cycle scenario, panels needed to fully offset (annual basis, ignoring battery storage and grid feed-in):

| Scenario | Annual kWh | Panels needed in DE | Panels needed in US average | Panels in US Southwest |
| --- | --- | --- | --- | --- |
| Hobbyist | ~336 | 1 panel covers it | 1 panel covers it | 1 panel covers it |
| Daily driver | ~804 | 2 panels | 2 panels | 2 panels |
| Production 24/7 | ~1,380 | 3 panels | 2 panels | 2 panels |
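The panel counts above are ceiling division of annual consumption by per-panel yield. A sketch with assumed per-panel annual yields for a 400 W panel, chosen to be consistent with the table (real yields vary with site, tilt, and shading):

```python
import math

def panels_needed(annual_kwh, panel_yield_kwh):
    """Panels to fully offset annual consumption (annual basis,
    ignoring battery storage and feed-in timing)."""
    return math.ceil(annual_kwh / panel_yield_kwh)

# Assumed annual yields per 400 W panel (kWh/year), not measured values.
YIELDS = {"DE": 460, "US average": 700, "US Southwest": 750}

SCENARIOS = {"hobbyist": 336, "daily driver": 804, "production": 1380}
for scenario, kwh in SCENARIOS.items():
    print(scenario, {r: panels_needed(kwh, y) for r, y in YIELDS.items()})
```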

That is the punchline most people miss: one to three standard rooftop solar panels is the entire offset for a self-hosted AI machine, depending on usage and geography. Compare to a single Bitcoin miner needing roughly 65-75 panels in Germany to offset its ~30,000 kWh per year, or a typical US household needing 15-25 panels to offset total consumption. Self-hosted AI is solar-friendly in a way that mining never was.

The catch: solar is a daytime resource, while AI workloads can run 24/7. Without a battery, solar offsets the inference you do during the day and contributes grid feed-in for the rest. Adding battery storage to actually take the AI stack off-grid adds significant capital cost (roughly €5,000-15,000 for a meaningful home battery), and that math only makes sense in the context of a whole-home solar-plus-battery setup, not for the AI machine alone.

Best practices to keep the bill reasonable

A few decisions that reduce the monthly cost without affecting what you can do with the machine:

  1. Suspend or shut down between sessions. Going from “always on at idle” to “sleep when not in use” cuts 16-21 hours of idle draw per day. Wake-on-LAN works on the DGX Spark and brings the machine back in seconds. For a hobbyist scenario this can drop monthly consumption from 28 kWh to the low teens.
  2. Run headless. Not driving a display saves 3-13 W continuously. Use SSH or a remote desktop for sessions instead. The post-update DGX Spark idles at 22 W headless versus up to 35 W with a 4K panel attached.
  3. Batch inference where possible. A 10-minute burst at full GPU draw uses less total energy than 30 minutes of a half-utilized GPU, because the idle floor is paid either way. SGLang’s model load and warm-up are a one-time cost; once the model is in memory, batched requests are more energy-efficient than scattered single-request workloads.
  4. One service at a time on shared hardware. The SGLang setup post covers the operational rule that SGLang, Voxtral (speech-to-text), and ComfyUI cannot share GPU memory simultaneously. Running them sequentially instead of trying to load all of them at once is the right discipline regardless of energy concerns; it also keeps idle draw cleaner.
  5. Pick your geography honestly. If you are in the US Northeast or Hawaii, the local stack is closer to cloud-cost-equivalent than a national-average comparison would suggest. If you are in the US South Central or rural Midwest, self-hosting is meaningfully cheaper than cloud APIs at sustained load. In Germany the stack is more expensive in absolute terms but still wins on privacy and latency at any usage level.
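To put a number on tip 1, a back-of-envelope sketch of what suspending saves. The ~1 W suspend draw is an assumption for illustration, not a measured DGX Spark figure:

```python
def idle_savings_kwh(hours_asleep_per_day, idle_w=25, suspend_w=1, days=30):
    """Monthly kWh saved by suspending instead of idling for the
    given number of hours each day."""
    return (idle_w - suspend_w) * hours_asleep_per_day * days / 1000

# Sleeping 20 h/day instead of idling:
print(round(idle_savings_kwh(20), 1))  # 14.4 kWh/month saved
```

Against the hobbyist scenario's 28 kWh/month, that is roughly half the bill gone for free.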

What I actually run

For reference, my own DGX Spark sits roughly at scenario 2 (~50% duty cycle on average), running Mistral Small 4 on SGLang most of the workday with batched agent calls, and idling overnight rather than fully shutting down. At German prices that lands around €20-25/month in electricity, which is the cost of one cloud-API top-up that I no longer need. The privacy and latency benefits are the actual reason I run it locally; the cost calculation is the convenient justification, not the primary one.

The geography note matters here too. If I lived in Texas or Idaho, the same stack would cost me roughly half as much per month. If I lived in Hawaii, it would cost about the same as Germany. The hardware decision is not really separate from the geography decision once you start to look at sustained usage costs.

Where to next

If you want the broader operational stack this electricity discussion sits inside, the Self-Hosted AI: Start Here hub covers the hardware tree, the inference engine choice, and what hurts most in the first three months on this kind of stack.

For the actual SGLang setup and the duty-cycle discipline that keeps the GPU draw clean (one service at a time, sequential not parallel), the Mistral Small 4 with SGLang setup post is the operational follow-up.

For the broader strategic context on who actually buys this kind of hardware and what the agentic-economy pivot looks like that justifies running it 24/7, the agentic economy pivot post is the strategic backdrop where the electricity discussion originally lived in shorter form.