#strategy | Sovereign AI Blog

Jul 21, 2026

Nostr Scheduling: Homemade vs. nostr-emanator, A Comparison

Emanator is a full Buffer clone for Nostr with Rails, database, and background jobs. Our Python script is 492 lines, zero dependencies. Continuation of the Anti-Slop article: what we borrow, what we don't, and why Sovereign AI often means less code.

Jul 4, 2026

qwen, opencode, benchmarking

I Let Qwen3.6 Build a Full-Stack App. It Worked. I Wasn't Satisfied.

Qwen3.6-35B built a working SvelteKit 5 + D3.js app from a single prompt. The build passes. The features ship. But the architecture is improvised, the code is duplicated, and it took eight debugging iterations to get here. Here's what the gap between 'works' and 'good' actually costs.

Jun 26, 2026

lightning, agents

Frontier AI on Bitcoin: ppq.ai as the No-KYC Cloud Fallback for a Sovereign Stack (2026)

A self-hosted stack still hits two or three tasks where a frontier model wins. Buying that access from Anthropic means a KYC account and a card. ppq.ai is the other door: an OpenAI-compatible proxy to Claude, GPT and others, paid per query over Bitcoin Lightning, no account. Here is what it is good for, where it betrays the sovereign premise, and exactly how I wired it as the fallback behind local Qwen.

Jun 25, 2026

dgx-spark, benchmarking, vllm

Gemma-4-31B NVFP4 on a Single DGX Spark: When the Quantization Is the Bottleneck

Gemma-4-31B in NVIDIA's NVFP4 format fits a single DGX Spark and is a strong reasoner. But on Blackwell sm_121 the default FP4 kernel path is broken, and a dense 31B is bandwidth-bound at around 4 tok/s no matter what you do. I measured the baseline, the Marlin fix, and the honest conclusion: the real speedup is a model swap, not a flag.

Jun 25, 2026

dgx-spark, benchmarking, vllm, mcp

GLM-4.7-Flash on a Single DGX Spark: the Repo Says AWQ, the Model Says MLA

GLM-4.7-Flash is a 30B-A3B MoE coding model that fits a single 128GB DGX Spark with room to spare. Bringing it up on Blackwell sm_121 took two failures that every published recipe gets wrong: the 'AWQ' build is actually compressed-tensors, and the model speaks MLA, so flash_attn is illegal. Here is the working recipe, the single-stream decode number nobody reports, and what it does to my coding agent.

Jun 25, 2026

agents, opencode, dgx-spark

goose vs vibe vs opencode: Picking a Local Coding CLI for a Sovereign vLLM Stack (2026)

Three local, self-hostable coding-agent CLIs that drive your own vLLM models instead of a cloud API: opencode, goose, and vibe. I run opencode as primary and goose as backup on a DGX Spark, and I retired vibe. Here is the decision, with the licences, the maintenance reality, and the one config gotcha each, so you can choose for your own box.

Jun 21, 2026

engineering-honesty, authority

Building /learn: a reference layer, and the options I rejected

How the /learn glossary on sovgrid.org works, why it links itself, what makes each entry more than a definition, and the design calls I argued myself out of: tooltip versus link, /glossary versus /learn, and merging it into the book.

Jun 12, 2026

dgx-spark, comparison, benchmarking, vllm

I Built OpenAI's gpt-oss-120b on a Single DGX Spark. My 35B Qwen Out-Coded It.

gpt-oss-120b pulls nearly four million downloads a month, so I assumed it was a one-command experience. Getting it to serve on a DGX Spark took a frozen box, a 25GB image pull strangled by a Tor proxy, and a 43-minute kernel compile. Then the measurement: on my own coding tasks the 120B scored 56 percent where the 35B Qwen I already run scored 100. Here is the full teardown, with every number measured on the box and the failed measurements thrown out, not published.

Jun 11, 2026

dgx-spark, benchmarking, mcp, vllm

I Ran NVIDIA's 120B Nemotron on a Single DGX Spark. It Is Smart, Slow, and Surprisingly Good at One Job

NVIDIA's Nemotron-3-Super-120B-A12B is tuned for Blackwell and ships an NVFP4 build that fits a single 128GB DGX Spark. I measured it where almost nobody else does: single-stream, on one GB10. The result is 23.7 tok/s, a competent but painfully verbose coder, and a genuinely strong retrieval agent. Here is the full teardown, with the published benchmarks fact-checked against what the box actually did.

Jun 10, 2026

agents, opencode, benchmarking

Agent-bench: stop trusting install counts, start measuring your agent's tools

I built a small, dependency-free harness that answers one question with numbers instead of vibes: does this enhancement make my agent measurably better, on my models, on my tasks? Here is the method, what I found, and why deterministic gates are the whole point.

Jun 10, 2026

qwen, dgx-spark, benchmarking

Smaller, Faster, Still Smart? AutoRound int4 vs PrismaQuant for a Self-Hosted Coding Model

I run Qwen3.6-35B at 4.75-bit for coding. A 4.0-bit AutoRound build promised more speed. Fewer bits usually means a dumber model, so I measured both halves: decode throughput and coding quality, the latter through my own agent-bench harness. The result settled it. Here is the duel, the bandwidth math, and why the bit count was the wrong thing to fear.

Jun 10, 2026

agents, benchmarking

Caveman: does the 75% token-saving skill survive contact with a self-hosted model?

caveman has ~200k installs and claims 75% token reduction. I measured it on two local models and three Claude frontiers (Sonnet 4.6, Opus 4.8, Fable 5). The math does not work out the way the claim says it does.

Jun 10, 2026

mcp, agents, benchmarking

Does Serena help a self-hosted coding model? I benchmarked it

Serena is one of the most-installed coding MCP servers. I tested it against two local models (Qwen3.6-35b and Mistral-Small-4) on three refactor tasks with deterministic gates. The short answer is more interesting than yes or no.

Jun 10, 2026

devops

vps-healthcheck: Twelve Daily Checks, One SSH Session, One Notification

Monitoring one VPS with a Prometheus stack is like hiring a security team for a garden shed. I wrote a 315-line bash script instead: one SSH session, twelve checks, one morning notification. Here is the design, the honest comparison against the usual suspects, and why detect-and-alert beats auto-fix at this scale.

Jun 7, 2026

podcast, tts

TTS Spike Day 2: My Ears, the Vendor, and the Arena Disagree on Qwen3-TTS

Day 2 of the TTS spike was supposed to be Higgs Audio v2. Instead a 1.7B model nobody invited jumped the queue, scored 8/10 by ear, then split three ways across the leaderboards. A case study in which benchmark to trust.

May 25, 2026

authority, lightning

Refusing the Subscription Trap: A Year of V4V Lessons

Value-for-value as the monetization model for sovgrid. The architectural fact (the channel exists) versus the dollar volume (zero sats received as of the most recent ground-truth audit). The honest version of what V4V is and is not, six months in.

May 23, 2026

dgx-spark, funnel

Should You Buy a DGX Spark in 2026? The Honest Decision Tree

About a third of the people who ask me end up not buying. Six specific 'don't buy' clauses, four buyer profiles, the four real alternatives ranked, a flowchart, and the operational receipts (drop_caches=3, VLLM_FLASHINFER_MOE_BACKEND=latency, 30-minute recovery runbook) the spec sheet does not give you. Lead-magnet source; also gated as PDF.

May 20, 2026

opencode, agents, devops

I Built a Web UI for Mobile Coding. Termux Won Anyway.

Two days of reverse-proxy work, a full Caddy stack with Let's Encrypt TLS and basic-auth in front of opencode web, all working. Then I realized I am not the right user for it. The actual mobile answer was already on my phone, and OpenWebUI quietly took over the other half of the use case.

May 20, 2026

Three Coding Leaderboards, Three Blind Spots: What HackerNoon and WhatLLM Don't Tell Self-Hosters

HackerNoon ranks coding LLMs by programming language. WhatLLM.org aggregates LiveCodeBench, Terminal-Bench and SciCode. Neither tests self-hosted models on real hardware. A self-hoster's reading protocol for coding leaderboards.

May 18, 2026

nostr, agents, mcp, dgx-spark

FIPS, the Mesh Protocol, and Why I Need to Build It to Believe It

The planning post before the implementation post. FIPS is an open-source mesh protocol with cryptographic identity and transport-agnostic routing. My sovereign AI stack is sovereign at the model and the hardware, and leaks the whole workload at the network boundary. Here is why that gap matters, the five concrete pieces of work I am committing to, and why I write the plan in public before I know if it works.

May 18, 2026

qwen, mistral, dgx-spark, devops

Mistral vs Qwen3.6 on DGX Spark: the 0/30 That Was a Broken Ruler

I almost published 'Mistral Small 4 scores 0/30 on coding, the quant kills it'. A competent model scoring exactly zero should have been the red flag. The benchmark harness was hanging behind this stack's Tor docker proxy and never reached the model. Here is the broken-ruler story, the direct measurement that replaced it, and every Mistral-vs-Qwen3.6 number at a glance, including which one can actually read an image.

May 18, 2026

qwen, mistral, dgx-spark, devops

The Quality Gate That Rewards Fabrication: I Had Qwen and Mistral Write This Blog

This blog gates every article behind one Python scorer before it publishes. I gave Qwen3.6 and Mistral Small 4 the same brief, the Start Here hub article this site still owes, and ran the raw output through that real gate with no editing. Both passed. Both invented hardware, processes, and benchmarks the scorer counted as quality. Here is the full method, the two source texts, and why a passing score is a floor and not a truth filter.

May 18, 2026

nostr, dgx-spark

The Quiet Pattern Among Sovereign Engineers

Six traits I keep seeing across the people who fit the sovereign-engineer description: they argue with specs, name every dependency, default to publishing, plan in decade arcs while shipping weekly, price friction honestly, and gate their optimism. Written from the outside, by the operator who runs the iron the sovereign software eventually touches.

May 13, 2026

podcast, tts

TTS Spike Day 1: VibeVoice Sample Matrix on DGX Spark

Eleven VibeVoice renders, one Voxtral baseline, the operator's ears. The first day of the TTS spike that follows the V6=0/10 verdict. Engineering-log shape, with the actual audio embedded. Day 2 went to a late entrant, Qwen3-TTS.

May 12, 2026

nostr

How to Auto-Post on Nostr Without Reading Like a Bot

Five posts a week, no marketing department, no template substitution. Building a Nostr distribution cadence for a self-hosted blog without producing the kind of bot output readers can spot in two scrolls.

May 12, 2026

podcast, tts, voxtral

Voxtral Capped at 3/10: Picking the Next Open TTS

Eight engineering fixes deep, three weeks of patches, two failure modes on the same engine. The Voxtral open checkpoint has no path to release-quality podcast audio. The drama of staying with it anyway, and the three engines I plan to spike next.

May 11, 2026

mcp

Why 334 Unique IPs Was Really 5 Services in Trench Coats

My MCP-server NSM page showed 334 unique agents. One change to the aggregator (User-Agent plus IP /24 dedupe) and the truth surfaced: 86% of external hits come from a single /24 range, the rest are mostly automated probes. Headline metrics that look like reach can be five services pretending to be many.

May 11, 2026

mcp, devops

How to Read the Insights Dashboard for a DGX-Spark Business, Not a Hobby Blog

Each number on the live Insights page has a formula, a business meaning, and a vanity-trap. If you are running a DGX Spark as the engine of a small AI service, here is how to read the dashboard daily without chasing growth-theatre, and which two metrics are the only ones worth waking up to check.

May 11, 2026

dgx-spark

Spark Arena Rank 4 Made Me Add Qwen3.6 to My DGX Spark

Qwen3.6-35B-A3B PrismaQuant at 95 tok/s on a single Spark (Spark Arena rank 4) beats my measured Mistral Small 4 at 35 tok/s by 2.7x on paper. This is the plan, not the result. SWE-Bench scores, opencode replacing vibe, why Mistral stays installed for creative prose, the Hacker News critiques on opencode I take seriously, and the two-day prep before day-2 measurements land on 2026-05-25.

May 7, 2026

devops, podcast, tts, voxtral

Voxtral Chunk Strategy: 38 Percent Faster Render with Whole Turns

Rendering a 367-character podcast turn as one Voxtral call takes 21 seconds. Split into 90-character chunks: 35 seconds. Same words, same voice, and the whole turn renders about 38 percent faster.

May 7, 2026

mcp

I Gave My Blog a Search Box, and It Runs Through My Own MCP Server

Wired a browser search form directly into an MCP tool that AI agents already call. One afternoon, four endpoints, zero CORS, real numbers from the deploy. The mistakes that cost me an hour are documented inline.

May 4, 2026

dgx-spark

How Much Electricity Does Self-Hosted AI Actually Use? Lightbulbs, Bitcoin Miners, and Solar Panels

What a DGX Spark actually draws from the wall, what that costs in Germany versus the US, how it compares to a lightbulb and a Bitcoin miner, and how many solar panels would offset it. With sources.

May 3, 2026

mistral, nostr

How Two Sovereign AI Personas Run Your Blog and Nostr Feed

Cipherfox and Hexabella post curated content without human oversight, using Mistral Small 4 on a DGX Spark and a hardened signing service. Here’s how it works today.

May 3, 2026

mistral

Hub Articles Protocol: How Three Reading-Paths Earn Their Homepage Slot

How sovgrid.org structures its most important posts to guide readers and shape the blog’s identity.

May 3, 2026

mcp, sglang

MCP Registry Distribution: Submission Plan & Tracking

How we’re getting the Sovereign AI MCP endpoint listed in five registries with real traffic tracking and zero KYC friction.

May 3, 2026

gitea, mistral, openclaw

OpenClaw: What’s Still Missing for Full Usability

A no-BS breakdown of the gaps in a self-hosted AI stack and the exact next steps to plug them.

May 3, 2026

lightning, nostr

Building Per-Article Zap Tracking on Nostr, and Then Getting Zero Zaps

Three Nostr identities, a working zap-attribution pipeline, 44 articles live at the time of writing, and after 30 days exactly zero zaps. What I learned about V4V on a small technical blog.

Apr 29, 2026

mistral

Two Leaderboards Nobody Reads Together: Why arena.ai Doesn't Tell You About Self-Hosted AI

Mainstream AI coverage cites only one leaderboard. arena.ai ranks quality. spark-arena.com ranks throughput on real hardware. The decision that matters lives in the third column nobody publishes.

Apr 28, 2026

mcp, nostr, podcast

Sovereign AI Grid: What's Working and What Comes Next

Status snapshot of what is running on this stack today and what is being built next. For returning readers. New here? Read 'Self-Hosted AI: Start Here' first.

Apr 27, 2026

mcp, sglang

A Self-Hosted AI Blog That Serves Both Humans and Machines

This technical blog maintains a single source of truth while layering machine-readable tools on top, ensuring both human readers and AI agents get accurate, up-to-date information.

Apr 26, 2026

mcp, mistral, sglang

From Blog to Agent Tools: How One Knowledge Base Powers Both Humans and AI

Learn how to transform your technical blog into a dual-purpose knowledge base that serves both human readers and AI agents while future-proofing your content strategy.

Apr 25, 2026

mcp, mistral

Running a 119B AI Model at Home: Who Actually Does This in 2026

A deep dive into the DGX Spark ecosystem, real power costs, and agent-driven tool adoption for self-hosting 119B models at home in 2026.

Apr 24, 2026

mistral, vibe

Hands-on AI Coding Tools: Why I Kept Claude Code + Vibe and Dumped Cursor and Continue.dev

A hands-on comparison of AI coding tools testing local inference vs cloud dependency for privacy-first workflows.

Apr 23, 2026

mistral, vibe

Six Weeks Running Mistral Small 4 as a Production Tool: What I Actually Learned

A deep dive into optimizing Mistral Small 4 for local technical blogging, with practical solutions for session memory, image generation, and EEAT compliance.

Apr 22, 2026

Content Quality in the AI Age: Where Our Scoring System Is Right, Wrong, and Missing

A full-system review of our quality scoring pipeline against a rigorous philosophical framework. Three things it confirms, two things it exposes, and one concrete fix that changes the architecture.

Two Days From Localhost to Production: Building a Hybrid Sovereign AI Site
