#mcp | Sovereign AI Blog

Jun 11, 2026

I Ran NVIDIA's 120B Nemotron on a Single DGX Spark. It Is Smart, Slow, and Surprisingly Good at One Job

NVIDIA's Nemotron-3-Super-120B-A12B is tuned for Blackwell and ships an NVFP4 build that fits a single 128GB DGX Spark. I measured it where almost nobody else does: single-stream, on one GB10. The result is 23.7 tok/s, a competent but painfully verbose coder, and a genuinely strong retrieval agent. Here is the full teardown, with the published benchmarks fact-checked against what the box actually did.

Jun 10, 2026

strategy, agents, benchmarking

Does Serena help a self-hosted coding model? I benchmarked it

Serena is one of the most-installed coding MCP servers. I tested it against two local models (Qwen3.6-35b and Mistral-Small-4) on three refactor tasks with deterministic gates. The short answer is more interesting than yes or no.

Jun 4, 2026

ollama, qwen, self-hosted, sovereign-ai, engineering-honesty, agents, rag

A Second Brain for a Local Model, and the Two Bugs That Made It Useless First

Giving a local 8B model persistent memory and retrieval good enough to replace a cloud assistant for daily coding. The architecture is mem0 plus a RAG knowledge base over ChromaDB. The honest part is the two bugs that made the first version forget you and answer the wrong question with full confidence.

Jun 1, 2026

lenovo, blackwell, rtx-5080, ollama, engineering-honesty

We Were Wrong About Local 8B Tool-Use (2026 Reality Check)

A May 2026 memo of mine said local 8B models cannot reliably do MCP tool-use. I retested in late May. The memo was specifically wrong about WHY. Direct OpenAI-format API calls work fine. The bridge layer was the broken part.

May 22, 2026

authority, agents

5 MCP Patterns That Aren't 'Search the Database'

Every MCP server tutorial demos search. The five patterns below are the ones that actually justify the protocol on the second day after you launch: structured-write, status-with-history, batched-action, paid-action, capability-discovery. Each has a worked example.

May 22, 2026

authority

MCP for Engineers Who Hate Marketing: A 6-Week Build Log

Six weeks from 'I should publish an MCP server' to 'the server is live, registered, scored 100/100 on Smithery, and listed in three directories.' The log is week-by-week, with the actual command lines and the actual mistakes.

May 18, 2026

strategy, nostr, agents, dgx-spark

FIPS, the Mesh Protocol, and Why I Need to Build It to Believe It

The planning post before the implementation post. FIPS is an open-source mesh protocol with cryptographic identity and transport-agnostic routing. My sovereign AI stack is sovereign at the model and the hardware, and leaks the whole workload at the network boundary. Here is why that gap matters, the five concrete pieces of work I am committing to, and why I write the plan in public before I know if it works.

May 11, 2026

strategy

Why 334 Unique IPs Was Really 5 Services in Trench Coats

My MCP-server NSM page showed 334 unique agents. One change to the aggregator (User-Agent plus IP /24 dedupe) and the truth surfaced: 86% of external hits come from a single /24 range, the rest are mostly automated probes. Headline metrics that look like reach can be five services pretending to be many.

May 11, 2026

strategy, devops

How to Read the Insights Dashboard for a DGX-Spark Business, Not a Hobby Blog

Each number on the live Insights page has a formula, a business meaning, and a vanity-trap. If you are running a DGX Spark as the engine of a small AI service, here is how to read the dashboard daily without chasing growth-theatre, and which two metrics are the only ones worth waking up to check.

May 7, 2026

strategy

I Gave My Blog a Search Box, and It Runs Through My Own MCP Server

Wired a browser search form directly into an MCP tool that AI agents already call. One afternoon, four endpoints, zero CORS, real numbers from the deploy. The mistakes that cost me an hour are documented inline.

May 3, 2026

setup

The Sovereign AI Blog MCP Is Mostly Redundant Today, And That Will Change

Why the first Sovereign AI MCP server isn't worth installing yet, but will be once it hits 200 articles and adds specialized tools. An honest MVP/POC critique.

May 3, 2026

setup

Floki-VPS Setup for Sovereign AI Workloads

Move your AI stack off cloud servers. This post shows how to migrate a production Sovereign AI blog and MCP server to a €163/year VPS, harden it, and run it with Docker and Caddy, complete with real configs and pitfalls.

May 3, 2026

setup, mistral, podcast

Build a Self-Hosted Knowledge Base with Plain Text and LLMs

A practical guide to setting up a searchable, growing knowledge base using Markdown files, JSON indexing, and local LLMs, no vector stores required.

May 3, 2026

setup, mistral, openclaw, sglang

OpenClaw Setup on DGX Spark for Sovereign AI Agents

A hands-on guide to installing and configuring OpenClaw on NVIDIA DGX Spark, switching between cloud and local models, and wiring MCP servers.

May 3, 2026

setup, openclaw, sglang, vibe

Sovereign MCP Server: Local Setup, Integration, and Hard Lessons

Learn how to run a self-hosted MCP server for your blog’s knowledge base, integrate it with OpenClaw and Vibe, and avoid the pitfalls I hit while migrating from cloud to Sovereign AI.

May 3, 2026

strategy, sglang

MCP Registry Distribution: Submission Plan & Tracking

How we’re getting the Sovereign AI MCP endpoint listed in five registries with real traffic tracking and zero KYC friction.

Apr 30, 2026

setup

100/100 on Smithery in 4 Hours, and Why That Means Almost Nothing

Built a self-hosted MCP, mirrored it to GitHub, listed it on Smithery, hit a perfect quality score before dinner. The exact patches, badges, and pitfalls. Plus an honest take on why a number on a dashboard is not a customer.

Apr 28, 2026

strategy, nostr, podcast

Sovereign AI Grid: What's Working and What Comes Next

Status snapshot of what is running on this stack today and what is being built next. For returning readers. New here? Read 'Self-Hosted AI: Start Here' first.

Apr 27, 2026

strategy, sglang

A Self-Hosted AI Blog That Serves Both Humans and Machines

This technical blog maintains a single source of truth while layering machine-readable tools on top, ensuring both human readers and AI agents get accurate, up-to-date information.

Apr 26, 2026

strategy, mistral, sglang

From Blog to Agent Tools: How One Knowledge Base Powers Both Humans and AI

Learn how to transform your technical blog into a dual-purpose knowledge base that serves both human readers and AI agents while future-proofing your content strategy.

Apr 25, 2026

strategy, mistral

Running a 119B AI Model at Home: Who Actually Does This in 2026

A deep dive into the DGX Spark ecosystem, real power costs, and agent-driven tool adoption for self-hosting 119B models at home in 2026.

Apr 20, 2026

setup, lightning, nostr

Sovereign Blog Setup: Self-Hosted AI Content Pipeline & Monetization

A hands-on guide to deploying a self-hosted AI blog with Docker, Astro, and MCP discovery, complete with working code, real-world gotchas, and monetization via Lightning and Nostr.

Mar 31, 2026

services, openhands

NVIDIA Playbook Stack

How NVIDIA's tested playbooks transform DGX Spark into a reproducible AI development environment with pre-configured stacks, MCP integration, and battle-tested configurations.

Mar 22, 2026

fix, devops, gitea, mistral, vibe

Vibe write_file Overwrite Bug: When Edits Silently Replace Whole Files

How strict workflow rules and tool constraints prevent AI agents from destroying your codebase during file edits.

The GitHub Bot That Cannot Write
