All articles tagged "mcp": self-hosted AI fixes, setups, and architecture notes.
I wanted a daily read of what is happening across my public repositories without handing a cloud service write access to them. The result is a sovereign GitHub assistant that runs on my own GPU, reviews incoming pull requests with a local model, and physically cannot post to GitHub. Here is the architecture, every decision behind it, the comparison with the SaaS reviewers, and the four times the build lied to me before it told the truth.
NVIDIA's Nemotron-3-Super-120B-A12B is tuned for Blackwell and ships an NVFP4 build that fits a single 128GB DGX Spark. I measured it where almost nobody else does: single-stream, on one GB10. The result is 23.7 tok/s, a competent but painfully verbose coder, and a genuinely strong retrieval agent. Here is the full teardown, with the published benchmarks fact-checked against what the box actually did.
Serena is one of the most-installed coding MCP servers. I tested it against two local models (Qwen3.6-35b and Mistral-Small-4) on three refactor tasks with deterministic gates. The short answer is more interesting than yes or no.
How to give a local 8B model persistent memory and retrieval good enough to replace a cloud assistant for daily coding. The architecture is mem0 plus a RAG knowledge base over ChromaDB. The honest part is the two bugs that made the first version forget you and answer the wrong question with full confidence.
A May 2026 memo of mine said local 8B models cannot reliably do MCP tool-use. I retested in late May. The memo was specifically wrong about WHY. Direct OpenAI-format API calls work fine. The bridge layer was the broken part.
Every MCP server tutorial demos search. The five patterns below are the ones that actually justify the protocol on the second day after you launch: structured-write, status-with-history, batched-action, paid-action, capability-discovery. Each has a worked example.
Six weeks from 'I should publish an MCP server' to 'the server is live, registered, scored 100/100 on Smithery, and listed in three directories.' The log is week-by-week, with the actual command lines and the actual mistakes.
The planning post before the implementation post. FIPS is an open-source mesh protocol with cryptographic identity and transport-agnostic routing. My sovereign AI stack is sovereign at the model and the hardware, and leaks the whole workload at the network boundary. Here is why that gap matters, the five concrete pieces of work I am committing to, and why I write the plan in public before I know if it works.
My MCP-server NSM page showed 334 unique agents. One change to the aggregator (User-Agent plus IP /24 dedupe) and the truth surfaced: 86% of external hits come from a single /24 range, and the rest are mostly automated probes. Headline metrics that look like reach can be five services pretending to be many.
Each number on the live Insights page has a formula, a business meaning, and a vanity-trap. If you are running a DGX Spark as the engine of a small AI service, here is how to read the dashboard daily without chasing growth-theatre, and which two metrics are the only ones worth waking up to check.
Wired a browser search form directly into an MCP tool that AI agents already call. One afternoon, four endpoints, zero CORS, real numbers from the deploy. The mistakes that cost me an hour are documented inline.
Why the first Sovereign AI MCP server isn't worth installing yet, but will be once it hits 200 articles and adds specialized tools. An honest MVP/POC critique.
Move your AI stack off cloud servers. This post shows how to migrate a production Sovereign AI blog and MCP server to a €163/year VPS, harden it, and run it with Docker and Caddy, complete with real configs and pitfalls.
A practical guide to setting up a searchable, growing knowledge base using Markdown files, JSON indexing, and local LLMs, with no vector stores required.
A hands-on guide to installing and configuring OpenClaw on NVIDIA DGX Spark, switching between cloud and local models, and wiring MCP servers.
Learn how to run a self-hosted MCP server for your blog’s knowledge base, integrate it with OpenClaw and Vibe, and avoid the pitfalls I hit while migrating from cloud to Sovereign AI.
How we’re getting the Sovereign AI MCP endpoint listed in five registries with real traffic tracking and zero KYC friction.
Built a self-hosted MCP, mirrored it to GitHub, listed it on Smithery, hit a perfect quality score before dinner. The exact patches, badges, and pitfalls. Plus an honest take on why a number on a dashboard is not a customer.
Status snapshot of what is running on this stack today and what is being built next. For returning readers. New here? Read 'Self-Hosted AI: Start Here' first.
This technical blog maintains a single source of truth while layering machine-readable tools on top, ensuring both human readers and AI agents get accurate, up-to-date information.
Learn how to transform your technical blog into a dual-purpose knowledge base that serves both human readers and AI agents while future-proofing your content strategy.
A deep dive into the DGX Spark ecosystem, real power costs, and agent-driven tool adoption for self-hosting 119B models at home in 2026.
A hands-on guide to deploying a self-hosted AI blog with Docker, Astro, and MCP discovery, complete with working code, real-world gotchas, and monetization via Lightning and Nostr.
How NVIDIA's tested playbooks transform DGX Spark into a reproducible AI development environment with pre-configured stacks, MCP integration, and battle-tested configurations.
How strict workflow rules and tool constraints prevent AI agents from destroying your codebase during file edits.