#mcp

25 articles

All articles tagged "mcp": self-hosted AI fixes, setups, and architecture notes.

The GitHub Bot That Cannot Write

I wanted a daily read of what is happening across my public repositories without handing a cloud service write access to them. The result is a sovereign GitHub assistant that runs on my own GPU, reviews incoming pull requests with a local model, and physically cannot post to GitHub. Here is the architecture, every decision behind it, the comparison with the SaaS reviewers, and the four times the build lied to me before it told the truth.

strategy dgx-spark benchmarking vllm

I Ran NVIDIA's 120B Nemotron on a Single DGX Spark. It Is Smart, Slow, and Surprisingly Good at One Job

NVIDIA's Nemotron-3-Super-120B-A12B is tuned for Blackwell and ships an NVFP4 build that fits a single 128GB DGX Spark. I measured it where almost nobody else does: single-stream, on one GB10. The result is 23.7 tok/s, a competent but painfully verbose coder, and a genuinely strong retrieval agent. Here is the full teardown, with the published benchmarks fact-checked against what the box actually did.

strategy agents benchmarking

Does Serena help a self-hosted coding model? I benchmarked it

Serena is one of the most-installed coding MCP servers. I tested it against two local models (Qwen3.6-35b and Mistral-Small-4) on three refactor tasks with deterministic gates. The short answer is more interesting than yes or no.

ollama qwen self-hosted sovereign-ai engineering-honesty agents rag

A Second Brain for a Local Model, and the Two Bugs That Made It Useless First

Giving a local 8B model persistent memory and retrieval good enough to replace a cloud assistant for daily coding. The architecture is mem0 plus a RAG knowledge base over ChromaDB. The honest part is the two bugs that made the first version forget you and answer the wrong question with full confidence.

lenovo blackwell rtx-5080 ollama engineering-honesty

We Were Wrong About Local 8B Tool-Use (2026 Reality Check)

A May 2026 memo of mine said local 8B models cannot reliably do MCP tool-use. I retested in late May. The memo was specifically wrong about WHY. Direct OpenAI-format API calls work fine. The bridge layer was the broken part.

authority agents

5 MCP Patterns That Aren't 'Search the Database'

Every MCP server tutorial demos search. The five patterns below are the ones that actually justify the protocol on the second day after you launch: structured-write, status-with-history, batched-action, paid-action, capability-discovery. Each has a worked example.

authority

MCP for Engineers Who Hate Marketing: A 6-Week Build Log

Six weeks from 'I should publish an MCP server' to 'the server is live, registered, scored 100/100 on Smithery, and listed in three directories.' The log is week-by-week, with the actual command lines and the actual mistakes.

strategy nostr agents dgx-spark

FIPS, the Mesh Protocol, and Why I Need to Build It to Believe It

The planning post before the implementation post. FIPS is an open-source mesh protocol with cryptographic identity and transport-agnostic routing. My sovereign AI stack is sovereign at the model and the hardware, and leaks the whole workload at the network boundary. Here is why that gap matters, the five concrete pieces of work I am committing to, and why I write the plan in public before I know if it works.

strategy

Why 334 Unique IPs Was Really 5 Services in Trench Coats

My MCP-server NSM page showed 334 unique agents. One change to the aggregator (User-Agent plus IP /24 dedupe) and the truth surfaced: 86% of external hits come from a single /24 range, and the rest are mostly automated probes. Headline metrics that look like reach can be five services pretending to be many.

strategy devops

How to Read the Insights Dashboard for a DGX-Spark Business, Not a Hobby Blog

Each number on the live Insights page has a formula, a business meaning, and a vanity-trap. If you are running a DGX Spark as the engine of a small AI service, here is how to read the dashboard daily without chasing growth-theatre, and which two metrics are the only ones worth waking up to check.

strategy

I Gave My Blog a Search Box, and It Runs Through My Own MCP Server

Wired a browser search form directly into an MCP tool that AI agents already call. One afternoon, four endpoints, zero CORS, real numbers from the deploy. The mistakes that cost me an hour are documented inline.

setup

The Sovereign AI Blog MCP Is Mostly Redundant Today, And That Will Change

Why the first Sovereign AI MCP server isn't worth installing yet, but will be once it hits 200 articles and adds specialized tools. An honest MVP/POC critique.

setup

Floki-VPS Setup for Sovereign AI Workloads

Move your AI stack off cloud servers. This post shows how to migrate a production Sovereign AI blog and MCP server to a €163/year VPS, harden it, and run it with Docker and Caddy, complete with real configs and pitfalls.

setup mistral podcast

Build a Self-Hosted Knowledge Base with Plain Text and LLMs

A practical guide to setting up a searchable, growing knowledge base using Markdown files, JSON indexing, and local LLMs, no vector stores required.

setup mistral openclaw sglang

OpenClaw Setup on DGX Spark for Sovereign AI Agents

A hands-on guide to installing and configuring OpenClaw on NVIDIA DGX Spark, switching between cloud and local models, and wiring MCP servers.

setup openclaw sglang vibe

Sovereign MCP Server: Local Setup, Integration, and Hard Lessons

Learn how to run a self-hosted MCP server for your blog’s knowledge base, integrate it with OpenClaw and Vibe, and avoid the pitfalls I hit while migrating from cloud to Sovereign AI.

strategy sglang

MCP Registry Distribution: Submission Plan & Tracking

How we’re getting the Sovereign AI MCP endpoint listed in five registries with real traffic tracking and zero KYC friction.

setup

100/100 on Smithery in 4 Hours, and Why That Means Almost Nothing

Built a self-hosted MCP, mirrored it to GitHub, listed it on Smithery, hit a perfect quality score before dinner. The exact patches, badges, and pitfalls. Plus an honest take on why a number on a dashboard is not a customer.

strategy nostr podcast

Sovereign AI Grid: What's Working and What Comes Next

Status snapshot of what is running on this stack today and what is being built next. For returning readers. New here? Read 'Self-Hosted AI: Start Here' first.

strategy sglang

A Self-Hosted AI Blog That Serves Both Humans and Machines

This technical blog maintains a single source of truth while layering machine-readable tools on top, ensuring both human readers and AI agents get accurate, up-to-date information.

strategy mistral sglang

From Blog to Agent Tools: How One Knowledge Base Powers Both Humans and AI

Learn how to transform your technical blog into a dual-purpose knowledge base that serves both human readers and AI agents while future-proofing your content strategy.

strategy mistral

Running a 119B AI Model at Home: Who Actually Does This in 2026

A deep dive into the DGX Spark ecosystem, real power costs, and agent-driven tool adoption for self-hosting 119B models at home in 2026.

setup lightning nostr

Sovereign Blog Setup: Self-Hosted AI Content Pipeline & Monetization

A hands-on guide to deploying a self-hosted AI blog with Docker, Astro, and MCP discovery, complete with working code, real-world gotchas, and monetization via Lightning and Nostr.

services openhands

NVIDIA Playbook Stack

How NVIDIA's tested playbooks transform DGX Spark into a reproducible AI development environment with pre-configured stacks, MCP integration, and battle-tested configurations.

fix devops gitea mistral vibe

Vibe write_file Overwrite Bug: When Edits Silently Replace Whole Files

How strict workflow rules and tool constraints prevent AI agents from destroying your codebase during file edits.