I Gave My Blog a Search Box, and It Runs Through My Own MCP Server
Last week sovgrid.org had no search. 63 articles sat in a flat list, no way to find anything except scrolling or guessing tag URLs. One afternoon later the same MCP server that AI agents call for search_blog now serves a browser search box at /search. Four endpoints, zero CORS, one Caddy handler, one Astro widget. The MCP narrative stopped being abstract and became a thing readers actually use.
Quick Take
- Added a working search box to my blog in one afternoon
- Every search calls the same MCP tool that AI agents use, no parallel implementation
- Same-origin Caddy proxy means no CORS, no DNS-rebinding mismatch, no service worker shenanigans
- Real numbers: 156ms TTFB on /search, 209-230ms end-to-end MCP call, 0 WCAG violations
- Three mistakes during the build are documented at the end so you do not repeat them
The four pieces that had to come together
The MCP protocol uses Streamable HTTP with a JSON-RPC handshake: initialize, notifications/initialized, then tools/call. Browsers can speak that, but it is overkill for a one-shot search from a form. I added a factory _make_tool_endpoint(tool_fn) in mcp-server/src/main.py that wraps any MCP tool as a flat HTTP POST endpoint:
```python
# mcp-server/src/main.py: endpoint factory
from fastapi import Request
from fastapi.responses import JSONResponse
from pydantic import ValidationError

def _make_tool_endpoint(tool_fn):
    """Expose an MCP tool as a flat HTTP POST endpoint.

    Reuses the same Pydantic validation that lives in the tool function
    via Annotated[Field(...)]. Same backend, same KB, no protocol overhead.
    """
    async def endpoint(request: Request):
        body = await request.json()
        try:
            result = await tool_fn(**body)
        except ValidationError as e:
            return JSONResponse({"error": e.errors()}, status_code=400)
        return JSONResponse(result.model_dump())
    return endpoint

app.add_api_route("/api/diagnose", _make_tool_endpoint(diagnose_sglang), methods=["POST"])
app.add_api_route("/api/search", _make_tool_endpoint(search_blog), methods=["POST"])
app.add_api_route("/api/tags", _make_tool_endpoint(list_tags), methods=["POST"])
app.add_api_route("/api/article", _make_tool_endpoint(get_article), methods=["POST"])
```
Four endpoints went live at /api/diagnose, /api/search, /api/tags, /api/article. All four MCP tools usable from a browser without protocol overhead. The /api/search endpoint accepts {"query": "MCP server", "max_results": 10} and returns a JSON array of article objects with title, date, description, relevance score; exactly what the AI agents already get.
Caddy as the same-origin gateway
Browser traffic needs to reach the MCP server without CORS. A new Caddy handler at /api/mcp-tool/<tool> rewrites to /api/<tool> and forwards to the mcp container on port 8002:
```caddyfile
# floki/Caddyfile: same-origin MCP gateway
@mcp_tool path_regexp tool ^/api/mcp-tool/([a-z][a-z0-9_-]*)$
handle @mcp_tool {
    rewrite * /api/{re.tool.1}
    request_body {
        max_size 64KB
    }
    reverse_proxy mcp:8002 {
        header_up Host {host}
    }
}
```
The path regex is tightened to ^/api/mcp-tool/([a-z][a-z0-9_-]*)$ so only safe tool names match. Caddy already normalises path traversal; the regex is defense in depth. The 64 KB body cap stops accidental DoS via huge POSTs. Browser fetch goes to sovgrid.org/api/mcp-tool/search, served from the same origin: no CORS preflight, no DNS-rebinding-protection mismatch, no Service Worker quirks.
The Astro widget that talks to it
<MCPSearchWidget> is an Astro component with a form: query, optional tag, max-results. JavaScript submits to /api/mcp-tool/search, renders results as cards: title link, date, style badge, description, tag chips, subtle relevance and quality scores.
```astro
<!-- src/components/MCPSearchWidget.astro -->
<form class="mcp-search" data-endpoint="/api/mcp-tool/search">
  <input name="query" type="search" placeholder="Search articles..." />
  <select name="tag"><option value="">All tags</option></select>
  <input name="max_results" type="number" min="1" max="50" value="10" />
  <button type="submit">Search</button>
  <div class="results" aria-live="polite"></div>
</form>

<script>
  for (const form of document.querySelectorAll('form.mcp-search')) {
    const endpoint = form.dataset.endpoint;
    form.addEventListener('submit', async (e) => {
      e.preventDefault();
      const data = Object.fromEntries(new FormData(form));
      const res = await fetch(endpoint, {
        method: 'POST',
        headers: {'Content-Type': 'application/json'},
        body: JSON.stringify({...data, max_results: Number(data.max_results)})
      });
      // renderResults (defined elsewhere in the component) draws the result cards
      renderResults(form.querySelector('.results'), await res.json());
    });
  }
</script>
```
On a parameter-less load the widget auto-runs an empty query, so the page shows the 20 newest articles as a discovery default. URL params ?q=...&tag=... prefill the form and auto-run, so search results have shareable links. No external JS dependencies: vanilla fetch and DOM. The widget root carries data-endpoint, so multiple instances per page just work without inline-JS variable injection.
The pages: /search for humans, /agents/ for the curious
/search has a hero header, the embedded widget, and a collapsed <details> footer with the MCP pitch and links to /agents/ and mcp.sovgrid.org. Header nav gets a Search link between Blog and Insights. The pitch lives in the collapsed footer so it does not dilute the search UX, but for the curious reader the funnel into the MCP narrative is one click away.
/agents/ keeps the raw-JSON demo: same widget component, same endpoint, but rendering MCP responses literally as collapsible JSON blocks. Two layers of the same story: /search shows what the MCP can do for humans, /agents/ shows the wire format underneath. Same backend, same component, two presentations.
Why this MVP beats the original plan
I almost shipped a different MVP. The original plan had <MCPToolWidget> embedded in the SGLang setup article, calling diagnose_sglang for live config validation. That is a cute tech demo but the audience is at most the few thousand people who own a DGX Spark. Most readers would scroll past.
search_blog flips that math. Every reader has a reason to use search. Every search call goes through the MCP server. The MCP narrative gets demonstrated to everyone, not just the niche. Plus I get a real product feature, not just a demo. The blog has search now, which it should have had from the start.
The mistakes that cost me an hour
The afternoon would have been an hour without these.
CSP blocked my first script attempt. The page sets script-src 'self', no 'unsafe-inline', no nonces. My first widget used <script define:vars={{...}}> to pass per-instance config from Astro into JS, which Astro renders as an inline <script> tag. That tag got blocked, so the form button did nothing on click, and nothing looked wrong until I opened devtools and saw the CSP violation. Fix: rewrite as a plain <script> so Astro bundles it as /_astro/*.js, served from the same origin. Per-widget config moved from injected JS variables to data-endpoint attributes scanned at module init.
Caddy bind-mount inode rebinding. I copied the new Caddyfile to Floki, ran caddy reload, and the new config did not take effect. The config file inside the container had a different md5 from the file on the host, even though they were supposed to be the same bind mount. Cause: rsync writes to a temp file and then renames it, which creates a new inode. Docker's bind mount was set up against the original inode and kept reading the deleted-but-still-open original. Fix: replace caddy reload with docker compose restart caddy so the bind mount re-binds. I had already hit this exact gotcha with nginx.conf in the blog container; the deploy script now handles both the same way.
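The inode swap is easy to reproduce in a few lines of Python (a sketch of the filesystem behaviour, not the actual deploy script):

```python
import os
import tempfile

# Reproduce the rsync gotcha: write-then-rename creates a NEW inode,
# so anything pinned to the old inode (like a Docker bind mount) keeps
# reading the now-unlinked original file.
d = tempfile.mkdtemp()
path = os.path.join(d, "Caddyfile")
with open(path, "w") as f:
    f.write("old config\n")
ino_before = os.stat(path).st_ino

# In-place overwrite (what `cp` onto an existing file does) keeps the inode:
with open(path, "w") as f:
    f.write("new config\n")
same = os.stat(path).st_ino == ino_before

# Write-to-temp-then-rename (rsync's default strategy) swaps in a fresh inode:
tmp = path + ".tmp"
with open(tmp, "w") as f:
    f.write("newer config\n")
os.replace(tmp, path)
changed = os.stat(path).st_ino != ino_before
```

A bind mount resolves the inode once at container start, which is why only a container restart, not a caddy reload, picks up the renamed file.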
MCP source COPY at build time, not volume-mounted. A docs change triggered docker restart sovereign-mcp which restarted the container with the OLD image, since the source is COPY src/ ./src/ in the Dockerfile, not a runtime volume. Fix: docker compose up -d --build so source changes actually ship.
Live numbers from the switch
- /search page: TTFB 156ms, total 208ms, 11.7 KB HTML
- /api/mcp-tool/search end-to-end: 209-230ms over three sample calls. That covers the Caddy proxy hop, the MCP container, TF-IDF over 63 articles, JSON serialisation, and the EU-to-client round-trip
- Edge-block hit rate after deploy: 0 of the last 100 mcp.log lines were 429s, versus 599 of 716 from a single scraper IP over the previous five days. The scanner hammered /.env, /.git/config, /.amplifyrc, /wp-admin/: 716 GET requests over five days, 96% of all rate-limit hits, 0 useful traffic
- axe-core: 0 WCAG violations across the new /search page
What this changes about the MCP narrative
Before this deploy, the MCP server on Floki served only mcp.sovgrid.org/self-hosted-ai, listed on the official MCP registry as org.sovgrid/self-hosted-ai, used by AI agents that find it via the registry’s DNS-auth flow. Real, but invisible to the average blog reader.
After the deploy, every reader who uses the search box has called the same MCP server that those agents call. The narrative stops being “we run an MCP server, here is the registry link” and becomes “you just used the MCP server, here is what AI agents see when they call it”. The footer link from /search to /agents/ to mcp.sovgrid.org walks the curious reader from human-friendly to wire-format to registry listing in three clicks.
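For contrast with the flat POST, this is roughly the message sequence an agent sends over Streamable HTTP. A sketch only: the exact protocolVersion string and capability fields depend on the client, and the clientInfo values here are made up.

```python
def handshake_messages(query: str) -> list[dict]:
    """The three JSON-RPC messages of a one-shot MCP search:
    initialize, the initialized notification, then tools/call."""
    return [
        {"jsonrpc": "2.0", "id": 1, "method": "initialize",
         "params": {"protocolVersion": "2025-03-26",  # assumed; negotiated in practice
                    "capabilities": {},
                    "clientInfo": {"name": "example-client", "version": "0.1"}}},
        # Notifications carry no id; the server sends no response to this one.
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        {"jsonrpc": "2.0", "id": 2, "method": "tools/call",
         "params": {"name": "search_blog", "arguments": {"query": query}}},
    ]
```

Three round-trips of ceremony for what the browser widget does in a single flat POST, which is the whole argument for the _make_tool_endpoint wrapper.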
What I Actually Use
- Mistral Small 4 (NVFP4 on GB10) for the actual search ranking inside the MCP tool
- FastMCP 1.27.0 as the MCP server framework, with the _make_tool_endpoint wrapper for HTTP-shaped tools
- Astro for the static site, vanilla DOM for the widget, no client-side framework
- Caddy for the same-origin proxy and the path-regex hardening
- Floki VPS (FlokiNET, no-KYC) hosts the production MCP and blog
Browser to MCP, four pieces
Search box that calls the same MCP server AI agents use