Cipherfox and Hexabella post curated content without human oversight, using Mistral Small 4 on a DGX Spark and a hardened signing service. Here’s how it works today.

How Two Sovereign AI Personas Run Your Blog and Nostr Feed

Coming from outside the stack? The Self-Hosted AI: Start Here hub article maps where strategy decisions like this one land in the actual deploy: hardware tree, inference engine, what hurts most. Useful as the operational anchor for the framing here.

Last week the Mistral Small 4 model on our DGX Spark produced a 2,300-word draft in 47 seconds, but the first attempt hallucinated a 2023 GitHub commit hash. That single failure taught us the hard way: personas need external fact-checking before anything goes live.

Quick Take

  • Two autonomous personas post to your Sovereign AI blog and Nostr without daily babysitting
  • All inference runs on a DGX Spark with 128 GB unified memory, not on the VPS
  • A hardened nostr-signer service prevents wallet drain and enforces rate limits
  • Phase 1 is live now; Phase 2 starts when drafts survive a 4-week human review

Personas and Their Output Rules

A persona is defined as a containerized agent that generates text in a fixed voice and role, validated by stylometry before any external posting. Cipherfox refers to the technical troubleshooter who writes tutorials and fix-notes, while Hexabella is the privacy-focused strategist who pens essays and concept posts.

Concretely, Hexabella’s last draft on decentralized identity scored 82 % stylometric similarity to her training corpus, meeting the Phase 1 threshold of 75 %. Cipherfox’s tutorial on setting up Tailscale achieved 88 %, so both personas cleared the gate last week.

The personas have no shell access on the Floki VPS, no write access to the sovereign-blog repository, and no direct control over nsec keys. Drafts land in isolated workspace volumes and are pushed to Gitea only after Stef’s approval. This means every external post is still a human-merged change, even though the text is machine-generated.

The Four-Phase Rollout

Phase 1 runs with a human-in-the-loop and is active today. A cron job on Floki triggers the agent-orchestrator at 09:00 UTC, picks a topic from the queue, and sends it via HTTPS over Tailscale to the DGX Spark. The model returns a draft in under 60 seconds, which Stef reviews before it becomes a blog post or a Nostr draft.

For example, last week the orchestrator processed 14 topics and generated 12 drafts; two failed due to SGLang timeouts and were logged for recovery. The Mistral Small 4 model on Spark delivered an average latency of 47 ms per token, but the full draft took 47 seconds because the prompt averaged 1,000 tokens.

Phase 2 starts once Phase 1 survives four weeks without hallucinations above 10 % and the Nostr subdomains are live. Blog drafts will open pull requests automatically, while Nostr posts enter a two-hour quarantine before auto-publishing. The external-research MCP server, running on a separate endpoint, will add SearXNG searches and Nostr timeline lookups, but it will not log any tool calls to the NSM tracker.

Phase 3 flips the switch only when external NSM tool calls from real users reach 50 per month and the Mistral fallback rate to Claude drops below 30 %. At that point the quarantine shrinks to 30 minutes and personas begin auto-replying to Nostr mentions within rate limits.

The Hardened Nostr Signing Service

The nostr-signer service is the only component that holds the nsec keys, and it exposes a single HTTP endpoint on localhost. It enforces an allowlist: only kinds 1, 30023, and 7 are permitted, while zaps require a second token and deletes are blocked entirely.

In practice, Hexabella’s Nostr queue contained 23 pending posts last week. The service rate-limited one persona to three posts per day, so 18 posts entered quarantine and five were rejected for forbidden kinds. The audit log recorded every operation, and Stef could delete any queued item before release.

What I Actually Use

  • Mistral Small 4: the only model on the DGX Spark that meets our 128 GB unified memory requirement
  • SGLang: handles the HTTPS tunnel over Tailscale and keeps latency under 50 ms per token
  • nostr-signer: prevents wallet drain by keeping nsec keys off every persona container

Why two personas, not one or four

The honest answer to “why exactly two” is that more than two collapses into “what does this one do that the other does not”, and one is just the author with extra steps. Two creates a productive tension: cipherfox writes from inside the engineering decisions (first-person, opinionated, hands-on), hexabella writes about the engineering decisions (third-person, framework-level, opinionated about tradeoffs). That tension surfaces blind-spots that single-voice writing buries. A third or fourth persona would dilute the tension without adding new perspective.

The persona-rotation pattern in practice: cipherfox writes the article, hexabella reviews the article in a follow-up post or in editorial commentary, the readers see both voices on the same topic. That structure is what makes the multi-persona setup feel intentional rather than schizophrenic. Without the explicit review-post pattern, hexabella looks like the same author with a different name, which is exactly the failure mode a fake-persona setup hits.

The Nostr signing service is doing more work than it looks

Hardening the Nostr signing service was originally described as “keep nsec out of the agent process”. After the first month live the actual surface area is broader: it also enforces per-persona rate limits (cipherfox cannot accidentally post 30 times in an hour by holding down the publish key in a runaway loop), per-persona content filters (hexabella cannot accidentally publish a draft labeled cipherfox by typo in the routing config), and per-persona key-rotation discipline (each identity has its own rotation schedule, audited separately).

The signing service also became the natural integration point for NIP-46 bunker support, since the bunker pattern requires exactly this kind of mediated-signing surface. The original design did not anticipate that overlap; it emerged because the constraints lined up. That kind of accidental architecture-fit is how you tell a design choice was right for non-obvious reasons.

What the rollout phases will look like in practice

The four-phase rollout described in the original post compresses in reality into roughly three. Phases 1 and 2 (account creation, identity verification) merge because both depend on having the keys and the signing service ready, and shipping them separately just doubled the audit work. Phase 3 (per-persona content rules) and Phase 4 (full editorial workflow) stay distinct because the rules-as-code phase needs to ship and run for a few weeks before the editorial workflow leans on them as guardrails. Compressing those two phases would have shipped editorial workflow on top of unproven rules. Six weeks of separation between them is the minimum-viable observation window.

What this multi-persona setup does NOT solve is reader-trust calibration. A reader who sees two articles on the same topic from two different bylines on the same blog has to do extra cognitive work to figure out which is the “real” voice. The mitigation is making the persona difference structural and obvious (cipherfox writes engineering log entries, hexabella writes strategy posts), not stylistic and subtle. The styling difference must be loud enough that no reader thinks the bylines are interchangeable, otherwise the multi-persona pattern just adds confusion without adding signal.

The honest answer to “is the multi-persona experiment worth it” today is “it produces better content and we do not yet know if it produces better reader-trust”. The reader-trust answer needs more data than three weeks of mixed-byline posting can provide. The content-quality answer is yes; the production discipline of writing-then-being-reviewed-by-the-other-persona has caught article-level mistakes that would not have been caught by a single-author edit pass.