
#podcast

13 articles

All articles tagged "podcast" — self-hosted AI fixes, setups, and architecture notes.

TTS Spike Day 1: VibeVoice Sample Matrix on DGX Spark

Eleven VibeVoice renders, one Voxtral baseline, the operator's ears. The first day of the three-day TTS spike that follows the V6=0/10 verdict. Engineering-log shape, with the actual audio embedded.
Voxtral Capped at 3/10: Picking the Next Open TTS

Eight engineering fixes deep, three weeks of patches, two failure modes on the same engine. The Voxtral open checkpoint has no path to release-quality podcast audio. The drama of staying with it anyway, and the three engines I plan to spike next.

#strategy #tts #voxtral

FFmpeg Volume Filter eval=frame: A 4-Second Silent Bug

The intro music wasn't playing for the first four seconds of every podcast episode. RMS at minus infinity. The fix was one keyword: eval=frame.

#fix #devops #voxtral

Voxtral 4B Open-Checkpoint: The Encoder is Gated

Voxtral 4B advertises voice cloning, accepts ref_audio in the API, then crashes the engine because the encoder weights live only in Mistral's hosted product.

#fix #mistral #tts #voxtral

Voxtral Chunk Strategy: 38 Percent Faster Render with Whole Turns

Rendering a 367-character podcast turn as one Voxtral call takes 21 seconds. Split into 90-character chunks: 35 seconds. Same words, same voice, 38 percent more wallclock.

#strategy #devops #tts #voxtral

EAGLE Throughput Is Content-Dependent: Same Run, 14 to 31 Tokens Per Second

I added a numerical output contract to my Mistral prompt and watched throughput drop by half on the same hardware. Then the naturalize step in the same pipeline run hit 31 tok/s. Live SGLang logs explain why, and what to do about it.

#fix #devops #mistral #sglang

Voxtral Podcast Audio: Mono 24 kHz Baseline and Three Compression Pitfalls

How we fixed loudness pumping, markup stripping, and dialogue rhythm in a self-hosted podcast pipeline.

#fix #devops #voxtral

The 3.5-Hour Deadlock That Was Really an AttributeError

How a three-line Python init-order bug masqueraded as a Blackwell GPU hang, and why checking raw logs beat all hardware theories.

#fix #devops #mistral #tts #voxtral

Build a Self-Hosted Knowledge Base with Plain Text and LLMs

A practical guide to setting up a searchable, growing knowledge base using Markdown files, JSON indexing, and local LLMs; no vector stores required.

#setup #mcp #mistral

Sovereign AI Grid: What's Working and What Comes Next

Status snapshot of what is running on this stack today and what is being built next. For returning readers. New here? Read 'Self-Hosted AI: Start Here' first.

#strategy #mcp #nostr

Voxtral Stage 1 OOM on GB10: Why --enforce-eager Is Not Enough

How a single flag killed my self-hosted TTS stack, and how I fixed it without losing a second of audio.

#fix #devops #tts #voxtral

Feedbin and Lightning V4V: Tipping RSS Authors Through Alby

Learn how to implement Value-4-Value payments in Feedbin using podcast:value tags and the Alby bounty for a zap button.

#setup #lightning #nostr

Alby Lightning Wallet: From Zero to Sovereign AI Tipper in 30 Minutes

Learn how to install Alby, receive your first Lightning payment, and become a Nostr power user with Zaps, all without trusting a bank or exchange.

#setup #lightning #nostr