A weekly automation pipeline that repairs, updates, and optimizes self-hosted blog articles in place using Mistral and SearXNG.

Automate Better Blog Posts: Self-Hosted Article Optimization That Actually Works


The Optimizer Script

cd /data/projects/sovereign-blog && python3 scripts/optimize_articles.py --eeat-below 4 --no-research

This command optimizes all articles with EEAT scores below 4, skipping web research for faster runs. After completion, rebuild your site:

npm run build

The script checks links, analyzes EEAT gaps, and updates articles in place while preserving your original structure and voice.


def check_external_links(content: str) -> list[tuple[str, int]]:
    broken_links = []
    for link in extract_external_links(content):
        status = head_request(link)
        if status in {404, 410, "timeout", "error"}:
            broken_links.append((link, status))
    return broken_links

The script:

  1. Scans your article for external links
  2. Sends HEAD requests to each URL
  3. Logs broken links (404/410/timeouts)
  4. Uses SearXNG to find replacements from the same domain or relevant pages
  5. If no replacement is found, removes the link but keeps the anchor text

Internal links (localhost, host.docker.internal) are skipped, they’re not meant to be externally accessible.


EEAT Gap Analysis Without Guesswork

def compute_eeat_gaps(content: str) -> dict[str, int]:
    return {
        "expertise": max(0, 8 - count_code_blocks(content)),
        "experience": max(0, 12 - count_specificity_markers(content)),
        "authority": max(0, 1200 - count_words(content)),
        "trust": max(0, 9 - count_caveats(content))
    }

The script calculates exact gaps:

Each gap generates a specific instruction like “Authority gap: 648 words short, expand the most technical sections with additional detail and context.” or “Trust gap: only 1 caveat found, add 8 more honest limitation notes.”


Web Research That Actually Helps

curl "http://localhost:8888/search?q=$(encode_query "$title $tags")&format=json" | jq -r '.[0:4] | .[].content'

When enabled, the optimizer:

  1. Queries SearXNG with your article title and tags
  2. Takes the top 4 snippets as fresh context
  3. Passes them to Mistral to verify facts and update references

Disable with --no-research if SearXNG is down or you want deterministic output.


Mistral’s Role: Update, Don’t Rewrite

prompt = f"""
Update this article:
{article_body}

Fix these broken links:
{broken_links}

Fill these EEAT gaps:
{eeat_feedback}

Use this web context to verify facts:
{web_context}

Rules:
- Preserve structure and voice
- Make minimal targeted edits
- Never rewrite from scratch
- Temperature: 0.2 (conservative)
"""

Mistral receives the full article body plus specific instructions. The prompt explicitly forbids rewriting, only repair, update, and optimize. Temperature 0.2 ensures conservative edits with minimal hallucination risk.


What Stays Unchanged

date: 2026-04-16
heroImage: /images/cleanup-hero.png
protected: false
tags: [linux, cleanup]

The optimizer never touches:

Protected articles (protected: true) are always skipped, even with --force.


Cron Schedule for Zero Effort

0 4 * * 0 cd /data/projects/sovereign-blog && \
python3 scripts/optimize_articles.py --eeat-below 5 --no-research >> /data/logs/optimizer.log 2>&1 && \
npm run build

Run weekly on Sunday at 4 AM:

Articles with perfect 5/5/5/5 EEAT scores are skipped automatically, making each run faster over time.


What I Actually Use

  • Mistral Small 4: Handles targeted edits without rewriting my voice
  • SearXNG: Finds fresh sources without tracking or ads
  • Astro: Builds static sites with zero runtime overhead