Automate Better Blog Posts: Self-Hosted Article Optimization That Actually Works
The Optimizer Script
cd /data/projects/sovereign-blog && python3 scripts/optimize_articles.py --eeat-below 4 --no-research
This command optimizes all articles with EEAT scores below 4, skipping web research for faster runs. After completion, rebuild your site:
npm run build
The script checks links, analyzes EEAT gaps, and updates articles in place while preserving your original structure and voice.
How Link Repair Works
def check_external_links(content: str) -> list[tuple[str, int]]:
broken_links = []
for link in extract_external_links(content):
status = head_request(link)
if status in {404, 410, "timeout", "error"}:
broken_links.append((link, status))
return broken_links
The script:
- Scans your article for external links
- Sends HEAD requests to each URL
- Logs broken links (404/410/timeouts)
- Uses SearXNG to find replacements from the same domain or relevant pages
- If no replacement is found, removes the link but keeps the anchor text
Internal links (localhost, host.docker.internal) are skipped, they’re not meant to be externally accessible.
EEAT Gap Analysis Without Guesswork
def compute_eeat_gaps(content: str) -> dict[str, int]:
return {
"expertise": max(0, 8 - count_code_blocks(content)),
"experience": max(0, 12 - count_specificity_markers(content)),
"authority": max(0, 1200 - count_words(content)),
"trust": max(0, 9 - count_caveats(content))
}
The script calculates exact gaps:
- Expertise: Needs 8+ code blocks (```)
- Experience: Needs 12+ specificity markers (version strings, file paths, error outputs)
- Authority: Needs 1200+ words total
- Trust: Needs 9+ caveats/warnings/notes
Each gap generates a specific instruction like “Authority gap: 648 words short, expand the most technical sections with additional detail and context.” or “Trust gap: only 1 caveat found, add 8 more honest limitation notes.”
Web Research That Actually Helps
curl "http://localhost:8888/search?q=$(encode_query "$title $tags")&format=json" | jq -r '.[0:4] | .[].content'
When enabled, the optimizer:
- Queries SearXNG with your article title and tags
- Takes the top 4 snippets as fresh context
- Passes them to Mistral to verify facts and update references
Disable with --no-research if SearXNG is down or you want deterministic output.
Mistral’s Role: Update, Don’t Rewrite
prompt = f"""
Update this article:
{article_body}
Fix these broken links:
{broken_links}
Fill these EEAT gaps:
{eeat_feedback}
Use this web context to verify facts:
{web_context}
Rules:
- Preserve structure and voice
- Make minimal targeted edits
- Never rewrite from scratch
- Temperature: 0.2 (conservative)
"""
Mistral receives the full article body plus specific instructions. The prompt explicitly forbids rewriting, only repair, update, and optimize. Temperature 0.2 ensures conservative edits with minimal hallucination risk.
What Stays Unchanged
date: 2026-04-16
heroImage: /images/cleanup-hero.png
protected: false
tags: [linux, cleanup]
The optimizer never touches:
- Publish dates
- Images
- Featured status
- Protection flags
- Tags
- Affiliate links
- Diagram references
Protected articles (protected: true) are always skipped, even with --force.
Cron Schedule for Zero Effort
0 4 * * 0 cd /data/projects/sovereign-blog && \
python3 scripts/optimize_articles.py --eeat-below 5 --no-research >> /data/logs/optimizer.log 2>&1 && \
npm run build
Run weekly on Sunday at 4 AM:
- After the main pipeline publishes new content
- When Mistral is less busy
- With
--no-researchfor speed - Logs output for debugging
Articles with perfect 5/5/5/5 EEAT scores are skipped automatically, making each run faster over time.
What I Actually Use
- Mistral Small 4: Handles targeted edits without rewriting my voice
- SearXNG: Finds fresh sources without tracking or ads
- Astro: Builds static sites with zero runtime overhead