Gitea as Source-of-Truth for AI Pipelines

May 20, 2026 10 min read

A self-hosted Gitea instance holds the prompts, the systemd unit files, the runbooks, the customer-data references, and the model identifiers for the sovgrid AI stack. The pattern is mundane: put everything in Git, run Gitea on your own hardware, and the AI pipeline becomes auditable and reproducible.

Quick Text

What goes in Gitea: every prompt, every config file, every systemd unit, every runbook, every dispatcher rule, every customer-specific deployment manifest. If a future operator needs to reconstruct the system, the Git repo has everything.

What does not go in Gitea: model weights (huge, public-reproducible), container images (rebuildable from Dockerfiles), secrets (separate secret-management).

Why self-hosted: GitHub is rented sovereignty. A Gitea instance under your control is the sovereign equivalent.

The integration pattern: the AI services read their configuration from local files that are kept in sync with Gitea via a regular pull. No live network call to the Git server during inference.

The cost: running Gitea is real operational work (database, backups, version updates). The benefit is the audit trail and the disaster-recovery story.

What goes into Git

Everything that the AI pipeline depends on that is not public-reproducible.

Prompts. The system prompts, the user-facing prompts, the few-shot examples that anchor the model’s behavior. Every prompt change is a commit; every commit has a message explaining the change; every prompt is reviewable in git log. (This is the operational version of the broader prompt-as-code argument.)

Configuration files. The vLLM startup flags, the SGLang environment variables, the dispatcher routing rules. Every change is a commit; every deployment is a known commit hash. The systemd unit files are in Git, the Caddyfile is in Git, and the Cloudflare Tunnel config is in the history: its 2026-05-24 retirement was itself a commit, which is exactly the point.

Systemd unit files. Each long-running service has a .service file in the repository. The unit files follow the patterns from systemd Patterns for Self-Hosted AI Services. When a file changes, the change is reviewable; when the production system needs to be reconstructed, the unit files are in the repo.

Runbooks. The recovery procedures, the failure-mode catalogs, the disaster-response checklists. Every runbook is a markdown file in /runbooks/ of the repo. The runbook from Power Failure Recovery on a DGX Spark is one of these.

Model identifiers. Not the model weights, but the canonical identifier strings (Hugging Face repo names, commit hashes, quantization variants) that uniquely identify which weights to download. The identifiers are small text; the weights they refer to are large but reproducible.
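One way to keep such identifiers honest is to validate them before they are committed. The following is a minimal sketch, not the sovgrid schema: the class name ModelRef and its three fields are hypothetical, chosen only to mirror the repo name, commit hash, and quantization variant mentioned above. It assumes revisions are recorded as full 40-character commit hashes, so a moving branch name such as "main" is rejected.

```python
import re
from dataclasses import dataclass

# Hypothetical schema: field names are illustrative, not taken from the sovgrid repo.
_FULL_SHA = re.compile(r"^[0-9a-f]{40}$")
_REPO_ID = re.compile(r"^[\w.-]+/[\w.-]+$")


@dataclass(frozen=True)
class ModelRef:
    repo_id: str        # Hugging Face repo name in "org/name" form
    revision: str       # full 40-character commit hash, never a branch name
    quantization: str   # free-form variant label

    def validate(self) -> None:
        """Raise ValueError if the reference would not pin one exact set of weights."""
        if not _REPO_ID.match(self.repo_id):
            raise ValueError(f"repo_id must look like 'org/name': {self.repo_id!r}")
        if not _FULL_SHA.match(self.revision):
            raise ValueError(f"revision must be a full commit hash: {self.revision!r}")
        if not self.quantization:
            raise ValueError("quantization label must not be empty")
```

A check like this could run as a pre-commit hook or in CI, so that a manifest pointing at a branch rather than a commit never reaches the main branch.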

Customer-specific deployment manifests. Per-customer configuration, the specific routing rules for that customer’s data, the systemd overrides if the customer has tuning requirements. Each customer’s manifest is in a separate directory; the directory is private to the customer’s engagement.

What does not go into Git

Model weights. Too large, public-reproducible from Hugging Face. The Git repo references the identifier; the weights are downloaded on demand.

Container images. Reproducible from Dockerfile plus the upstream image layers. Store the Dockerfile in Git, not the built image.

Secrets. Lightning seed phrases, API tokens, TLS private keys. These belong in a separate secret-management mechanism (a hardware wallet for the Lightning seed, systemd LoadCredential= for service tokens, an offline-stored file for the TLS keys). Putting secrets in Git, even an encrypted Git repo, is the kind of mistake that produces incidents later.

Customer data. PII, audio recordings, source documents. These have separate compliance handling and do not belong in the operational source-control. (See Sovereign AI Healthcare: GDPR / HIPAA / DGX Spark, publication pending, for the customer-data-handling patterns.)

Why webhook-driven integration beats polling

The pull-sync timer (described under “The integration pattern” below) is polling: it checks for updates every 15 minutes regardless of whether anything changed. That is acceptable for the current stack size, which is why I have kept the timer pattern rather than replacing it.

The upgrade path is webhook-driven integration. Gitea supports outbound webhooks on push events. Instead of polling on a fixed interval, the Spark receives an HTTP POST from Gitea the moment a push lands. The AI service can reload its configuration within seconds rather than within 15 minutes.

I have not deployed the webhook pattern yet, because the 15-minute lag is not a problem for this stack today (I tested this in April 2026 with the prompt-tweak workflow). A webhook receiver on the Spark that is reachable from Floki adds a network dependency that the polling approach avoids. Polling also prevents a class of failure: if the Spark’s webhook listener is down, a pushed config change silently fails to propagate. With polling, the worst case is a 15-minute delay, not a silent miss.

The comparison between the two approaches: polling is simpler, self-healing (the timer fires independently of the push path), and correct for low-frequency config changes. Webhook-driven integration is lower-latency and more efficient, but requires an additional inbound network path and a reliable listener. For AI pipeline config that changes at most a few times per day rather than hundreds of times, polling wins on simplicity.
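If the webhook path is ever adopted, the listener has to authenticate each delivery before acting on it. Below is a minimal sketch of that one step, under the assumption (worth checking against the Gitea documentation for the deployed version) that Gitea signs the raw request body with HMAC-SHA256 using the webhook secret and sends the hex digest in the X-Gitea-Signature header. The function name signature_is_valid is hypothetical, and the HTTP server around it is deliberately left out.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, header_value: str) -> bool:
    """Compare the received signature header with an HMAC-SHA256 of the raw body.

    Assumption: the sender puts the hex digest in a header such as
    X-Gitea-Signature. Verify this against your Gitea version before relying on it.
    """
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    received = header_value.strip().lower()
    # compare_digest runs in constant time, which avoids leaking the digest via timing.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

The body must be the exact bytes received, not a re-serialized JSON object, because any whitespace difference changes the digest. A listener built this way would plausibly keep the polling timer as a fallback, so that a missed delivery still costs at most one polling interval.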

Why self-hosted Gitea rather than GitHub

GitHub is convenient and works. For sovgrid, GitHub is rented sovereignty. The repository content is held by a third party (Microsoft, indirectly), the third party has the option to suspend or delete the account, and the third party’s terms of service can change.

Self-hosted Gitea moves the source-control plane onto the operator’s own infrastructure. The Gitea instance runs on the Floki VPS, alongside Caddy and the MCP server. The cost is roughly 200 MB of memory, a small SQLite database, and the operational work of keeping Gitea patched.

The benefit is sovereignty on the source-control dimension. The Gitea instance does not depend on any third party for its operation. If Microsoft changes its mind about sovgrid, the operation continues unaffected. If the Floki VPS goes away, the Git history is also backed up to the Spark via routine pull, so the disaster-recovery path exists.

The Gitea instance also serves as the development-time CI runner for the blog deploy pipeline. The same self-hosted infrastructure handles source control and continuous integration; there is no GitHub Actions equivalent rental.

For the broader Gitea setup, see Setup: Gitea Setup and the integration with OpenHands at Fixes: OpenHands Gitea Integration.

The integration pattern

The AI services do not talk to Gitea during inference. Reading the Git server on every inference call would introduce latency, a network dependency on the Floki VPS, and a coupling between the inference path and the source-control system.

Instead, a git pull job on the Spark runs every 15 minutes (via systemd timer). The pull updates the local working copy of the relevant repository. The AI services read their configuration from the local working copy, not from Gitea directly.

The pattern means:

Inference is local; no network call for config.
Configuration changes propagate within 15 minutes of being pushed.
If Gitea is unreachable, the inference services keep running with the last-known-good configuration.
If the operator’s laptop is unavailable, the production stack still runs on whatever was last pulled.

The 15-minute lag is acceptable for the kind of changes that go through Gitea (prompt tweaks, runbook updates, configuration adjustments). For time-sensitive changes that must apply immediately, push the commit, trigger a git pull manually on the Spark, and then run systemctl reload on the relevant service.
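The "last-known-good" behavior can also be enforced inside the service itself, not only by the pull job. The sketch below is illustrative and assumes a JSON config file, which the article does not specify; the class name LocalConfig is hypothetical. It re-reads the file only when its modification time changes, and keeps serving the previous value if the file is missing or fails to parse.

```python
import json
from pathlib import Path
from typing import Any, Optional


class LocalConfig:
    """Reads a JSON config from the local working copy and keeps the last good parse."""

    def __init__(self, path: Path) -> None:
        self._path = path
        self._mtime_ns: Optional[int] = None
        self._value: Optional[dict[str, Any]] = None

    def get(self) -> dict[str, Any]:
        try:
            mtime_ns = self._path.stat().st_mtime_ns
            if mtime_ns != self._mtime_ns:
                # Parse first, record the mtime second: a failed parse is retried next call.
                self._value = json.loads(self._path.read_text(encoding="utf-8"))
                self._mtime_ns = mtime_ns
        except (OSError, json.JSONDecodeError):
            # File missing or half-written: keep the last good value if there is one.
            if self._value is None:
                raise
        return self._value
```

No network call appears anywhere in this path; the only dependency during inference is the local filesystem.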

Setting up the pull sync: concrete steps

This is what the pull-sync setup looks like on the Spark. I set this up in April 2026 when the stack grew past three repos and manual pulls became error-prone.

Clone the repo once into the working location: git clone git@floki:sovgrid/sovereign-ops.git /data/projects/sovereign-ops
Create /etc/systemd/system/gitea-pull-sovereign-ops.service with the unit content below.
Create the matching .timer unit that fires every 15 minutes.
Run systemctl enable --now gitea-pull-sovereign-ops.timer to activate.

The service file (stored in /data/projects/sovereign-ops/systemd/gitea-pull-sovereign-ops.service):

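[Unit]
Description=Pull sovereign-ops config from Gitea
# After= only orders startup; Wants= is what pulls network-online.target in.
Wants=network-online.target
After=network-online.target

[Service]
# oneshot: the unit runs once per timer activation and exits.
Type=oneshot
User=cipherfox
WorkingDirectory=/data/projects/sovereign-ops
# --ff-only refuses to merge, so a diverged working copy fails loudly.
ExecStart=/usr/bin/git pull --ff-only origin main
StandardOutput=journal
StandardError=journal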

The AGENTS.md pattern for multi-agent sessions mirrors this: every agent session begins with an explicit pull to avoid acting on stale config. The pattern in AGENTS.md is three lines:

# At session start -- pull before any read or write
cd /data/projects/sovereign-ops
git pull --ff-only origin main

This prevents the class of bug where two concurrent sessions diverge because each read a different version of the same config file.
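A complementary guard is to check how old the working copy is before trusting it. The sketch below is a heuristic, not part of the AGENTS.md pattern: it assumes git rewrites .git/FETCH_HEAD on every fetch or pull, so the file's modification time approximates the last sync. Whether a failed fetch also touches that file is worth verifying on the installed git version. The function names and the 30-minute default (two polling intervals) are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def seconds_since_last_fetch(repo: Path, now: Optional[float] = None) -> Optional[float]:
    """Age of .git/FETCH_HEAD in seconds, or None if the file does not exist."""
    fetch_head = repo / ".git" / "FETCH_HEAD"
    try:
        mtime = fetch_head.stat().st_mtime
    except FileNotFoundError:
        return None
    return (time.time() if now is None else now) - mtime


def is_stale(repo: Path, max_age_seconds: float = 30 * 60,
             now: Optional[float] = None) -> bool:
    """True if the repo has never fetched or the last fetch is older than the limit."""
    age = seconds_since_last_fetch(repo, now)
    return age is None or age > max_age_seconds
```

An agent session or a monitoring check could call is_stale(Path("/data/projects/sovereign-ops")) and refuse to proceed, or raise an alert, when it returns True.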

Gitea version and deployment config

As of May 2026, the Gitea instance runs gitea/gitea:1.23.1 on Floki. The docker-compose.yml for the instance:

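services:
  gitea:
    image: gitea/gitea:1.23.1
    restart: unless-stopped
    ports:
      # Loopback only; Caddy terminates TLS and proxies to this port.
      - "127.0.0.1:3002:3000"
    volumes:
      - /home/cipherfox/gitea-data:/data
    environment:
      - GITEA__database__DB_TYPE=sqlite3
      - GITEA__server__ROOT_URL=https://git.sovgrid.org
    labels:
      # Opt this container out of Watchtower auto-updates.
      - "com.centurylinklabs.watchtower.enable=false"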

The 127.0.0.1:3002 binding is intentional: Gitea is not exposed directly to the internet. Caddy reverse-proxies the public domain git.sovgrid.org to localhost:3002. This is the loopback policy that applies to every service in the stack (as described in the Floki hardening notes). Port 3002 is an internal-only binding, not a public service.

The gitea/gitea:1.23.1 image tag is pinned, not latest. Watchtower is disabled for this container via the com.centurylinklabs.watchtower.enable=false label. The label was introduced in May 2026, when the auto-update policy that had caused the vLLM restart cycle was extended to infrastructure containers. Gitea updates are manual and deliberate.
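Because the compose file lives in Git, the pinning rule can be checked mechanically. The following is a rough sketch that scans compose text for image references with no tag or with the latest tag. It uses a regular expression rather than a YAML parser to stay within the standard library, so it only understands simple "image: name:tag" lines; the function name unpinned_images is hypothetical.

```python
import re

# Matches simple "image: ref" lines; commented-out lines do not match.
_IMAGE_LINE = re.compile(r"^\s*image:\s*[\"']?([^\s\"'#]+)", re.MULTILINE)


def unpinned_images(compose_text: str) -> list[str]:
    """Return image references that have no tag or use the 'latest' tag."""
    flagged = []
    for ref in _IMAGE_LINE.findall(compose_text):
        if "@sha256:" in ref:
            continue  # pinned by digest
        # Only look after the last "/" so a registry port is not mistaken for a tag.
        last_segment = ref.rsplit("/", 1)[-1]
        tag = last_segment.partition(":")[2]
        if tag in ("", "latest"):
            flagged.append(ref)
    return flagged
```

Run against the compose file above, this should return an empty list; a reference such as example/app:latest would be reported.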

Where Gitea is the wrong choice

There are three situations where self-hosted Gitea is not the right answer. I list them because I have seen the failure modes directly.

Single-person teams where GitHub’s network effects matter. Gitea replicates the Git protocol but not the ecosystem: no Dependabot, no Actions marketplace, no Copilot integration, no security advisories feed. For a project that depends on the open-source contribution graph, GitHub’s network effects outweigh the sovereignty benefit. Gitea is correct for sovereign infrastructure, not for catching community patches.

Situations where operational cost is not acceptable. Gitea requires a database, a backup procedure, a patching schedule, and someone who handles the “Gitea won’t start after a kernel update” incident at 2 AM. For a single developer with no ops background, this is real overhead, not theoretical. The 200 MB memory footprint is not the cost; the cognitive load of owning another service is. If that load is not acceptable, a private GitHub organization is a reasonable trade-off. The sovereignty cost is real but so is the operational cost.

Environments where the Git server must federate with a corporate identity provider. Gitea v1.23.1 has LDAP and OIDC support. I tested it in April 2026 against a test instance, and it works, but the integration surface is complex. A corporate GitLab or GitHub Enterprise with existing SSO integration involves far less friction than standing up Gitea’s OIDC config from scratch. Don’t replicate solved problems.

The customer-engagement variant

For customer engagements, the Gitea pattern adapts to the customer’s source-control policy. Three configurations.

Customer uses our Gitea. The customer’s engagement has a private repository in our Gitea instance. The customer has read access via SSH key. This is the simplest pattern and works when the customer accepts the rented-from-us dimension.

Customer uses their own Git. The customer has a corporate GitLab, GitHub Enterprise, or other Git server. The engagement-specific repository lives there. Our Spark pulls from the customer’s Git via SSH-key authentication. The customer’s Git server is the source of truth for that engagement’s artifacts.

Customer requires an air-gapped deployment. No network connection between our Git and the customer’s deployment. The artifacts are physically transferred (USB key, signed tarball, signed-and-encrypted email). The operational overhead is high; some defense and finance engagements require this anyway.

Where this fits

For the broader reference architecture, see The Sovereign AI Stack in 2026. For the systemd unit files that the Git repository holds, see systemd Patterns for Self-Hosted AI Services. For the broader Gitea operational setup, see Setup: Gitea Setup.

A future article publishes the template repository structure for customer engagements, including the directory layout, the artifact-classification rules, and the customer-handoff procedure at engagement end. Subscribe via the footer.
