AIDE + Tripwire for AI Boxes: When File Integrity Matters
File-integrity monitoring is unnecessary on most workstations and necessary on the specific category of “machines that run other people’s inference.” The DGX Spark in a sovereign-AI consulting engagement is in the necessary category. Here is how to wire AIDE without producing the alert fatigue that makes most file-integrity deployments useless within six months.
Quick Take
- When file integrity matters: the machine runs customer workloads, the machine hosts the customer’s data, the customer’s contract requires audit-grade evidence of file state, or the regulatory regime (HIPAA, GDPR Art 32, FIPS) requires it.
- When it does not: development workstations, hobbyist setups, machines whose data the operator alone owns. Adding AIDE here produces noise without security benefit.
- The tool pick: AIDE for the baseline, Tripwire for the audit-grade case where a customer’s CISO requires the specific tool. Both ship the same fundamentals.
- The discipline: baseline the system in a known-good state, exclude the noisy paths explicitly, schedule the verification daily, ship the diff to a tamper-resistant log.
- The trap: initial AIDE deployment will produce thousands of diffs because the operator did not exclude the routinely-changing paths. Exclude before you alert.
When file integrity actually matters
The honest answer for most operators is “it does not, do not bother.” File-integrity monitoring (FIM) is a control designed to detect unauthorized modification of system files. The threat model is an attacker who has gained partial access to the host and is modifying files (planting backdoors, modifying configuration, replacing binaries). On a single-operator workstation, this threat is real but not high-probability, and AIDE will spend most of its life sending noise to a dashboard nobody reads.
The category where FIM crosses the bar from “nice to have” to “operationally required” is the machine that runs customer workloads. The reasons:
The customer’s contract may require audit-grade evidence that the operator did not modify the customer’s data or the inference path. AIDE produces the evidence. Without it, the operator’s word against the customer’s CISO is the only available answer, and the operator loses that contest by default.
The regulatory regime may require it. HIPAA Security Rule §164.312(c)(1) requires covered entities to “implement policies and procedures to protect electronic protected health information from improper alteration or destruction.” FIM is a standard way to meet that requirement. GDPR Article 32 uses similar language, naming the “integrity” of processing systems as a security requirement. FIPS-validated environments often require FIM as part of the validation.
The threat model may justify it. A consulting engagement where the customer is in a regulated industry, has a known threat actor concerned about their data, or has a CISO who has flagged FIM as a control requirement: in these cases, the FIM is part of the deliverable.
If none of those reasons apply to your situation, skip AIDE. The tool is correct for its niche and wrong outside it.
Why AIDE rather than the alternatives
AIDE (Advanced Intrusion Detection Environment) is the standard open-source FIM for Linux systems. It ships in every major distribution’s repository, the configuration is mature, and the format of its diff output is parseable by most log-aggregation tools.
Tripwire is the commercial alternative, with a free open-source version that has fallen behind in maintenance. It is sometimes required by name in customer contracts because the customer’s compliance team has a checklist that mentions “Tripwire” as the canonical control. If your customer requires Tripwire by name, you use Tripwire; otherwise AIDE is the better-maintained choice.
Other tools (Samhain, OSSEC, Wazuh) exist and are reasonable. Wazuh in particular bundles FIM with broader SIEM functionality. For a customer engagement that requires both FIM and SIEM, Wazuh is the integrated answer; for the standalone FIM case, AIDE is simpler.
The baseline and the exclusion list
The first AIDE deployment will produce thousands of diffs the next day because Linux systems routinely modify files in ways that have nothing to do with security. The fix is the exclusion list, applied before the baseline.
The exclusion list for a Spark in a sovereign-AI consulting role:
# Don't watch the package manager's state
!/var/lib/dpkg
!/var/lib/apt
!/var/cache/apt
# Don't watch routine system state
!/var/log
!/run
!/proc
!/sys
!/tmp
# Don't watch routinely-changing user state
!/root/.bash_history
!/home/operator/.local/share
# Don't watch model and data working directories
!/data/models
!/data/hf-cache
!/data/customer-workloads
# Don't watch container engine state
!/var/lib/docker
!/var/lib/containerd
# journald's files live in /var/log/journal, which !/var/log above already excludes
The list looks long because Linux is full of routinely-changing files that are not security-relevant. Without the exclusions, every daily AIDE run produces ten thousand diffs and the operator stops reading the output.
What stays included: /etc, /usr/bin, /usr/sbin, /usr/lib, /usr/local, the systemd unit directories, the SSH configuration, the AIDE configuration itself, and any application binary or configuration that is part of the customer engagement’s contract.
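One way to decide what belongs on the exclusion list is to run a trial check and rank the directories by how many changed paths they contain. The sketch below assumes you have already extracted one absolute path per line from the report. The function name `noisy_prefixes` and its depth argument are hypothetical helpers, not part of AIDE.

```shell
#!/bin/bash
# Sketch: rank directory prefixes by how many changed paths they contain.
# Input (stdin): one absolute path per line, e.g. pulled from a trial report.
# noisy_prefixes and its depth argument are hypothetical, not part of AIDE.
noisy_prefixes() {
    local depth="${1:-2}"
    awk -F/ -v d="$depth" '
        /^\// {
            prefix = ""
            # The path starts with "/", so the first component is field 2.
            for (i = 2; i <= d + 1 && i <= NF; i++) prefix = prefix "/" $i
            count[prefix]++
        }
        END { for (p in count) printf "%d %s\n", count[p], p }
    ' | sort -k1,1nr -k2,2
}
```

Piping the changed paths through `noisy_prefixes 3` prints the busiest three-level prefixes first. How you extract the paths depends on the report format of your AIDE version, so treat that step as something to adapt. A prefix that dominates the list is a candidate for exclusion only if it is not security-relevant.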
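After the exclusion list is in place, baseline. On Debian and Ubuntu the configuration lives at /etc/aide/aide.conf, so this and later aide invocations may need --config /etc/aide/aide.conf, or the aideinit wrapper for the baseline step: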
sudo aide --init
sudo mv /var/lib/aide/aide.db.new /var/lib/aide/aide.db
The baseline produces a database of file hashes that subsequent runs compare against.
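Because every later check trusts this database, it can help to record a digest of it at baseline time and keep that digest somewhere the host cannot write. The following is a minimal sketch using `sha256sum`; the function names and the digest-file location are hypothetical.

```shell
#!/bin/bash
# Sketch: record and later verify a SHA-256 digest of the AIDE baseline.
# Function names and file locations are hypothetical.
record_db_digest() {
    local db="$1" digest_file="$2"
    # Store only the hex digest so the record does not depend on the path.
    sha256sum "$db" | awk '{print $1}' > "$digest_file"
}

verify_db_digest() {
    local db="$1" digest_file="$2"
    local current
    current=$(sha256sum "$db" | awk '{print $1}')
    # Succeeds (exit 0) only when the stored and current digests match.
    [ "$current" = "$(cat "$digest_file")" ]
}
```

A typical use would be `record_db_digest /var/lib/aide/aide.db /mnt/offline/aide.db.sha256` right after the baseline, then `verify_db_digest` with the same arguments before each check. The `/mnt/offline` path is an assumption that stands in for whatever read-only or off-host storage the engagement uses.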
The daily verification
A systemd timer runs the verification at 02:00 daily:
# /etc/systemd/system/aide-check.service
[Unit]
Description=Daily AIDE file integrity check
[Service]
Type=oneshot
ExecStart=/usr/local/bin/aide-check.sh
# /etc/systemd/system/aide-check.timer
[Unit]
Description=Run aide-check daily
[Timer]
OnCalendar=02:00
Persistent=true
[Install]
WantedBy=timers.target
/usr/local/bin/aide-check.sh:
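#!/bin/bash
set -euo pipefail
# aide --check exits 0 when clean, 1-7 when files were added, removed, or
# changed, and 14 or higher on errors. Capture the status rather than
# discarding it, so a failed run alerts instead of passing silently.
status=0
DIFF_OUTPUT=$(aide --check 2>&1) || status=$?
if [ "$status" -ne 0 ]; then
    # Here-strings avoid the SIGPIPE failure that echo | grep -q can hit under pipefail.
    logger -t aide-check -p auth.warning <<<"$DIFF_OUTPUT"
    mail -s "AIDE diff on $(hostname) (exit $status)" operator@sovgrid.local <<<"$DIFF_OUTPUT"
fi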
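The diff goes both to the system journal (where it is preserved for audit) and to the operator’s email. The journal entry is timestamped and survives even if the operator’s email is compromised. It is tamper-evident only if journald’s Forward Secure Sealing is enabled: generate the sealing keys with journalctl --setup-keys, keep the verification key off the host, and check the journal with journalctl --verify.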
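The exit status of `aide --check` carries the same verdict as the report text and is easier to route on. Per the aide(1) manual, statuses 1 through 7 are a bitmask (1 for new files, 2 for removed files, 4 for changed files), and 14 and above signal errors. The sketch below decodes that status into words; the function name is hypothetical, and treating everything above 7 as an error is a simplifying assumption.

```shell
#!/bin/bash
# Sketch: translate an `aide --check` exit status into words.
# Bitmask per aide(1): 1 = new files, 2 = removed files, 4 = changed files.
# describe_aide_status is a hypothetical helper, not part of AIDE.
describe_aide_status() {
    local status="$1" parts=()
    if [ "$status" -eq 0 ]; then echo "clean"; return; fi
    # The manual defines error codes from 14 up; anything above 7 is
    # treated as an error here because it is not a valid bitmask value.
    if [ "$status" -gt 7 ]; then echo "error"; return; fi
    if (( status & 1 )); then parts+=("added"); fi
    if (( status & 2 )); then parts+=("removed"); fi
    if (( status & 4 )); then parts+=("changed"); fi
    echo "${parts[*]}"
}
```

For instance, `describe_aide_status 5` prints `added changed`, which makes a more useful mail subject or log tag than a bare number.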
Tamper-resistant logging
For audit-grade evidence, the AIDE database itself must be protected from modification by the attacker the FIM is supposed to detect. The pattern:
The AIDE database lives on the management host, not the Spark. The Spark sends the contents of its monitored files to the management host, which computes the hashes and stores the database. A list of file names alone would not be enough, because the management host can only hash content it actually receives. An attacker who compromises the Spark cannot modify the database without also compromising the management host.
The journald output of AIDE checks ships to a tamper-evident log host: systemd-journal-upload on the Spark sends to systemd-journal-remote on a host kept on the operator’s customer-engagement laptop, separate from the production stack. The chain of custody is: Spark generates the file content, management host hashes it, log host receives the audit record, operator can verify the chain at any point.
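The hashing step on the management host can be sketched in a few lines. This is not AIDE’s own database format, only an illustration of the idea: hash whatever arrived from the monitored host, then compare against the stored manifest. The function names and directory layout are hypothetical, and the sketch assumes GNU `find`, `sort`, and `xargs`.

```shell
#!/bin/bash
# Sketch: hash a directory of files received from the monitored host and
# compare the result with a previously stored manifest.
# Function names and directory layout are hypothetical.
build_manifest() {
    local root="$1"
    # Relative paths plus sorting keep manifests comparable between runs.
    ( cd "$root" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum )
}

manifest_changes() {
    local old="$1" new="$2"
    # diff exits 1 when the manifests differ; keep only the differing lines
    # and always return success so callers can inspect the output.
    diff "$old" "$new" | grep -E '^[<>]' || true
}
```

Running `build_manifest /srv/spark-mirror > manifest.new` and then `manifest_changes manifest.old manifest.new` prints one line per side for each file whose hash changed, and nothing when the two manifests agree. The `/srv/spark-mirror` path is an assumption for wherever the received files land.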
This is the kind of design that satisfies a customer’s CISO. The cost is real (an additional log-aggregation host, the operational discipline of running it), and it is appropriate only for engagements where the contract requires it.
Where this fits
For the broader security posture, see The Sovereign AI Stack in 2026. For the customer-engagement context that drives the FIM requirement, see Sovereign AI Healthcare: GDPR / HIPAA / DGX Spark.
Scope a Stack Audit for the customer-engagement case
If you are scoping a sovereign-AI engagement and the customer’s compliance team is asking about FIM, the Stack Audit covers the specific configuration that matches the customer’s regulatory regime. Reach me through the contact links in the footer of this page (Nostr DM is the fastest, the email link is HTML-entity-encoded so it survives spam scrapers).