Sovereign Friend-Setup: When You Build A Sovereign-AI Box For Someone Else

June 1, 2026 23 min read

I spent a recent day building a sovereign-AI setup on a Lenovo Legion for a friend who had never run Linux. The day-of mechanics are covered in the 24-hour setup-log post. This post is about the design decision behind those mechanics: what changes when the operator of a sovereign-AI box is not the same person as the daily user?

The answer surprised me, mostly in how much had to change at every layer. Most sovereign-AI write-ups assume the same person sets the system up, maintains it, and uses it. The decisions are made with one identity, one threat model, one mental model of how a given component works. When the operator is different from the daily user, almost every default that “just makes sense” for the operator becomes wrong for the user, often in ways the user will not notice for months.

This post is the catalogue of those defaults and the changes the friend-setup forces.

What the friend is not

The friend is not me. He is not a Linux operator. He has no engineering memory of how Tailscale, MCP, or Ollama got into his life. He has not chosen the stack. He has been given a working machine and a TODO list.

That framing is the entire post. Everything that follows is a consequence of it.

The default sovereign-AI write-up that goes “install Ollama, pull qwen3, point your terminal at it” assumes the reader has a terminal habit. The friend does not. The default write-up that goes “configure your Tailscale ACL to scope the friend’s access” assumes the friend is a guest in the operator’s tailnet. He is not, or rather should not be, for reasons I will get to. The default write-up that assumes the same hands type ollama pull and later type ollama run is wrong about which hands type the second one.

The identity-separation discipline

The first decision in friend-setup, and the one that propagates farthest, is that the friend gets his own identity at every layer where identity exists.

His own Tailscale account, under his own SSO provider, with his own free-tier tailnet that has nothing to do with mine.

His own GitHub account, separately created with his own email, that he uses to sign in to Tailscale and any other SSO-driven service. Not a fork of my GitHub.

His own Bitwarden vault, on his own machine, with his own master password. Not a shared vault with me.

His own OpenWebUI admin account, registered the first time he opens the web UI, with credentials only he knows. The friend’s email becomes his GitHub email becomes his Tailscale email becomes his OpenWebUI email, and the four are owned by him at every layer.

The temptation, the first time you set this up, is to make the friend a sub-user of your own infrastructure. Invite him to your tailnet. Add him as a collaborator on your GitHub. Give him an account on your OpenWebUI. Each of these is faster to set up, and each of these creates a slow-leaking failure mode that you will not notice for months.

If the friend is in your tailnet, his machines are in your admin console. You can see his hostnames, his IP assignments, his connection logs. The relationship is asymmetric in a way he probably does not understand at the point he accepts the invitation. If he ever leaves your social orbit, he loses his VPN. If you lose access to your tailnet for any reason that has nothing to do with him (account suspended, SSO provider acquired, identity stolen), his ability to reach his own infrastructure breaks.

The same logic applies to every other layer. Identity-separation is not a privacy preference. It is a structural property of an arrangement that has to survive both parties’ relationships changing over time.

Identity-separation does not mean the friend cannot use my server. The Tailscale node-sharing primitive solves the access question cleanly without violating identity-separation.

Mechanism: I share one specific node (the DGX Spark in my apartment) to his tailnet by inviting his email address. He accepts. The node appears in his admin console as shared from cipherfoxie@github. He can reach it by its 100.x.x.x IP exactly as if it were a node in his own tailnet. His tailnet remains his.

Then I do the same in reverse: he shares his laptop back to my tailnet so that I can SSH in for support when he asks for it. Two tailnets, two shared nodes, four-way symmetry, no shared identity.

The ACL discipline matters. By default a shared node is wide-open: the recipient tailnet can reach any port on it. The right thing is to scope the share to exactly the service the recipient should be able to use. In my case, the friend should only be able to reach the vLLM endpoint on the Spark (port 30001), not the dashboard (8443), not the SSH daemon (22), not the MCP bridge (8770). One ACL line restricts him to one port. I tested it: from his laptop, a port scan against my Spark’s tailnet IP returns one open port and seven blocked.
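The port-scan check described above can be approximated without a dedicated scanner. What follows is a minimal sketch, not the test I actually ran: probe_ports is a hypothetical helper, the address in the usage comment is a placeholder, and the four port numbers are the ones named in this section.

```python
import socket


def probe_ports(host: str, ports: list[int], timeout: float = 1.0) -> dict[int, bool]:
    """Return {port: True if a TCP connection succeeded, else False}."""
    results: dict[int, bool] = {}
    for port in ports:
        try:
            # create_connection completes a full TCP handshake; success means
            # the packet was allowed through and something is listening.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and unreachable hosts.
            results[port] = False
    return results


# Example (placeholder address; run from the recipient's laptop):
#   probe_ports("100.x.x.x", [30001, 8443, 22, 8770])
# With the scoped share in place, only 30001 should map to True.
```

A plain TCP connect cannot tell "blocked by the ACL" apart from "nothing is listening", so the check is only meaningful when the services on the other ports are known to be running.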

This pattern generalizes. The two-tailnet privacy post covers it in depth, including the exact ACL JSON. The pattern works for any family-sysadmin case: a parent who wants to access your self-hosted file backup, a partner who wants to use your Lightning node, a sibling who wants to chat against your local LLM. Each one gets their own tailnet and one scoped share. Nobody is a guest in anyone else’s tailnet.

Default-local at the application layer

Identity-separation gets the network layer right. There is a second layer that matters just as much: application-layer defaults.

When the friend opens OpenWebUI, the dropdown of available models includes both the local Ollama models and the shared remote DGX Spark Qwen-3.6. The temptation is to make the larger DGX Spark model the default because it produces higher-quality output. The right thing is to make the local model the default because it leaks nothing across the network boundary.

Privacy-by-default at the application layer means: the friend has to make a deliberate choice to send a query across the network. If he asks a casual question, the local model answers. If he wants the bigger model because the local one missed something, he clicks the dropdown, picks DGX Spark Qwen-3.6, and at that moment he has consented to the metadata that crosses the boundary.

The configuration for this is one line in the OpenWebUI compose file: DEFAULT_MODELS=qwen3:8b. The first model in the comma-separated list is the default the friend sees on every new chat. The shared model is in the list but not first.
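A small guard can keep this ordering from drifting when the model list is edited later. The sketch below is an illustration under assumptions: first_default and default_is_local are hypothetical helpers, the set of local model tags is made up for the example (only qwen3:8b appears in this post), and the code inspects the string value only, not OpenWebUI itself.

```python
def first_default(default_models: str) -> str:
    """Return the first entry of a comma-separated DEFAULT_MODELS value."""
    entries = [m.strip() for m in default_models.split(",") if m.strip()]
    if not entries:
        raise ValueError("DEFAULT_MODELS is empty")
    return entries[0]


def default_is_local(default_models: str, local_models: set[str]) -> bool:
    """True when the model a new chat starts on is one of the local models."""
    return first_default(default_models) in local_models


# Assumed tags for the local models; adjust to whatever `ollama list` shows.
LOCAL_MODELS = {"qwen3:8b", "mistral:7b", "llama3.1:8b"}

# "dgx-spark-qwen" is a hypothetical identifier for the shared remote model.
print(default_is_local("qwen3:8b,dgx-spark-qwen", LOCAL_MODELS))
print(default_is_local("dgx-spark-qwen,qwen3:8b", LOCAL_MODELS))
```

Running a check like this after every change to the compose file is cheap, and it turns "the local model is first" from a habit into something that fails loudly.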

There is a description discipline that goes with this. The DGX Spark Qwen-3.6 entry in the OpenWebUI model list includes a warning paragraph: “This model runs on someone else’s server. He can see metadata about your calls (timestamp, source IP, token counts) but not the content (prompts and responses are not logged at INFO level). If your question is private, prefer the local model.” The warning is the consent mechanism. Without it, the friend cannot meaningfully consent because he does not know what he is consenting to.

The honest description of what the operator sees, in this case me, matters. I see metadata. I see when the friend made a request. I see the source IP, which tells me the request came from his laptop. I see the prompt and completion token counts. I do not see the prompt body or the response body, because vLLM logs at INFO level and the bodies are only at DEBUG. If I wanted stronger privacy than this, the next step would be to add --disable-log-requests to the vLLM service args and lose the metadata too. I have not done that yet. The metadata is useful for debugging load issues, the friend knows this, and the cost-benefit lands on “keep metadata, document it.” That is a defensible decision. The indefensible decision would have been to not document it.

The persona pattern

Models with personas teach faster than models without personas. This was not obvious before I built it.

The friend’s OpenWebUI ships with five models pre-configured: Qwen3 8B, Mistral 7B, Llama 3.1 8B, DGX Spark Qwen-3.6, and a custom Persona model called “Grill-Me.” Each model description includes a section called Persona that says, in plain prose, what character the model has been tuned for and how that character differs from the other models on the list.

Qwen3 is described as the precise technician: structured answers, good code output, direct style. Mistral is described as the Mediterranean one: relaxed clarity, occasional cultural resonance, no filler phrases. Llama is described as the long-form analyst: structured answers with headings, willing to develop context. DGX Spark is the bigger version of the technician, with privacy caveats. Grill-Me is the skeptic: hard feedback instead of politeness, devil’s-advocate posture, no warmth.

The cross-references in those descriptions are deliberate. Qwen’s description says “unlike Mistral (Mediterranean and relaxed), Llama (long-form analyst), and Grill-Me (skeptical and hard).” Mistral’s description says “unlike Qwen (precise technical), Llama (long analyses), and Grill-Me (sharp skeptic).” The cross-references calibrate expectations faster than any single description can.
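The cross-reference sentences do not have to be maintained by hand. As a rough sketch, assuming the personas live in one table: the describe function and the short tags below are illustrative, not the wording that ships on the friend’s machine.

```python
# One short tag per persona; the contrast sentence is derived from the table,
# so adding or renaming a persona updates every description at once.
PERSONAS = {
    "Qwen3": "precise technician",
    "Mistral": "relaxed Mediterranean",
    "Llama": "long-form analyst",
    "Grill-Me": "hard skeptic",
}


def describe(name: str, personas: dict[str, str]) -> str:
    """Build a one-line description that contrasts `name` with all others."""
    others = [f"{other} ({tag})" for other, tag in personas.items() if other != name]
    if not others:
        return f"{name} is the {personas[name]}."
    if len(others) == 1:
        contrast = others[0]
    else:
        contrast = ", ".join(others[:-1]) + ", and " + others[-1]
    return f"{name} is the {personas[name]}, unlike {contrast}."


for model in PERSONAS:
    print(describe(model, PERSONAS))
```

Generating the contrast from one table keeps the descriptions consistent with each other, which is the property that lets the friend predict the difference between two models before trying them.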

The friend recognizes a Persona he wants to talk to before he understands the spec sheet. He learns model behavior by talking to one and then talking to another. The cross-references mean he can predict the difference instead of being surprised by it. The cost of this in build time was ten minutes of writing five descriptions. The payoff is that the friend explores all five models in the first hour instead of using one of them for three weeks.

There is also a discipline note in every Persona description: a one-line Anpassbar (“customizable”) hint that says the default is just an example, the system prompt can be changed, and the friend can build his own Persona from scratch. The discipline note is load-bearing. Without it, the friend treats the defaults as canonical. With it, the friend treats them as starting points.

The Vibe-Sustaining TODO file

The hardest part of friend-setup is not the technical configuration. It is the design of the friend’s first day.

A working sovereign-AI box, handed to someone who has never run Linux, with a complete welcome document and a thorough cheatsheet, will go unused for weeks if the first session does not produce a result that the friend wants to show someone. The friend will spend two hours reading documentation, conclude that this is interesting but not for him, and close the laptop. Three days later he will fall back to ChatGPT.

The way around this is to put the first concrete result in the first 30 minutes, before the friend has formed an opinion about whether sovereign-AI is for him.

The TODO file the friend opens first is structured around this. Block 0 contains three items, each of which takes about 10 minutes and produces a tangible artifact: generate your first AI image in ComfyUI, ask Mistral a question in the voice of a relaxed Italian, ask the AI to describe the image. The first two work. The third deliberately fails because all four local models are text-only after quantization. The failure message explains the constraint: vision was dropped during the quantization step, here is the workaround. Within 30 minutes the friend has an artifact and an accurate mental model of one limitation.

Block A is Vision-Quest. Five items, 15 minutes each. The friend uses Mistral to talk through what he actually wants to make: videos, image series, written essays, a book, an app, GitHub bounty-hunting, or something else. Each item produces a note in the KB (the knowledge base). The KB grows during the Vision-Quest, which means the friend has both a destination and a record of how he got there.

Block B is “experiment with the tools, one weekend each.” Block C is “your first real piece, published somewhere.” Block D is “make money, if you want.” Block E is “maintain the system, lightly.”

The Achievements track at the bottom of the file is gamified on purpose. Every checkbox the friend marks is also a row in an Achievements list: first AI image generated, first AI conversation, first Vision Pitch written, first Plan filed, first real piece published, first bounty won. The Achievements list is the social-media-style retention pattern transferred to a TODO file. The friend can see his progress as a series of unlocks.
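The checkbox-to-achievement mapping is simple enough to derive from the TODO file itself. A minimal sketch, assuming the file uses Markdown task-list syntax (“- [ ]” and “- [x]”); the achievements function and the sample text are hypothetical, not the friend’s actual file.

```python
import re

# Matches Markdown task-list items such as "- [x] First AI image generated".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def achievements(todo_markdown: str) -> tuple[list[str], list[str]]:
    """Split task-list items into (unlocked, still_locked) titles."""
    unlocked: list[str] = []
    locked: list[str] = []
    for line in todo_markdown.splitlines():
        match = CHECKBOX.match(line)
        if not match:
            continue  # headings, prose, and blank lines are ignored
        mark, title = match.groups()
        (unlocked if mark.lower() == "x" else locked).append(title.strip())
    return unlocked, locked


SAMPLE = """\
Block 0
- [x] First AI image generated
- [x] First AI conversation
- [ ] First Vision Pitch written
"""

done, todo = achievements(SAMPLE)
print(f"{len(done)} unlocked, {len(todo)} to go")
```

Because the list is computed from the checkboxes, the TODO file stays the single source of truth and the Achievements view cannot fall out of sync with it.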

The pattern came from watching myself stay engaged for 24 consecutive hours on this setup. I wanted to know what kept me going. The answer was that every two hours produced an artifact that I could point to: the dashboard worked, the benchmark ran, the cross-tailnet ACL passed the port-scan test. Each artifact validated the previous two hours of work. The TODO file for the friend is engineered to produce the same artifact-cadence in his first day.

The dashboard as a teaching surface

The dashboard on the friend’s laptop is not a status board. It is a teaching surface.

The longer post on this covers the design pattern in detail. The short version: every metric, every container, every audit-check has an info-button that opens a side-drawer with five sections (what is this, why does it matter, pros and cons, best practice, CLI to verify yourself). Every model description has the Persona-and-Anpassbar pattern. There is a Lernen (“learn”) tab that contains a curated stack-tour and a Glossary that explains every acronym the dashboard uses. There is a Doktor (“doctor”) tab that runs concrete audit-checks and includes one-button fixes for the ones that have an obvious remediation.

The intended user behavior is: when the friend wonders what something means, he taps the info-button next to the thing. He gets the five-section explanation without leaving the dashboard. He reads it in 90 seconds. He goes back to what he was doing with one more concept internalized.
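The five-section drawer maps naturally onto a small record type. The following is a sketch of one way to model it, not the dashboard’s actual code: InfoCard, its field names, and the example content are all assumptions made for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InfoCard:
    """The five sections shown in the side-drawer behind an info-button."""

    what_is_this: str
    why_it_matters: str
    pros_and_cons: str
    best_practice: str
    cli_to_verify: str

    def missing_sections(self) -> list[str]:
        """Names of sections that are still empty, in display order."""
        return [name for name, text in asdict(self).items() if not text.strip()]


# Illustrative content only.
ollama_card = InfoCard(
    what_is_this="The container that serves the local language models.",
    why_it_matters="If it is down, the default local model cannot answer.",
    pros_and_cons="Private and offline-capable; limited by laptop hardware.",
    best_practice="Leave it running; restart it only from the dashboard.",
    cli_to_verify="docker ps --filter name=ollama",
)

print(ollama_card.missing_sections())
```

Treating an incomplete card as an error at build time is one way to make sure no metric ships with a button that opens a half-empty drawer.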

The contrasting design, status-board-as-museum, expects the user to know what every metric means or to go elsewhere to find out. The friend will not go elsewhere. He will either know what it means or assume it means nothing and move on. The status-board pattern accepts the second outcome. The teaching-surface pattern rejects it.

The honest privacy caveat

Identity-separation, default-local models, ACL-scoped node-sharing, and warning paragraphs in model descriptions add up to “the friend does not leak anything by accident.” They do not add up to “the friend leaks nothing.”

If the friend uses the shared DGX Spark Qwen model, I see request metadata. If the friend writes a KB note that he later asks the local model about, the local model’s context window briefly contains that note. If the friend’s laptop is ever physically stolen, the disk is LUKS-encrypted but the Windows partition next to it is not currently re-encrypted with BitLocker (the friend’s choice, documented in a TODO item, with the trade-off written out).

The honest way to frame these caveats is to enumerate them in the welcome document and let the friend decide which ones he wants to close. This is not a one-time decision. The threat model changes as the friend’s use of the system changes. If he starts writing about something sensitive, the default-local discipline matters more than it did when he was just generating AI images. The welcome document does not pretend to make those decisions for him. It explains what each layer protects and what each layer leaks, and the friend can revisit the trade-offs as his use changes.

The opposite of this, the welcome document that says “this is a sovereign-AI setup, your privacy is protected,” would be marketing. The friend would believe it for six months and then discover, in the worst possible way, that “protected” meant “protected from some things but not others.” Engineering honesty applies to friend-setup the same way it applies to my own setup.

What I tell other operators

People who are thinking about doing the same thing for a partner, parent, or sibling ask me what the biggest gotcha is. The answer is not technical.

The biggest gotcha is the assumption that “they will figure it out from the docs.” They will not. They will use the system for the things they figure out in the first day and ignore the rest forever. If the first day does not include the AI image, the AI conversation, and the Vision-Quest, those capabilities effectively do not exist for that user.

The dashboard, the TODO file, the Personas, the welcome document, and the first-30-minutes structure are not nice-to-haves. They are the difference between a sovereign-AI setup that the friend uses and a 3500-EUR paperweight.

If you are building this for someone, write the first day before you write the cheat-sheet. Make Block 0 produce an artifact. Use the artifact in the friend’s first conversation about what he wants to make. Build the rest of the friend’s relationship with the system from there.

The hard work of friend-setup is not the technical configuration. It is the empathy work of figuring out what the friend’s first three hours feel like, then engineering toward “engaged” instead of “overwhelmed.” The technical configuration is the easy part once that question is answered.

What I learned later (2026-05-30 update)

The day after I drafted this post I produced a small case study in the failure mode this concept-piece is supposed to prevent. I include it because friend-setup is the context where the failure mode is the most expensive.

Two “fixes” on day two, both installed without root-cause analysis, both wrong. I had noticed two annoyances in the friend’s setup on the morning of day two. The boot text looked unfriendly, so I enabled Plymouth for a polished splash screen. The internal speakers sounded tinny, so I installed EasyEffects with a preset that bumped the high-mids. Both took about ten minutes. Both shipped without me first asking “what is the actual root cause.” Both broke things, the Plymouth one catastrophically.

The Plymouth enablement broke the LUKS passphrase prompt on the next reboot. Plymouth on Blackwell does not pass keyboard input through during early boot, so the friend stared at a blinking cursor that did not echo his typing. The EasyEffects install did not fix the speakers because the speakers are a hardware-driver problem (the Smart-Amp on this Lenovo sub-variant has no Linux driver yet), so EasyEffects added a processing layer on top of a broken substrate. The real fix is Bluetooth headphones. The PipeWire pipeline I had installed was sophistication around an unsolved problem.

The discipline lesson for friend-setup specifically: the cost of an unverified “fix” lands on the friend, not on me. When I install something on my own machine that turns out to be wrong, I notice the problem within the hour and I undo it. When I install something on the friend’s machine, the wrongness propagates into his daily-driver experience. He boots and gets locked out of his own disk. He plays music and hears the same tinny sound through a more elaborate processing chain. The asymmetry is the entire reason the discipline matters more here than on my own setup. Friend-setup raises the cost of speed-over-verification, and the right response is to slow down, not to ship faster.

The new pre-install rule, written into the friend’s TODO file as Block-E maintenance discipline: before I install any “fix” on his machine, I have to write down what the root cause is, how I confirmed it, and what the rollback procedure is if the fix breaks something else. The rule is two minutes of pre-install thinking that catches the class of mistake I made yesterday. I should have had this rule from day one. The receipts for why I now have it are above.
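The rule is easy to turn into a checklist that refuses to pass while any of the three answers is blank. A minimal sketch, with FixProposal and its fields as hypothetical names; the Plymouth example shows what the record would have looked like on the morning of day two.

```python
from dataclasses import dataclass


@dataclass
class FixProposal:
    """What has to be written down before a 'fix' goes on the friend's machine."""

    title: str
    root_cause: str = ""
    how_confirmed: str = ""
    rollback: str = ""

    def blockers(self) -> list[str]:
        """Names of the required answers that are still empty."""
        required = {
            "root_cause": self.root_cause,
            "how_confirmed": self.how_confirmed,
            "rollback": self.rollback,
        }
        return [name for name, text in required.items() if not text.strip()]

    def ready(self) -> bool:
        return not self.blockers()


plymouth = FixProposal(title="Enable Plymouth splash screen")
print(plymouth.ready(), plymouth.blockers())
```

Under this rule neither day-two change would have been ready to install, because “the boot text looks unfriendly” is a symptom, not a confirmed root cause.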

What is on this thread

The sibling posts go deeper on specific decisions:

24 Hours Setting Up a Lenovo Legion Pro 7 Gen 10 is the day-of mechanics post, with all the mistakes.

We Were Wrong About Local 8B Tool-Use corrects an earlier memo and is the technical underpinning of why the local-first defaults work in 2026.

The /data/ Convention Trap on Standard Ubuntu LVM is the post-mortem on the storage layout mistake the friend will never see because I fixed it before he booted.

Dashboard As Learning-Cockpit Not Admin-Tool is the UX pattern for the teaching-surface dashboard.

Two Tailnets, One Shared Node is the privacy-primitive post that covers the cross-tailnet sharing pattern in detail.

The thread is open-ended. I will write more posts as the friend’s setup ages and produces new lessons. The most interesting follow-up will be three months in, when I have data on what he actually used.

Whether You Build It Yourself Or Buy It

A reader who has gotten this far is probably in one of two camps. Either you are a Linux-fluent operator weighing whether to build something like this for a person you care about, or you are the prospective recipient of such a setup who has noticed you want one but you do not have 24 hours and five years of Linux experience lying around. The honest case for each camp is different, and worth writing out.

For The Operator Who Wants To Build It Themselves

The build is reproducible. The 24-hour log post catalogues the steps and the traps. Everything I used is open-source and obtainable: Ubuntu LTS, Ollama, OpenWebUI, ComfyUI, Tailscale on the free tier, Gitea, a dashboard pattern that is two FastAPI files and one HTML file. The Persona descriptions are five paragraphs of prose. The Block-0 TODO is one Markdown file. There is no commercial component you cannot replicate.

The hidden cost is the catalogue of mistakes the experienced operator has already paid for and forgotten about. I documented mine: the TPM-PIN pre-flight, the LVM-on-LUKS installer-path that no longer works in Ubuntu 26.04, the Docker root-partition trap that hits at 96 percent disk usage, the Plymouth-on-Blackwell lockout, the EasyEffects fix that fixes nothing, the Brave-app-mode profiles that lose Bitwarden. Reading the catalogue does not transfer the muscle memory. You will pay for some fraction of those mistakes yourself on first build, and a different fraction on second build. Budget eight to twelve hours of focused work if you have the prerequisites (Linux fluency, prior experience with LUKS, working understanding of Tailscale and Docker), and budget twenty to forty hours if you do not.

The non-technical part is the harder part. Designing the friend’s first 30 minutes so they produce an artifact before he forms an opinion is the work that distinguishes a sovereign-AI box that gets used from one that becomes a 3500-EUR paperweight in week three. That work is not in the code. It is in the empathy exercise of figuring out what the friend will feel during his first three hours. Spend more time on the welcome document, the Block-0 TODO, the Persona descriptions, and the Lernen-tab content than you spend on the dashboard backend. The backend is half a day. The teaching surface is two days. Most operators get the ratio inverted.

If you build this for somebody, write the first day before you write the cheat-sheet. Make Block 0 produce an artifact that the friend wants to show somebody. Use the artifact as the seed for the Vision-Quest conversation in Block A. Build the rest of the relationship between the friend and the system from there. The technical configuration is the easy part once that question is answered.

For The Buyer Who Does Not Want To Build It Themselves

If you have read both this post and the 24-hour log and you have noticed that you want one of these but you do not want to build it, the honest market math is this.

A ready-to-use sovereign-AI workstation as a service, delivered with the patterns from this post (hardware, OS install, Ollama with four local models, OpenWebUI with Personas, custom learning-cockpit dashboard, MCP-routed RAG, optional cross-tailnet share to a backbone box, Block-0 onboarding TODO, 30 days of post-delivery support) costs somewhere between 3900 and 6100 EUR in 2026 from a competent craftsman. The hardware is 2600 to 3400 EUR for a current-generation Linux-ready laptop. The skilled labor (eight to twelve hours at a commercial Linux-engineering rate of 80 to 150 EUR per hour) is 640 to 1800 EUR. The toolchain plus dashboard plus onboarding pattern is 500 to 1200 EUR of delivered value. The 30-day support window is 300 to 700 EUR.
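The line items can be cross-checked with a few lines of arithmetic. The sketch below only adds up the ranges quoted above; the dictionary keys are labels chosen for the example.

```python
# (low, high) in EUR, taken from the ranges quoted in the text.
LINE_ITEMS_EUR = {
    "hardware": (2600, 3400),
    "labor": (8 * 80, 12 * 150),  # 8 to 12 hours at 80 to 150 EUR per hour
    "toolchain_dashboard_onboarding": (500, 1200),
    "support_30_days": (300, 700),
}

low = sum(lo for lo, _ in LINE_ITEMS_EUR.values())
high = sum(hi for _, hi in LINE_ITEMS_EUR.values())
print(low, high)  # 4040 7100
```

A straight sum of the extremes gives 4040 to 7100 EUR, which is wider than the 3900 to 6100 EUR headline range. The text does not say how the headline was derived, so treat the sum as a cross-check on the line items, not as a corrected price.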

There is no direct competitor offering this today. System76 sells Linux laptops with Pop!_OS at 2500 to 4500 EUR without an AI stack. Tuxedo, Slimbook, and Framework sell generic Linux laptops in the 1500 to 3500 EUR range, also without an AI stack. NVIDIA’s DGX Spark at 4000 EUR is an AI-development workstation, not a daily-user box, and does not ship with a learning-cockpit dashboard or an onboarding flow. The niche of “Linux laptop plus sovereign-AI tool-chain ready-to-use” is empty in 2026.

The buyer demographic where this purchase makes obvious sense is narrow but well-defined. Privacy-conscious professionals (lawyers, therapists, journalists) whose professional duty makes cloud-AI legally fraught. Senior developers who would rather not feed their proprietary codebase into someone else’s training pipeline. Crypto and sovereign-stack enthusiasts who already self-host other infrastructure. Tech-aware parents who do not want their children’s chat history to land in a US-hosted training corpus. Researchers handling confidential data. Indie creators (writers, podcasters, video editors) whose drafts are the product and should not be in a training set.

The buyer in any of these categories who values privacy and sovereignty over convenience is the buyer for whom the math works.

The buyer for whom the math does not work is the one whose binding constraint is monthly cost (a ChatGPT Plus subscription on a cheap laptop is cheaper for the first 16 to 25 months) or whose binding constraint is ease of use (an Apple M4 with cloud AI is easier on day one and forever after). Those are honest constraints. If they describe you, this is not the right purchase for you.

What the buyer is paying for, in plain prose, is eight to twelve hours of skilled engineering work that they would not otherwise finish, plus a catalogue of avoided mistakes the operator has already paid for, plus a non-trivial network-architecture primitive (the cross-tailnet privacy-mesh), plus a teaching-surface dashboard and onboarding flow that addresses the “what do I even do with this” friction that kills most home-built systems, plus a 30-day window where the inevitable first-month failure (a kernel update lands sideways, a container refuses to restart, a Tailscale ACL drifts after a UI redesign) gets fixed for them. The buyer gets a working sovereign stack on day one, not a kit.

The honest framing of what this purchase is not: it is not the cheapest path to AI assistance, and it is not the easiest path. It is the right path for the buyer who values privacy and sovereignty enough to pay for the build instead of doing the build themselves, or instead of accepting the cloud-AI default.

A Note On Where I Stand

I am writing this section honestly because the question came up the day after I shipped the friend’s machine. I built this for a friend at zero charge. The question of what it would cost commercially is a market-reflection exercise, not a sales pitch. I am not currently offering this as a service. I might at some point, in a specific vertical (probably professional services in Germany or German-speaking Europe, where the privacy-regulatory environment is moving in the direction that makes this purchase rational). Or I might not. The number is honest either way, because the work is real, the buyer demographic is identifiable, and the niche is empty. Somebody will offer this commercially in 2026 or 2027 because the demand is bounded but real and the construction cost is known.

If that somebody is you, the post above is most of what you need to do it. If that somebody is going to be you only on the buying side, the price range is where the market will land.
