Receiving Stolen Goods at 60 Tokens a Second
The honest title for this essay is the confession in it. The open weights on my disk, an Apache-2.0 Qwen checkpoint I run end to end on hardware I own, were trained on a corpus I did not assemble and cannot fully audit. Some of what went into models of this generation was taken. Pirated books. Labor bought at a price that would be illegal in most of the rooms where the resulting models are now demoed. When I load that checkpoint and watch it produce roughly 60 tokens a second with no data leaving the building, I am doing something genuinely better than renting. I am also, partly, receiving stolen goods.
I have spent seven essays arguing that self-hosting is a defensible posture. This is the one where I have to admit what it does not fix, because a sovereignty blog that only counts the wins is just marketing with a Latin word in it, and the honesty is supposed to be the product. The work of this essay is to hold two true things at once without letting either one cancel the other.
Using the tool is the extraction point
Nick Couldry and Ulises Mejias gave the cleanest name for what a rented model does to you. They call it data colonialism: a new social order in which human life is appropriated as raw material for extraction, the way land and bodies were appropriated under historical colonialism. The move they insist on is that the extraction is not a side effect of using the tool. Using the tool is the extraction point. You do not pay with money and then, separately and regrettably, leak some data. The data relation is the product. Every prompt you send to a frontier API is, in their frame, an act of dispossession dressed as a feature, because the thing you typed becomes the provider’s surplus the instant it crosses the wire.
I take their frame with one caveat I have made before and will keep making: a paid API is also a transaction, not pure conquest, and flattening every commercial exchange into colonialism cheapens the word for the cases that earn it. But the structural worry survives the caveat. At the API boundary the relation runs one way. You become legible to the provider, your prompts logged and rate-limited and available as training material, while the provider stays opaque to you. That asymmetry is the thing self-hosting actually breaks. So far, so good for my side of the argument. The trouble starts when you ask what the weights were made of before they ever reached my disk.
What self-hosting genuinely stops
Let me be precise about the win, because it is real and I am not going to undersell it to look humble.
When I run a model locally, my prompts no longer become anyone’s training surplus. The keystrokes of my working day, the half-formed questions, the client material, the things I would never type into a box owned by a third party, stay on a machine in a room I control with zero inference egress. The data relation that Couldry and Mejias describe is severed at the point of use. There is no wire for the surplus to cross.
The second win is quieter and matters as much. I add no marginal RLHF labor. Every interaction with a hosted assistant is, potentially, a free annotation: your thumbs-up, your retry, your correction is signal the provider can harvest to tune the next version, work that someone is otherwise paid to do. Run the model offline and you stop being an unpaid annotator. You opt out of the human-feedback pipeline going forward, not as a gesture but mechanically, because there is no telemetry channel back to the lab.
This is what the forward column of the table at the top is. From the moment I stop renting, the extraction machine gets nothing more from me. No prompts, no feedback, no behavioral surplus, no marginal labor. That refusal is not theatre. It is a measurable change in what flows out of my work, and the measure is zero. I have argued elsewhere that the rented API became a radical monopoly that manufactures the need it then meters; the data extraction is the same monopoly seen from the supply side, and self-hosting cuts the supply line. Going forward.
Those last two words are where the essay turns.
What it cannot touch
Here is what the open weights already contain, and what running them locally does precisely nothing to remediate.
In September 2025, the settlement in Bartz v. Anthropic put a number on the harm. As reported by NPR, the case drew a line that matters: training on books was treated as fair use, but the piracy of more than 7 million books to build the training corpus was not, and the company agreed to a settlement reported at about $1.5 billion. Read that division carefully, because it is the whole problem in miniature. The learning was permitted. The taking was not. The capability now sitting in open weights of this era was built, in part, on a library that was assembled by theft, and a court attached a billion-and-a-half-dollar price to the theft specifically.
Then there is the labor. To make a base model usable, to teach it not to emit the worst of what it absorbed, someone has to sit and label the worst of what it absorbed. In 2023, TIME reported that the data annotators in Kenya who did that work for the model that started this whole cycle were paid roughly $1.32 to $2 per hour. They read and tagged descriptions of the most violent and degrading material on the internet so that the polished assistant could refuse to produce it. That labor is not upstream of the weights in some abstract sense. It is in the weights. The refusals, the safety behavior, the very usability that makes the model pleasant to run on my desk, were produced by people paid two dollars an hour to absorb harm on the model’s behalf.
Self-hosting does nothing about either of these. The books stay pirated. The $1.32 to $2 per hour stays paid, which is to say underpaid, and already spent. The harm is not flowing now; it is congealed, frozen into the checkpoint at the moment of training, and I load all of it into memory every time I start the server. That is the backward column of the table. It is not hypothetical and it is not small, and no amount of forward refusal reaches back to settle it.
The HeLa shape: benefit does not launder the taking
This shape is older than AI, and the clearest case for it is a person. In 1951, cells were taken from Henrietta Lacks, a Black woman, during cancer treatment at Johns Hopkins, without her knowledge or consent. Those cells, named HeLa, became the first human cell line that would not die in culture. They were mass-produced and sold worldwide and they underpinned decades of biomedical research and a commercial industry built on top of them. Her family was left uninformed for years and uncompensated for decades.
I am not equating a checkpoint with a human being, and I want to be careful here, because the wrong I am pointing at is hers and it deserves to be named as hers. What I am borrowing is the structure, because the structure is exact. An enormous and genuinely useful enterprise was built on material taken without consent. And every later good use of a HeLa-derived result, every vaccine and assay and cure that the cell line made possible, does not undo the original taking. It inherits it. The benefit is real. The taking is real. Using the benefit honestly means carrying both at once and never letting the first cancel the second.
That is the precise shape of running an open model whose weights congealed from scraped, unconsented text. The capability is real and I use it. The original taking is real and it was never paid. A later honest result does not reach back and settle the corpus it stands on, any more than a cure reaches back to ask Henrietta Lacks. The most a downstream user can do is refuse the laundering: keep the benefit and the debt in the same sentence, and say plainly that one was never consented and never paid for.
The steelman: this is moral laundering
Now the hardest objection, stated as strongly as I can make it, because steelmanning the attack on your own position is the only version of this that is worth reading.
The objection goes like this. The entire extraction is upstream. By the time those weights reach your disk, the books are already pirated, the annotators already underpaid, the surplus already harvested from millions of users who fed the base model. Every gram of value you get from that checkpoint, every one of your 60 tokens a second, is a dividend on stolen labor and stolen text. Self-hosting changes none of that. What it changes is how you feel. You get to sit at your sovereign desk, prompts safely local, telling yourself you broke the data relation, while you draw down capability that exists only because the relation was never broken for the people whose work built it. That is not sovereignty. That is moral laundering: routing benefit through a clean-looking local process so the dirty origin stops bothering your conscience. The person renting the API is at least honest about being inside the machine. You took the same stolen output, moved it onto your own hardware, and called the move a refusal. The clean feeling is the most extracted product of all.
I think that objection is largely correct, and I am not going to wriggle out of it. The benefit I draw from these weights is, in fact, downstream of harm I did not pay for and cannot undo. The clean feeling is, in fact, a risk, and an honest writer should distrust it precisely because it is so comfortable.
What I am actually claiming
So here is the partial, honest answer, and it is partial on purpose, because the move that would make it whole is the dishonest one.
The forward refusal is real and worth something. Cutting your prompts, your feedback, and your marginal labor out of the extraction machine is not nothing just because it fails to also be everything. A clean future does not buy back a stolen past, but refusing to steal further is still worth doing on its own terms, and “it doesn’t fix the past, so it doesn’t count” is the kind of all-or-nothing reasoning that ends with you back inside the API because at least there you stopped pretending. Quarantining future harm is a real act even when remediation is impossible.
The backward debt is also real, and it is unpaid. I did not write the $1.5 billion settlement check. I did not raise the Kenyan annotators’ wages from $1.32 to $2 an hour to something a person can live on. Running the model on my own iron settles none of that, and the value I extract from the checkpoint is partly a transfer from people who were not asked and were not fairly paid. That debt does not get smaller because my prompts stay local.
The dishonest move is to use either of these truths to erase the other. I could lean on the forward win and call myself clean, which is the laundering the steelman correctly names. Or I could lean on the backward debt and conclude that nothing matters, that since the weights are tainted I might as well rent and stop pretending, which is just despair wearing the costume of rigor. Both of those are ways of escaping the discomfort of holding two true things at once. The honest posture is to refuse both exits. Self-hosting quarantines future harm. It does not remediate past harm. Both sentences are true, and the moment I let one of them swallow the other, I have started lying.
There is a practical edge to this beyond the conscience-keeping. If the backward debt is real, it implies obligations that self-hosting alone does not discharge: paying the open-source and dataset commons you draw from, supporting the authors and the labeling-rights fights, refusing to launder the clean feeling into a marketing claim that this stack is ethically settled. It is not settled. It is quarantined going forward and indebted backward, and writing that down is the most I can honestly do from here.
This is essay eight of a series, and it is the one I least wanted to write, which is usually the sign that it was the one worth writing. The series spine explains why each essay concedes its strongest objection before it answers it, and this essay is the limit case: the objection is conceded and only partly answered, because a full answer would be a lie. The structured, complete version of the argument is the forthcoming book, for which these essays are the public workshop. If you came here looking for a clean conclusion, the philosophy page is where the whole uncomfortable arc lives, and the absence of a clean conclusion is the point.
The two ledgers self-hosting keeps
The forward column is real. The backward column is unpaid.