Why does a model make things up?
A language model is a text predictor. Given what came before, it produces the most plausible continuation. Plausible is not the same as true. Most of the time plausible text is also correct, because the training data was mostly correct, but when the model has a gap it fills it the same way it fills everything else: with the most likely-looking words. That is how you get an invented version number, a citation to a paper that does not exist, or an API call with exactly the right shape and the wrong name.
The trouble is not that the model is wrong. Everything is wrong sometimes. The trouble is that it is wrong in the same calm, fluent voice it uses when it is right. There is no built-in tell. The confidence you read is a property of the language, not of the facts.
How do you work with a model that hallucinates?
You stop trusting the tone and start checking the claim. The reliable moves all look the same: pin the output to something you can verify. Check version pins against the actual registry. Confirm a quoted file path by opening the file. Where a pattern is known, a plain text search beats asking the model, because the search cannot be charmed by a fluent guess.
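The checks above can be sketched in a few lines of Python. This is a minimal illustration, not a finished tool. The function names are hypothetical, and the registry is stood in for by a plain dictionary of released versions that you would fill from whatever registry you actually use.

```python
from pathlib import Path


def path_exists(root: Path, quoted_path: str) -> bool:
    """Confirm a file path the model quoted by looking for the file itself."""
    return (root / quoted_path).is_file()


def find_identifier(root: Path, name: str, suffix: str = ".py") -> list[tuple[str, int]]:
    """Plain text search: (relative path, line number) for every line containing name."""
    hits = []
    for file in sorted(root.rglob(f"*{suffix}")):
        lines = file.read_text(encoding="utf-8").splitlines()
        for number, line in enumerate(lines, start=1):
            if name in line:
                hits.append((file.relative_to(root).as_posix(), number))
    return hits


def version_is_published(published: dict[str, set[str]], package: str, version: str) -> bool:
    """Check a version pin against the set of versions the registry really lists.

    `published` is an assumption of this sketch: a mapping from package name
    to released versions, loaded from the real registry beforehand.
    """
    return version in published.get(package, set())
```

An empty result from find_identifier is the useful signal here: if the model names a function and the search finds nothing, the name was most likely a fluent guess.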
The strongest pattern is to keep the model away from recall in the first place. Give it the facts in the prompt, from a retrieval step or a source of truth, and ask it to use them rather than remember them. You will still get the occasional confident fiction, so keep the checks in place. Grounding makes the fiction rarer, and the checks are the gate in front of it before it reaches anything that matters.
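One way to picture this is a prompt builder that numbers the supplied facts and asks for a citation on every claim, plus a small gate that rejects citations pointing at facts that were never supplied. This is a sketch under stated assumptions: the prompt wording, the [n] citation convention, and both function names are hypothetical choices, not a standard.

```python
import re


def build_grounded_prompt(question: str, facts: list[str]) -> str:
    """Put the facts in the prompt so the model uses them instead of recalling."""
    numbered = "\n".join(f"[{i}] {fact}" for i, fact in enumerate(facts, start=1))
    return (
        "Answer using only the numbered facts below. "
        "Cite the fact number for each claim, like [1]. "
        "If the facts do not cover the question, say so instead of guessing.\n\n"
        f"Facts:\n{numbered}\n\n"
        f"Question: {question}\n"
    )


def invalid_citations(answer: str, fact_count: int) -> list[int]:
    """Gate: return cited fact numbers that do not match any supplied fact."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return sorted(n for n in cited if not 1 <= n <= fact_count)
```

A non-empty list from invalid_citations means the answer leans on a source that was never in the prompt, which is a reason to hold it back. A clean result only shows the citations are well-formed. It does not show the claims follow from the facts, so the other checks still apply.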