What is fine-tuning?
Fine-tuning starts from a model that has already been trained on a large general corpus and trains it further on a smaller set of examples you provide. The point is to shift behaviour: a house style, a stricter output format, a domain the base model handles only roughly. Because the change goes into the weights, the model carries the new behaviour without you having to spell it out in every prompt. That is the appeal, and also the catch, since changing the weights can quietly erode skills the base model already had.
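To make the "appeal and catch" concrete, here is a deliberately tiny sketch. It assumes a one-parameter model y = w * x standing in for a real network; the data, learning rate, and step counts are invented for illustration and say nothing about real training budgets.

```python
def train(w, data, lr=0.1, steps=200):
    """Plain gradient descent on mean squared error for the toy model y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def mse(w, data):
    """Mean squared error of the toy model on a dataset of (x, y) pairs."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


# Stand-in for the large general corpus: the base behaviour is y = 2x.
general = [(x, 2.0 * x) for x in (-1.0, -0.5, 0.5, 1.0)]
# Stand-in for your smaller example set: you want y = 3x instead.
custom = [(x, 3.0 * x) for x in (0.5, 1.0)]

base_w = train(0.0, general)                # "pretraining" from scratch
tuned_w = train(base_w, custom, steps=20)   # fine-tuning starts from base_w

print("error on your examples:", mse(base_w, custom), "->", mse(tuned_w, custom))
print("error on general data: ", mse(base_w, general), "->", mse(tuned_w, general))
```

The error on your examples drops, while the error on the original data rises from roughly zero. A real model has far more capacity than one weight, so the erosion is usually partial rather than total, but the mechanism is the same.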
How is it different from retrieval?
Fine-tuning and retrieval solve different problems, and people often confuse them. Retrieval feeds the model fresh information at request time, so it is the right tool when the facts change: prices, documents, anything you would otherwise have to retrain to update. Fine-tuning bakes patterns in, which suits stable things like style and format but ages badly for facts. A good rule: if the problem is “the model does not know X”, reach for retrieval; if it is “the model knows X but keeps phrasing it wrong”, consider fine-tuning.
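A minimal sketch of the retrieval side, assuming a plain dictionary as the fact store and a naive keyword match in place of a real search index. The names `retrieve`, `build_prompt`, and `prices` are hypothetical, and the resulting prompt would be passed to whatever model you use.

```python
def retrieve(question, store):
    """Return store entries whose key appears in the question (naive keyword match)."""
    q = question.lower()
    return [f"{key}: {value}" for key, value in store.items() if key in q]


def build_prompt(question, store):
    """Assemble a prompt that carries the looked-up facts alongside the question."""
    facts = retrieve(question, store)
    context = "\n".join(facts) if facts else "(no matching facts)"
    return f"Facts:\n{context}\n\nQuestion: {question}"


# Hypothetical fact store; in practice this might be a database or document index.
prices = {"widget": "$4.00", "gadget": "$9.50"}

before = build_prompt("How much is a widget?", prices)
prices["widget"] = "$4.25"  # the fact changes; no training run is involved
after = build_prompt("How much is a widget?", prices)

print(before)
print("---")
print(after)
```

The model's weights never change here. Updating the store is enough for the next request to see the new price, which is why retrieval suits facts that move.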
Do you actually need it?
Less often than it feels. A clearer prompt or a few worked examples in the context fix many of the problems people reach for fine-tuning to solve, with no training run and nothing to maintain. When you do fine-tune, lighter adapters such as low-rank adaptation (LoRA) train a small add-on instead of every weight, which is cheaper and easier to undo. Start with the cheapest tool that works and escalate only when it does not.
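The "small add-on" idea behind LoRA can be sketched in a few lines. This is an illustration under simplifying assumptions, not a training recipe: matrices are plain lists, the adapter values are made up rather than learned, and the scaling factor that LoRA implementations usually apply to the update is omitted. The helper names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def add(a, b):
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def lora_param_counts(d_out, d_in, rank):
    """Trainable parameters: the full weight matrix versus a rank-`rank` adapter."""
    full = d_out * d_in
    adapter = rank * (d_out + d_in)
    return full, adapter


# Frozen base weight W (3x3) and a rank-1 adapter: B is 3x1, A is 1x3.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
B = [[0.5], [0.0], [-0.5]]
A = [[1.0, 2.0, 0.0]]

x = [[1.0], [1.0], [1.0]]  # input as a column vector

base_out = matmul(W, x)                         # base model behaviour
adapted_out = matmul(add(W, matmul(B, A)), x)   # behaviour with W + B @ A

print(base_out, adapted_out)
print(lora_param_counts(4096, 4096, 8))
```

Because W itself is never modified, dropping B and A restores the base behaviour exactly, which is what makes an adapter easy to undo. For a 4096 by 4096 layer at rank 8, the adapter has 65,536 trainable parameters against 16,777,216 for the full matrix.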