AI & LLMs
Fine-tuning vs RAG vs prompting — when to use which
"Should we fine-tune?" is the question that launches a thousand wasted GPU hours. Most of the time the honest answer is "no — you need RAG," or even "no — you need a better prompt." The three techniques aren't competitors; they fix three different problems. Pick by what you're actually missing.
Prompting — change the instructions
The cheapest, fastest lever, and the one to exhaust first. Clear instructions, a few examples, a defined output format — this solves a surprising share of "the model isn't doing what I want" problems in minutes, with zero infrastructure. If you haven't seriously iterated on the prompt, you're not ready to reach for anything heavier.
RAG — change the knowledge
Use it when the gap is facts the model doesn't have: your docs, your policies, anything private, recent, or too niche to be in training data. RAG injects the right information at question time. It does not teach the model a new skill or voice — it feeds it references.
Fine-tuning — change the behaviour
Use it when the gap is how the model responds, consistently: a specific tone, a rigid output structure, a specialised task it keeps fumbling even with good prompts. Fine-tuning bakes a pattern into the weights. It's the most expensive and slowest to iterate, and — crucially — it's a poor way to add facts (they go stale and it hallucinates around them).
RAG gives the model something to read. Fine-tuning changes how it writes. Prompting just tells it what to do. Diagnose the gap first, then pick.
And they compose: a mature system often prompts well, retrieves its facts, and fine-tunes only the last-mile behaviour that prompting couldn't nail. Start at the cheap end and only climb when you hit a wall the current rung genuinely can't clear.