AI & LLMs

Fine-tuning vs RAG vs prompting — when to use which

2026-07-11 · 6 min read

"Should we fine-tune?" is the question that launches a thousand wasted GPU hours. Most of the time the honest answer is "no — you need RAG," or even "no — you need a better prompt." The three techniques aren't competitors; they fix three different problems. Pick by what you're actually missing.

Missing knowledge → RAG. Missing behaviour → fine-tune. Missing nothing but the request → prompt.

Prompting — change the instructions

The cheapest, fastest lever, and the one to exhaust first. Clear instructions, a few examples, a defined output format — this solves a surprising share of "the model isn't doing what I want" problems in minutes, with zero infrastructure. If you haven't seriously iterated on the prompt, you're not ready to reach for anything heavier.

RAG — change the knowledge

Use it when the gap is facts the model doesn't have: your docs, your policies, anything private, recent, or too niche to be in training data. RAG injects the right information at question time. It does not teach the model a new skill or voice — it feeds it references.

Fine-tuning — change the behaviour

Use it when the gap is how the model responds, consistently: a specific tone, a rigid output structure, a specialised task it keeps fumbling even with good prompts. Fine-tuning bakes a pattern into the weights. It's the most expensive and slowest to iterate, and — crucially — it's a poor way to add facts (they go stale and it hallucinates around them).

Rule of thumb

RAG gives the model something to read. Fine-tuning changes how it writes. Prompting just tells it what to do. Diagnose the gap first, then pick.

And they compose: a mature system often prompts well, retrieves its facts, and fine-tunes only the last-mile behaviour that prompting couldn't nail. Start at the cheap end and only climb when you hit a wall the current rung genuinely can't clear.

LLMFine-tuningRAGPrompting

← Back to the blog