Rate limits and exponential backoff
Every serious API has a bouncer. Ask for too much too fast and it says "slow down" — usually with a 429 Too Many Requests. What you do next is the difference between a system that stays healthy and one that digs its own hole.
Why limits exist
Rate limits protect a service from being overwhelmed and keep things fair between clients. They're not an insult — they're the API telling you its capacity. The mistake is treating a 429 as "try again immediately," because a wall of instant retries is exactly what a struggling server can least afford. That's how a small blip snowballs into an outage.
Back off — exponentially
Exponential backoff means each retry waits about twice as long as the one before: 1 second, then 2, then 4, then 8. The gaps grow quickly, so a client that's being turned away naturally eases off instead of piling on.
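To make the doubling concrete, here is a minimal sketch of that schedule. The function name `backoff_delays` and its parameters are illustrative assumptions, not part of any particular library.

```python
def backoff_delays(base: float = 1.0, factor: float = 2.0, retries: int = 4) -> list:
    """Return the wait (in seconds) before each retry: base, base*factor, base*factor**2, ..."""
    # attempt 0 waits `base`, and each later attempt multiplies the wait by `factor`
    return [base * factor ** attempt for attempt in range(retries)]


print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0]
```

In practice you would usually also cap the delay, so that a long run of failures does not produce waits of many minutes.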
Two refinements make it production-grade:
- Add jitter. If a thousand clients all back off on the exact same schedule, they'll retry in synchronised waves — a "thundering herd." Sprinkle in some randomness so everyone's retries spread out.
- Respect Retry-After. Many APIs tell you exactly how long to wait in a response header. If they do, honour it; it beats guessing.
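Both refinements fit in a few lines. The sketch below uses one common jitter variant, sometimes called "full jitter", where the wait is a random value between zero and the exponential delay. The names `jittered_delay` and `next_delay` and the 60-second cap are assumptions chosen for illustration. Retry-After can carry either a number of seconds or an HTTP date; this sketch only handles the numeric form and falls back to backoff otherwise.

```python
import random
from typing import Optional


def jittered_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Full jitter: a random wait between 0 and the capped exponential delay."""
    ceiling = min(cap, base * 2 ** attempt)
    return random.uniform(0, ceiling)


def next_delay(attempt: int, retry_after: Optional[str] = None) -> float:
    """Prefer the server's Retry-After value when it is a plain number of seconds."""
    if retry_after is not None:
        try:
            return max(0.0, float(retry_after))
        except ValueError:
            pass  # probably an HTTP date; this sketch does not parse it
    return jittered_delay(attempt)
```

With this, `next_delay(0, "5")` waits exactly 5 seconds because the server said so, while `next_delay(3)` waits somewhere between 0 and 8 seconds.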
Got a 429? Don't retry immediately. Wait, then double the wait each time, add a little randomness — and stop after a sensible number of tries.
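Put together, the whole recipe might look like the sketch below. It is not tied to any HTTP library: `send` is a hypothetical callable that performs one request and returns a status code plus a headers mapping, and `call_with_backoff`, `max_tries`, `base`, and `cap` are names and defaults assumed for illustration. It also assumes the header key is spelled exactly `Retry-After`, whereas real HTTP header names are case-insensitive.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# (status code, headers): a stand-in for a real HTTP response object
Response = Tuple[int, Mapping[str, str]]


def call_with_backoff(
    send: Callable[[], Response],
    max_tries: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it stops returning 429 or max_tries is used up."""
    for attempt in range(max_tries):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt == max_tries - 1:
            break  # out of tries, so there is no point sleeping again
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            # The server told us how long to wait (numeric seconds form only)
            delay = float(retry_after)
        else:
            # The ceiling doubles each attempt; the actual wait is random below it
            delay = random.uniform(0, min(cap, base * 2 ** attempt))
        sleep(delay)
    raise RuntimeError(f"still rate limited after {max_tries} tries")
```

Passing `sleep` as a parameter is a design choice that keeps the function testable: a test can record the requested waits instead of actually pausing.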
The instinct under pressure is to push harder. With rate limits, the reliable move is the opposite: back off, spread out, and let the system breathe.