Circuit breakers — stopping one failure from taking down everything
One slow service shouldn't be able to take down your whole system — but it happens constantly. Service B gets slow, so every call to it hangs. Service A's threads all pile up waiting on B, so A stops responding too. Now C, which depends on A, falls over. A single failure cascades into an outage. The circuit breaker — borrowed straight from the electrical panel in your house — is how you stop the cascade.
The idea
Wrap calls to a shaky dependency in a breaker that watches for failures. While things are healthy, calls pass through. Once failures cross a threshold (say, several in a row, or a high error rate over a recent window), the breaker trips: it stops calling the failing service entirely and fails fast instead, instantly returning an error or a fallback rather than making everyone wait. After a cooldown, it cautiously tests whether the service has recovered.
The three states
- Closed — normal. Requests flow through; the breaker just counts failures.
- Open — tripped. It stops calling the dependency and fails immediately (or serves a fallback), giving the struggling service room to recover instead of piling on.
- Half-open — testing the waters. After the cooldown it lets a single request through. Success → back to Closed. Failure → straight back to Open for another cooldown.
When something's clearly broken, stop hammering it. Fail fast for a while, then quietly check if it's back before reopening the floodgates.
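To make the three states concrete, here is a minimal sketch in Python. Everything in it is illustrative rather than taken from a particular library: the names CircuitBreaker, CircuitOpenError, failure_threshold and cooldown_seconds are assumptions, the trip rule is a simple count of failures since the last success, and the code is not thread-safe, so treat it as a starting point rather than production code.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the dependency."""


class CircuitBreaker:
    # Hypothetical defaults; tune the threshold and cooldown for your own dependency.
    def __init__(self, failure_threshold=3, cooldown_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so tests can fake the passage of time
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.state is State.OPEN:
            if self.clock() - self.opened_at < self.cooldown_seconds:
                # Still cooling down: fail fast without touching the dependency.
                raise CircuitOpenError("dependency unavailable, failing fast")
            # Cooldown is over: let this one call through as a trial.
            self.state = State.HALF_OPEN
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_success(self):
        # Any success (including a half-open trial) closes the breaker.
        self.failures = 0
        self.state = State.CLOSED

    def _on_failure(self):
        self.failures += 1
        # A failed trial, or too many failures while closed, opens the breaker.
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
```

A caller would wrap each request, for example breaker.call(fetch_profile, user_id) where fetch_profile is your own function, and catch CircuitOpenError to serve a fallback. A real service would also need a lock around the state changes so that only one trial request gets through in the half-open state.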
Why "fail fast" is the kindness
It sounds backwards to give up quickly, but it protects both sides. Your service stays responsive instead of hanging every thread on a dead dependency, and the dependency gets a break from the retry storm that was keeping it down. It's the natural partner to rate limits and backoff: backoff spaces out your retries, and the breaker stops them entirely when retrying is clearly hopeless. Together they turn a cascading outage into a small, contained, self-healing blip.
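The pairing with backoff can be sketched the same way. The helper below is a hypothetical illustration, not any library's API: call_with_backoff and its parameters are made-up names, and breaker_is_open stands in for whatever check your breaker exposes. It spaces retries out exponentially, and gives up immediately once the breaker reports open.

```python
import time


def call_with_backoff(fn, breaker_is_open, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Retry fn with exponential backoff, but stop at once if the breaker is open.

    Assumes max_attempts >= 1. breaker_is_open is a zero-argument callable
    returning True while the breaker is tripped (an assumption of this sketch).
    """
    last_error = None
    for attempt in range(max_attempts):
        if breaker_is_open():
            # Retrying is clearly hopeless right now: fail fast instead of waiting.
            raise RuntimeError("circuit open: not retrying") from last_error
        try:
            return fn()
        except Exception as exc:
            last_error = exc
            if attempt < max_attempts - 1:
                # Double the wait after each failure: base, 2x base, 4x base, ...
                sleep(base_delay * 2 ** attempt)
    # Every attempt failed while the breaker stayed closed: surface the last error.
    raise last_error
```

In practice you would usually add some random jitter to each delay so that many clients do not retry in lockstep; it is left out here to keep the sketch short.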