We all fall down: notes from an ordinary outage

A detailed post-mortem of a Kubernetes outage in which a routine video-processing job brought down an entire streaming cluster for five days.
