Step-by-step playbooks for when things go sideways. Interactive checklists with copy-paste commands.
Diagnose and fix a Kubernetes node in NotReady state. Covers kubelet health, container runtime, resource exhaustion, and node condition analysis.
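As a rough sketch of the node condition analysis step, the Python helper below reads the JSON printed by `kubectl get node <name> -o json` and lists the conditions that look unhealthy. The function name `unhealthy_conditions` and the `HEALTHY_STATUS` table are illustrative assumptions, not part of the playbook itself.

```python
import json

# Pressure/availability conditions are healthy when their status is "False";
# Ready is the opposite and is healthy only when its status is "True".
HEALTHY_STATUS = {
    "Ready": "True",
    "MemoryPressure": "False",
    "DiskPressure": "False",
    "PIDPressure": "False",
    "NetworkUnavailable": "False",
}


def unhealthy_conditions(node_json: str) -> list:
    """Return 'Type=Status (Reason)' for each condition that is not healthy.

    node_json is the output of `kubectl get node <name> -o json`.
    Condition types not listed in HEALTHY_STATUS are ignored.
    """
    node = json.loads(node_json)
    problems = []
    for cond in node.get("status", {}).get("conditions", []):
        ctype = cond.get("type")
        status = cond.get("status")
        expected = HEALTHY_STATUS.get(ctype)
        if expected is not None and status != expected:
            reason = cond.get("reason", "no reason given")
            problems.append(f"{ctype}={status} ({reason})")
    return problems
```

A Ready status of `Unknown` usually means the kubelet has stopped reporting, which points toward kubelet or container runtime health; a `True` pressure condition points toward resource exhaustion.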
Recover from a stuck Terraform state lock (DynamoDB/Consul/S3) without corrupting state. Covers identifying the holder, verifying the prior run actually died, and when force-unlock is safe.
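The "when force-unlock is safe" decision can be sketched as a small guard, assuming the lock in question is a Terraform state lock. The Python below is illustrative only: `LockCheck` and `force_unlock_command` are hypothetical names, and the two boolean fields stand in for evidence you gather by hand (for example, confirming that the CI job or local process holding the lock has exited). It only builds the `terraform force-unlock` command string and never runs it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockCheck:
    """Hypothetical record of what was verified before unlocking."""
    lock_id: str                 # the ID shown under "Lock Info" in the lock error
    holder_identified: bool      # you know which user/host/job took the lock
    holder_confirmed_dead: bool  # that run is verified to no longer be running


def force_unlock_command(check: LockCheck) -> Optional[str]:
    """Return the unlock command only when every precondition holds.

    Returns None when it is not yet safe, so a caller cannot unlock by accident.
    """
    if not check.lock_id:
        return None
    if not (check.holder_identified and check.holder_confirmed_dead):
        return None
    return f"terraform force-unlock {check.lock_id}"
```

Returning `None` rather than a command keeps the default outcome "do nothing", which matches the goal of not corrupting state while a run may still be writing.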
Click through symptoms to diagnose why your Kubernetes pod won't start. Covers CrashLoopBackOff, ImagePullBackOff, Pending, and Error states with targeted fix commands.
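One way to picture the symptom-to-command mapping is a simple lookup from the pod's STATUS value to a first diagnostic command. This is a minimal sketch: the `FIRST_COMMAND` table and `first_command` function are assumptions made for illustration, and the guide itself may suggest different or additional commands.

```python
# First diagnostic command for each pod state the guide covers.
# {pod} is a placeholder for the pod name; add -n <namespace> as needed.
FIRST_COMMAND = {
    # Logs from the previous (crashed) container instance.
    "CrashLoopBackOff": "kubectl logs {pod} --previous",
    # The Events section shows the image pull error.
    "ImagePullBackOff": "kubectl describe pod {pod}",
    # The Events section shows why scheduling failed.
    "Pending": "kubectl describe pod {pod}",
    # Logs from the current container.
    "Error": "kubectl logs {pod}",
}


def first_command(status: str, pod: str) -> str:
    """Return a starting command for the given STATUS column value.

    Unlisted states fall back to `kubectl describe pod`.
    """
    template = FIRST_COMMAND.get(status, "kubectl describe pod {pod}")
    return template.format(pod=pod)
```

For example, `first_command("CrashLoopBackOff", "web-0")` yields a `kubectl logs` command with `--previous`, since the current container in a crash loop may not have produced useful output yet.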