SLO / Error Budget Calculator

Calculate error budgets, allowed downtime, and burn rate alerts from SLO targets

SLO Target

Availability Target (%)

Time Window

Optional Inputs

Requests per Minute (0 = skip)

MTTR — Mean Time to Repair (minutes, 0 = skip)

Request rate enables request-based error budget. MTTR enables incident count estimation.

Error Budget: 0.100% | Allowed Downtime: 43m 12s | Allowed Errors: 43.2K

Max Incidents (MTTR)

Detailed Breakdown

SLO Target	99.9%
Error Budget	0.1000%
Time Window	Month (30 days)
Total Minutes	43,200
Downtime Budget	43.20 minutes
Downtime Formatted	43m 12s
Total Requests	43.20M
Allowed Errors	43.2K
Errors / Minute	1.00

Max Incidents (30m MTTR)
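The breakdown figures can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the calculator's actual implementation: the function name `error_budget` and its parameters are hypothetical, the request rate of 1,000 per minute is inferred from the 43.20M total, and the incident count is assumed to be the downtime budget divided by MTTR, rounded down. `Decimal` is used so that 100 - 99.9 does not pick up binary floating-point noise.

```python
from decimal import Decimal
import math

def error_budget(slo_percent, window_days=30, requests_per_minute=0, mttr_minutes=0):
    """Return error-budget figures for an availability SLO (hypothetical helper)."""
    slo = Decimal(str(slo_percent))
    budget_fraction = (Decimal(100) - slo) / Decimal(100)    # 99.9 -> 0.001
    total_minutes = Decimal(str(window_days)) * 24 * 60      # 30 days -> 43200
    downtime_minutes = total_minutes * budget_fraction       # 43.2
    result = {
        "error_budget_percent": float(budget_fraction * 100),
        "total_minutes": int(total_minutes),
        "downtime_minutes": float(downtime_minutes),
    }
    if requests_per_minute > 0:
        rpm = Decimal(str(requests_per_minute))
        result["total_requests"] = float(total_minutes * rpm)
        result["allowed_errors"] = float(total_minutes * rpm * budget_fraction)
        result["errors_per_minute"] = float(rpm * budget_fraction)
    if mttr_minutes > 0:
        # Assumption: whole incidents of average length MTTR that fit in the budget.
        result["max_incidents"] = math.floor(downtime_minutes / Decimal(str(mttr_minutes)))
    return result

print(error_budget(99.9, window_days=30, requests_per_minute=1000, mttr_minutes=30))
```

With these inputs the sketch gives a downtime budget of 43.2 minutes, 43,200 allowed errors, 1.0 allowed errors per minute, and (under the rounding assumption above) 1 incident.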

Multi-Burn-Rate Alerts (30-day window)

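Based on the Google SRE Workbook pattern. Each alert triggers when the error rate over its lookback window exceeds the burn rate multiplied by the error budget (for example, 14.4 x 0.1% = 1.44% for the critical tier).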

Critical (fast burn)

Lookback: 1h | Budget consumed: 2% | Burn rate: 14.4x | Max downtime: 52s

Warning (medium burn)

Lookback: 6h | Budget consumed: 5% | Burn rate: 6x | Max downtime: 2m 10s

Ticket (slow burn)

Lookback: 3d | Budget consumed: 10% | Burn rate: 1x | Max downtime: 4m 19s
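The burn rates and downtime figures in the three tiers follow from the lookback and the share of budget consumed. The sketch below shows one way to derive them; the names `burn_rate`, `describe`, and the constants are illustrative assumptions, and the 43.2-minute downtime budget is taken from the breakdown for a 99.9% target over 30 days.

```python
WINDOW_HOURS = 30 * 24        # 30-day SLO window = 720 hours
ERROR_BUDGET = 0.001          # 99.9% target -> 0.1% error budget
DOWNTIME_BUDGET_MIN = 43.2    # downtime budget for the window, in minutes

# (tier, lookback in hours, fraction of the budget consumed in that lookback)
TIERS = [("critical", 1, 0.02), ("warning", 6, 0.05), ("ticket", 72, 0.10)]

def burn_rate(budget_consumed, lookback_hours, window_hours=WINDOW_HOURS):
    """Multiple of the sustainable error rate that spends `budget_consumed` within the lookback."""
    return budget_consumed * window_hours / lookback_hours

def describe(name, lookback_hours, budget_consumed):
    rate = burn_rate(budget_consumed, lookback_hours)
    return {
        "tier": name,
        "burn_rate": rate,                                    # e.g. about 14.4 for critical
        "error_ratio_threshold": rate * ERROR_BUDGET,         # value compared in the alert expr
        "max_downtime_seconds": budget_consumed * DOWNTIME_BUDGET_MIN * 60,
        "hours_to_exhaust_budget": WINDOW_HOURS / rate,       # at a sustained burn
    }

for tier in TIERS:
    print(describe(*tier))
```

At a sustained 14.4x burn the whole 30-day budget lasts about 50 hours; at 6x it lasts about 120 hours, and at 1x it lasts the full 720 hours.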

Export Configurations

slo-rules.yml

# Multi-window Multi-Burn-Rate Alerting Rules
# Based on Google SRE Workbook Chapter 5
# SLO Target: 99.900% | Error Budget: 0.100%

groups:
  - name: slo.my-service.rules
    rules:
      # ── Recording rules ──────────────────────────
      - record: slo:sli_error:ratio_rate5m
        expr: |
          sum(rate(http_requests_total{service="my-service",code=~"5.."}[5m]))
          /
          sum(rate(http_requests_total{service="my-service"}[5m]))

      - record: slo:sli_error:ratio_rate30m
        expr: |
          sum(rate(http_requests_total{service="my-service",code=~"5.."}[30m]))
          /
          sum(rate(http_requests_total{service="my-service"}[30m]))

      - record: slo:sli_error:ratio_rate1h
        expr: |
          sum(rate(http_requests_total{service="my-service",code=~"5.."}[1h]))
          /
          sum(rate(http_requests_total{service="my-service"}[1h]))

      - record: slo:sli_error:ratio_rate6h
        expr: |
          sum(rate(http_requests_total{service="my-service",code=~"5.."}[6h]))
          /
          sum(rate(http_requests_total{service="my-service"}[6h]))

      - record: slo:sli_error:ratio_rate3d
        expr: |
          sum(rate(http_requests_total{service="my-service",code=~"5.."}[3d]))
          /
          sum(rate(http_requests_total{service="my-service"}[3d]))

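      # 30-day error ratio, required by the error budget rule below
      - record: slo:sli_error:ratio_rate30d
        expr: |
          sum(rate(http_requests_total{service="my-service",code=~"5.."}[30d]))
          /
          sum(rate(http_requests_total{service="my-service"}[30d]))

      # ── Error budget remaining ───────────────────
      # Fraction of the 30-day budget still unspent (1 = untouched, 0 = exhausted)
      - record: slo:error_budget:remaining
        expr: |
          1 - (
            slo:sli_error:ratio_rate30d / 0.001000
          )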

      # ── Alerting rules (Multi-Burn-Rate) ─────────

      # Critical: 2% of 30-day budget consumed in 1 hour (burn rate 14.4x)
      - alert: SLOBurnRateCritical
        expr: |
          slo:sli_error:ratio_rate1h > (14.4 * 0.001000)
          and
          slo:sli_error:ratio_rate5m > (14.4 * 0.001000)
        for: 2m
        labels:
          severity: critical
          service: my-service
          slo: availability
        annotations:
          summary: "High burn rate on SLO (critical)"
          description: "Error rate is consuming error budget 14.4x faster than expected. At this rate, the entire 30-day budget will be exhausted in about 50 hours."

      # Warning: 5% of 30-day budget consumed in 6 hours (burn rate 6x)
      - alert: SLOBurnRateWarning
        expr: |
          slo:sli_error:ratio_rate6h > (6 * 0.001000)
          and
          slo:sli_error:ratio_rate30m > (6 * 0.001000)
        for: 5m
        labels:
          severity: warning
          service: my-service
          slo: availability
        annotations:
          summary: "Elevated burn rate on SLO (warning)"
          description: "Error rate is consuming error budget 6x faster than expected. At this rate, the entire 30-day budget will be exhausted in about 120 hours (5 days)."

      # Ticket: 10% of 30-day budget consumed in 3 days (burn rate 1x)
      - alert: SLOBurnRateTicket
        expr: |
          slo:sli_error:ratio_rate3d > (1 * 0.001000)
          and
          slo:sli_error:ratio_rate6h > (1 * 0.001000)
        for: 30m
        labels:
          severity: info
          service: my-service
          slo: availability
        annotations:
          summary: "Slow burn on SLO (ticket)"
          description: "Error rate is steadily consuming the error budget. Current trajectory will exhaust the monthly budget within 30 days."

The Nines — Availability Reference

Availability	Monthly Downtime	Quarterly Downtime	Yearly Downtime
90%	72 hours	9 days	36.5 days
95%	36 hours	4.5 days	18.25 days
99%	7h 18m	21h 54m	3d 15h 36m
99.5%	3h 39m	10h 57m	1d 19h 48m
99.9%	43m 50s	2h 11m	8h 45m 36s
99.95%	21m 55s	1h 5m	4h 22m 48s
99.99%	4m 23s	13m 9s	52m 34s
99.999%	26.3s	1m 19s	5m 15s
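Each entry in the table is the unavailable fraction multiplied by the length of the period. The sketch below uses hypothetical helper names, and the period lengths are assumptions: most monthly figures appear to use an average month of about 30.44 days rather than the calculator's 30-day window, which would explain why 99.9% shows 43m 50s here and 43m 12s in the breakdown.

```python
def downtime_seconds(availability_percent, period_days):
    """Allowed downtime in seconds for a given availability over a period."""
    # round() removes binary floating-point noise such as 100 - 99.9 = 0.0999999...
    unavailable_fraction = round(100 - availability_percent, 9) / 100
    return unavailable_fraction * period_days * 86400

def format_duration(seconds):
    """Format seconds as e.g. '8h 45m 36s', rounded to whole seconds."""
    seconds = round(seconds)
    days, rem = divmod(seconds, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    parts = [(days, "d"), (hours, "h"), (minutes, "m"), (secs, "s")]
    return " ".join(f"{value}{unit}" for value, unit in parts if value) or "0s"

print(format_duration(downtime_seconds(99.9, 30)))        # 43m 12s (30-day window)
print(format_duration(downtime_seconds(99.9, 30.4375)))   # 43m 50s (average month)
print(format_duration(downtime_seconds(99.9, 365)))       # 8h 45m 36s (365-day year)
```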

About the SLO / Error Budget Calculator

This tool calculates error budgets, allowed downtime, and burn rate alert thresholds from Service Level Objective (SLO) targets. It implements the Multi-Window Multi-Burn-Rate alerting pattern from the Google SRE Workbook.

What is an SLO?

A Service Level Objective (SLO) is a target reliability level for a service, expressed as a percentage (e.g., 99.9% availability). The gap between 100% and the SLO target is the error budget — the acceptable amount of unreliability. For a 99.9% SLO over 30 days, the error budget is 0.1%, which translates to about 43 minutes of allowed downtime per month.

Multi-Burn-Rate Alerts

Simple threshold alerts tend to fire too late on slow burns or too often on brief spikes. The multi-burn-rate pattern uses three alert tiers with different lookback windows: a 1-hour window for critical fast burns (14.4x rate), a 6-hour window for warning-level burns (6x rate), and a 3-day window for slow burns that generate tickets (1x rate). In the exported rules, each tier also checks a shorter window (5 minutes, 30 minutes, and 6 hours respectively), so an alert fires only while the budget is still being burned. This approach provides fast detection without excessive noise.
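The exported Prometheus rules combine a long and a short window with `and`. A minimal sketch of that condition, assuming a 99.9% target (error budget 0.001) and a hypothetical function name `should_fire`:

```python
def should_fire(long_window_error_ratio, short_window_error_ratio,
                burn_rate, error_budget=0.001):
    """True when both windows exceed burn_rate * error_budget."""
    threshold = burn_rate * error_budget
    return (long_window_error_ratio > threshold
            and short_window_error_ratio > threshold)

# Critical tier (14.4x): threshold is 1.44% errors.
print(should_fire(0.02, 0.02, burn_rate=14.4))     # True: still burning fast
print(should_fire(0.02, 0.0005, burn_rate=14.4))   # False: the last 5 minutes have recovered
```

The short window is what lets the alert stop firing soon after an incident ends, even though the long-window ratio stays elevated for a while.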

Export Formats

The calculator generates three export formats: Prometheus alerting rules with recording rules and multi-burn-rate alerts, OpenSLO YAML (the vendor-neutral open standard for SLO definitions), and Sloth config (a popular Prometheus SLO framework that generates recording and alerting rules).

How It Works

Everything runs in your browser. Your inputs are never sent to any server. The calculator computes error budgets, translates them to allowed downtime and failed requests, and generates alerting configurations that you can copy into your monitoring stack after replacing the placeholder service name (my-service) and metric selectors with your own.
