Runbooks

Each Pulse runbook is structured for the on-call engineer at 3 AM:

  1. What this alert means — one paragraph
  2. Where to look first — the exact dashboard panel / endpoint / log query
  3. Likely causes, in priority order — and the diagnostic that distinguishes them
  4. Mitigation — the safe, reversible action
  5. Permanent fix — what the post-mortem should track

Available runbooks

Runbook                      | Triggered by
Cardinality firewall overflow | PulseCardinalityOverflow alert
Timeout-budget exhausted      | PulseTimeoutBudgetExhausted alert
Trace context missing         | PulseTraceContextMissing alert
Error-budget burn             | Any SLO multi-window burn-rate alert generated by SLO-as-code
HikariCP saturation           | PulseHikariCpSaturated alert

The full alert set lives in alerts/prometheus/pulse-slo-alerts.yaml and is auto-loaded into the bundled Prometheus instance in the local stack.
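
For readers new to the term used in the error-budget burn row, here is a minimal sketch of what a multi-window burn-rate check computes. It is an illustration of the general technique, not Pulse's generated rules: the function names, the two-window pairing, and the 14.4 default threshold (a commonly cited paging value for a 1-hour window against a 30-day budget) are assumptions, and the real expressions are in the alert file named above.

```python
def burn_rate(error_ratio: float, slo_target: float) -> float:
    """Return how fast the error budget is being spent.

    1.0 means errors arrive at exactly the rate the SLO allows;
    2.0 means the budget would be gone in half the SLO period.
    """
    budget = 1.0 - slo_target  # allowed fraction of failed requests
    if budget <= 0:
        raise ValueError("slo_target must be below 1.0")
    return error_ratio / budget


def should_page(long_window_error_ratio: float,
                short_window_error_ratio: float,
                slo_target: float,
                threshold: float = 14.4) -> bool:
    """Hypothetical helper: fire only when BOTH windows exceed the threshold.

    The long window shows the burn is significant; the short window
    shows it is still happening, so the alert clears soon after recovery.
    """
    return (burn_rate(long_window_error_ratio, slo_target) >= threshold
            and burn_rate(short_window_error_ratio, slo_target) >= threshold)
```

With a 99.9% target, a sustained 2% error ratio in both windows is a burn rate of about 20 and would page under these assumptions, while a 2% long-window ratio paired with a recovered 0.1% short-window ratio would not.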