MTTR
Mean Time To Resolution (or Repair)
MTTR (Mean Time To Resolution or Mean Time To Repair) measures the average time from when an incident is detected to when it's fully resolved and service is restored.
How to Calculate MTTR
MTTR = Total downtime / Number of incidents
For example, if you had 10 incidents last month totaling 500 minutes of downtime:
MTTR = 500 / 10 = 50 minutes
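The same calculation can be sketched in Python, working from incident timestamps instead of a pre-summed total. This is a minimal illustration, not a reference implementation: the function name `mean_time_to_resolution`, the `(detected_at, resolved_at)` tuple layout, and the sample durations are all assumptions made for the example. The sample data is constructed so that it reproduces the 10 incidents and 500 minutes used above.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Return MTTR as a timedelta.

    `incidents` is a list of (detected_at, resolved_at) datetime pairs.
    This layout is an assumption for the sketch, not a standard schema.
    """
    if not incidents:
        raise ValueError("MTTR is undefined when there are no incidents")
    # Total downtime: sum of each incident's detection-to-resolution span.
    total_downtime = sum(
        (resolved - detected for detected, resolved in incidents),
        timedelta(),
    )
    return total_downtime / len(incidents)


# Hypothetical sample: 10 incidents whose durations add up to 500 minutes.
durations_min = [20, 35, 90, 45, 60, 15, 75, 50, 40, 70]
first_detection = datetime(2024, 1, 1, 9, 0)
incidents = [
    (
        first_detection + timedelta(days=i),
        first_detection + timedelta(days=i, minutes=minutes),
    )
    for i, minutes in enumerate(durations_min)
]

mttr = mean_time_to_resolution(incidents)
print(f"MTTR: {mttr.total_seconds() / 60:.0f} minutes")
```

Computing from timestamps keeps the per-incident durations available, which helps when a single long outage is skewing the average.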
Why MTTR Matters
MTTR directly impacts:
- Customer experience: Longer outages = more frustrated customers
- Revenue: Downtime costs enterprises $5,600+ per minute
- Team health: High MTTR correlates with on-call burnout
- Reputation: Frequent, long outages damage trust
Industry Benchmarks
| Performance Level | MTTR |
|---|---|
| Elite | < 1 hour |
| High | < 4 hours |
| Medium | < 24 hours |
| Low | > 24 hours |
*Source: DORA State of DevOps Report*
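To see where a measured MTTR lands against the table above, the thresholds can be encoded as a small lookup. This is a sketch only: the function name `performance_level` is hypothetical, and because the table does not say which level applies at exactly 24 hours, the code assumes that value falls into "Low".

```python
def performance_level(mttr_hours: float) -> str:
    """Map an MTTR in hours to a level from the benchmark table.

    Assumption: exactly 24 hours is treated as "Low", since the table
    leaves that boundary unspecified.
    """
    if mttr_hours < 0:
        raise ValueError("MTTR cannot be negative")
    if mttr_hours < 1:
        return "Elite"
    if mttr_hours < 4:
        return "High"
    if mttr_hours < 24:
        return "Medium"
    return "Low"


# The 50-minute MTTR from the earlier example, converted to hours.
print(performance_level(50 / 60))
```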
How to Reduce MTTR
1. Centralize operational context: Stop wasting time gathering information from 10+ tools
2. Automate detection: Catch issues before customers report them
3. Improve runbooks: Documented procedures speed resolution
4. Practice incident response: Teams that drill have faster response times
5. Conduct postmortems: Learn from each incident to prevent repeats