SLA
Service Level Agreement
A Service Level Agreement (SLA) is a contractual commitment to deliver a specific level of service, with defined consequences (usually financial) for missing targets.
SLA vs SLO
| Aspect | SLO | SLA |
|---|---|---|
| Nature | Internal target | External contract |
| Consequence | Engineering focus | Financial/legal |
| Flexibility | Can adjust quarterly | Locked in contract |
| Target | Aspirational | Conservative |
Best practice: SLO should be stricter than SLA
If your SLA is 99.9%, your SLO might be 99.95%. Missing the internal target then warns you early, giving you a buffer before contractual penalties apply.
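To make the buffer concrete, the sketch below converts both targets into monthly downtime budgets and takes the difference. The function name `downtime_budget_minutes` and the choice of an average month (365/12 days) are assumptions made for illustration; a real contract defines its own measurement period.

```python
# Average month in minutes: 365 days / 12 months (assumption; ignores leap years).
MINUTES_PER_MONTH = 365 / 12 * 24 * 60  # 43,800


def downtime_budget_minutes(availability_pct: float,
                            period_minutes: float = MINUTES_PER_MONTH) -> float:
    """Return the downtime a target allows over the period, in minutes."""
    return period_minutes * (1 - availability_pct / 100)


sla_budget = downtime_budget_minutes(99.9)    # contractual limit
slo_budget = downtime_budget_minutes(99.95)   # stricter internal limit
buffer_minutes = sla_budget - slo_budget      # room between SLO miss and SLA breach

print(f"SLA budget: {sla_budget:.1f} min/month")
print(f"SLO budget: {slo_budget:.1f} min/month")
print(f"Buffer:     {buffer_minutes:.1f} min/month")
```

Under these assumptions the SLO allows roughly half the downtime the SLA does, so an SLO miss leaves about the same amount again before the contract is breached.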
Common SLA Components
1. Service description - What's covered?
2. Performance metrics - How is it measured?
3. Target levels - What's the commitment?
4. Measurement period - Monthly? Quarterly?
5. Exclusions - What doesn't count?
6. Remedies - What happens if missed?
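The six components above can be sketched as a small data structure with a compliance check. Everything here is hypothetical: the `SLA` class, its field names, the flat service credit, and the convention of removing excluded minutes from both the downtime and the measurement period are illustrative assumptions, not a standard.

```python
from dataclasses import dataclass, field


@dataclass
class SLA:
    service: str                  # 1. Service description
    metric: str                   # 2. Performance metric
    target_pct: float             # 3. Target level
    period_minutes: int           # 4. Measurement period
    exclusions: list[str] = field(default_factory=list)  # 5. Exclusions
    credit_pct_if_missed: float = 0.0  # 6. Remedy (hypothetical flat credit)


def measured_availability(sla: SLA, downtime_minutes: float,
                          excluded_minutes: float = 0.0) -> float:
    """Availability in percent, ignoring downtime covered by exclusions.

    Assumes excluded_minutes is the part of downtime_minutes that falls
    under an exclusion, and removes it from the period as well.
    """
    counted_period = sla.period_minutes - excluded_minutes
    counted_downtime = downtime_minutes - excluded_minutes
    return 100 * (1 - counted_downtime / counted_period)


sla = SLA(
    service="Example API",            # hypothetical service
    metric="availability",
    target_pct=99.9,
    period_minutes=30 * 24 * 60,      # a 30-day month
    exclusions=["scheduled maintenance", "force majeure"],
    credit_pct_if_missed=10.0,
)

# 90 minutes of downtime, 60 of them inside a maintenance window.
raw = measured_availability(sla, downtime_minutes=90)
adjusted = measured_availability(sla, downtime_minutes=90, excluded_minutes=60)

print(f"Without exclusions: {raw:.3f}% (met: {raw >= sla.target_pct})")
print(f"With exclusions:    {adjusted:.3f}% (met: {adjusted >= sla.target_pct})")
```

The same outage misses the target when measured raw and meets it once the maintenance window is excluded, which is why the exclusions and the calculation method need to be agreed in advance.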
SLA Best Practices
- Be specific - Vague SLAs lead to disputes
- Measure accurately - Agree on how metrics are calculated
- Set achievable targets - Don't promise what you can't deliver
- Build in buffer - SLO > SLA
- Define exclusions clearly - Maintenance windows, force majeure
- Review regularly - As systems change, SLAs should too
The "Nines" of Availability
| Availability | Downtime/Year | Downtime/Month |
|---|---|---|
| 99% (two 9s) | 3.65 days | 7.3 hours |
| 99.9% (three 9s) | 8.76 hours | 43.8 minutes |
| 99.99% (four 9s) | 52.6 minutes | 4.38 minutes |
| 99.999% (five 9s) | 5.26 minutes | 26.3 seconds |
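The figures in the table follow from multiplying the period length by the unavailable fraction. The sketch below reproduces them, assuming a 365-day year and an average month of 365/12 days (730 hours); the function name `allowed_downtime` is chosen here for illustration.

```python
from datetime import timedelta

HOURS_PER_YEAR = 365 * 24               # 8,760; ignores leap years
HOURS_PER_MONTH = HOURS_PER_YEAR / 12   # 730; an average month


def allowed_downtime(availability_pct: float, period_hours: float) -> timedelta:
    """Return the downtime permitted over the period at the given availability."""
    unavailable_fraction = 1 - availability_pct / 100
    return timedelta(hours=period_hours * unavailable_fraction)


for pct in (99.0, 99.9, 99.99, 99.999):
    yearly = allowed_downtime(pct, HOURS_PER_YEAR)
    monthly = allowed_downtime(pct, HOURS_PER_MONTH)
    print(f"{pct}%: {yearly} per year, {monthly} per month")
```

Each additional nine cuts the allowed downtime by a factor of ten, so a target that looks only slightly higher is an order of magnitude harder to meet.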