SLO
Service Level Objective
A Service Level Objective (SLO) is a target level of reliability for a service, expressed as a measurable goal (e.g., 99.9% availability).
SLO vs SLA vs SLI
SLI (Service Level Indicator): The metric you measure - Example: Request latency, error rate, availability
SLO (Service Level Objective): The target for that metric - Example: 99.9% of requests complete in <200ms
SLA (Service Level Agreement): The contract with consequences - Example: If we miss 99.9%, customer gets credits
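Relationship: SLI → SLO → SLA (an SLI is what you measure, an SLO sets a target for that SLI, and an SLA attaches consequences to missing the target)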
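To make the distinction concrete, here is a minimal sketch of the three terms in code. The `Request` record, the function names, and the sample data are hypothetical and only for illustration; the 200 ms threshold and 99.9% target come from the examples above.

```python
from dataclasses import dataclass


@dataclass
class Request:
    latency_ms: float
    ok: bool


def latency_sli(requests, threshold_ms=200.0):
    """SLI: the measured fraction of requests completing under the threshold."""
    if not requests:
        raise ValueError("no requests to measure")
    fast = sum(1 for r in requests if r.latency_ms < threshold_ms)
    return fast / len(requests)


def meets_slo(sli, target=0.999):
    """SLO: the target that the measured SLI is compared against."""
    return sli >= target


def sla_credit_owed(sli, contractual_target=0.999):
    """SLA: the contractual consequence when the agreed target is missed."""
    return sli < contractual_target


# Hypothetical sample: 9,995 fast requests and 5 slow ones.
sample = [Request(120.0, True)] * 9995 + [Request(450.0, True)] * 5
sli = latency_sli(sample)
print(f"SLI = {sli:.4f}, SLO met: {meets_slo(sli)}, credit owed: {sla_credit_owed(sli)}")
```

In practice the SLA target is often set looser than the internal SLO, so the two thresholds are kept as separate parameters here.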
Why SLOs Matter
SLOs answer the critical question: "How reliable is good enough?"
Without SLOs:
- Teams chase 100% reliability (impossible, expensive)
- No framework for prioritizing reliability vs features
- Arguments about what "reliable" means
Common SLO Types
Availability: % of time service is up - Example: 99.95% availability (about 22 min downtime per 30-day month)
Latency: Response time percentiles - Example: 99% of requests <200ms, 99.9% <1s
Error Rate: % of requests that fail - Example: <0.1% error rate
Throughput: Capacity delivered - Example: Support 10,000 requests/second
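The downtime figure quoted for an availability target is simple arithmetic, sketched below. The function name and the 30-day period are assumptions made for illustration; a different period length changes the result.

```python
def allowed_downtime_minutes(availability_pct, period_days=30):
    """Minutes of downtime permitted by an availability target over the period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - availability_pct) / 100


# Compare a few common targets over an assumed 30-day month.
for target in (99.0, 99.9, 99.95, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% availability allows {minutes:.1f} min of downtime per 30 days")
```

For 99.95% this gives roughly 21.6 minutes, which matches the "22 min downtime/month" figure above once rounded.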
Setting Good SLOs
1. Start with user expectations
2. Measure current performance
3. Set achievable targets (with room to improve)
4. Define what "counts" clearly
5. Review and adjust quarterly
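Steps 2 and 3 can be sketched as a small helper that looks at measured availability and picks a starting target. The heuristic (choose the strictest candidate that every measured window would have met), the function name, and the candidate list are all assumptions for illustration, not an established rule; user expectations from step 1 should still drive the final choice.

```python
def suggest_target(measured_pct, candidates=(99.0, 99.5, 99.9, 99.95, 99.99)):
    """Return the strictest candidate target that every measured window met.

    measured_pct: availability percentages from past windows (e.g. weekly).
    Returns None when even the loosest candidate was missed.
    """
    if not measured_pct:
        raise ValueError("need at least one measurement")
    worst = min(measured_pct)
    achievable = [c for c in candidates if c <= worst]
    return max(achievable) if achievable else None


# Hypothetical weekly availability measurements.
history = [99.97, 99.93, 99.99, 99.96]
print("Suggested starting SLO:", suggest_target(history))
```

Starting from a target the service already meets leaves room to tighten it at the quarterly review once the measurements improve.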