What is an Incident? Definition, Types & When to Declare

Incident Definition

An incident is any unplanned event that disrupts or degrades a service, requiring a coordinated response to restore normal operations. Unlike routine issues that can be handled through standard processes, incidents demand immediate attention and often involve multiple people or teams.

Key Characteristics of an Incident

• Impact: Affects users, business operations, or system health
• Urgency: Requires immediate or near-immediate attention
• Coordination: Often needs multiple people to resolve
• Visibility: Should be tracked and documented

Alerts vs Incidents vs Outages

These terms are often confused, but understanding the distinction is crucial for effective incident management.

Alert

An automated notification triggered by a monitoring system when a metric crosses a threshold.

Example: "CPU usage exceeded 80% on server-prod-01"

Not all alerts become incidents. Many alerts are noise or self-resolving.
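
As a rough sketch of the threshold check behind an alert like the example above, the snippet below compares a metric reading against a limit and builds the notification text. The `Alert` class and `check_threshold` function are hypothetical names chosen for illustration and do not belong to any particular monitoring tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Alert:
    """A notification produced when a metric crosses its threshold."""
    host: str
    metric: str
    value: float
    threshold: float

    def message(self) -> str:
        # ":g" drops the trailing ".0" so 80.0 is rendered as "80"
        return f"{self.metric} exceeded {self.threshold:g}% on {self.host}"


def check_threshold(host: str, metric: str, value: float,
                    threshold: float) -> Optional[Alert]:
    """Return an Alert if value is strictly above threshold, else None."""
    if value > threshold:
        return Alert(host, metric, value, threshold)
    return None


# Hypothetical reading: 91.5% CPU against an 80% threshold
alert = check_threshold("server-prod-01", "CPU usage", 91.5, 80.0)
if alert is not None:
    print(alert.message())  # CPU usage exceeded 80% on server-prod-01
```

Note that the check only produces an alert. Whether that alert becomes an incident is a separate decision.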

Incident

A declared event that requires a coordinated response. It may be triggered by alerts, user reports, or proactive detection.

Example: "Checkout flow failing for 15% of users"

Incidents are explicitly declared and tracked with a lifecycle.

Outage

Complete unavailability of a service or critical function. All outages are incidents, but not all incidents are outages.

Example: "Website completely unreachable"

Outages are typically SEV0 or SEV1 incidents, the highest severity levels.

When to Declare an Incident

Knowing when to declare an incident is one of the most important skills in incident management. Declare too early and you create unnecessary overhead. Declare too late and you waste precious response time.

Declare an incident when:

Users are impacted: Even a small percentage of users experiencing issues warrants attention
Revenue is at risk: Anything affecting transactions, sign-ups, or billing
Data integrity is threatened: Data loss, corruption, or security concerns
SLAs or SLOs are breached or at risk: Your error budget is exhausted or burning down quickly
Multiple alerts are firing: Correlated alerts suggest a systemic issue
You're unsure: When in doubt, declare. It's easier to downgrade than to catch up
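
The criteria above can be sketched as a simple checklist in code. This is an illustrative assumption, not a prescribed policy: the `Signals` fields, the `reasons_to_declare` helper, and the default of two correlated alerts are all hypothetical, and a real team would tune them to its own services.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Signals:
    """Observations available to a responder (all fields are hypothetical)."""
    users_impacted: bool = False
    revenue_at_risk: bool = False
    data_integrity_threatened: bool = False
    sla_at_risk: bool = False
    correlated_alerts: int = 0
    responder_unsure: bool = False


def reasons_to_declare(signals: Signals,
                       min_correlated_alerts: int = 2) -> List[str]:
    """Return every declaration criterion that the signals satisfy."""
    checks = [
        (signals.users_impacted, "users impacted"),
        (signals.revenue_at_risk, "revenue at risk"),
        (signals.data_integrity_threatened, "data integrity threatened"),
        (signals.sla_at_risk, "SLA breached or at risk"),
        (signals.correlated_alerts >= min_correlated_alerts,
         "multiple correlated alerts"),
        (signals.responder_unsure, "responder unsure"),
    ]
    return [reason for triggered, reason in checks if triggered]


def should_declare(signals: Signals) -> bool:
    """Any single criterion is enough to declare."""
    return bool(reasons_to_declare(signals))


signals = Signals(users_impacted=True, correlated_alerts=3)
print(should_declare(signals))      # True
print(reasons_to_declare(signals))  # ['users impacted', 'multiple correlated alerts']
```

Returning the list of reasons, rather than only a yes or no, gives the incident record a starting description of why it was declared.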

Pro Tip: Lower the Bar for Declaration

Many teams set the bar for declaring incidents too high. This leads to "shadow incidents" that go untracked, so the team never learns from them. Make declaring an incident easy and low-friction. You can always close an incident quickly if it turns out to be a non-issue.

Types of Incidents

Incidents come in various forms. Understanding the types helps with response planning and post-incident analysis.

By Impact

Customer-facing: Users directly experience the issue
Internal: Internal tools or processes are affected
Infrastructure: Underlying systems are degraded
Security: Security vulnerabilities or breaches

By Cause

Deployment-related: Issues caused by code changes
Infrastructure: Hardware, cloud, or network problems
Dependency: Third-party services failing
Capacity: Systems overwhelmed by load
Configuration: Misconfigurations or feature-flag changes
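
One way to make these categories useful for post-incident analysis is to record them as structured tags on each incident. The sketch below assumes a hypothetical `IncidentRecord` type and `count_by_cause` helper. The enum values mirror the two lists above, and the cause is optional because it is often unknown when the incident is declared.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Impact(Enum):
    CUSTOMER_FACING = "customer-facing"
    INTERNAL = "internal"
    INFRASTRUCTURE = "infrastructure"
    SECURITY = "security"


class Cause(Enum):
    DEPLOYMENT = "deployment-related"
    INFRASTRUCTURE = "infrastructure"
    DEPENDENCY = "dependency"
    CAPACITY = "capacity"
    CONFIGURATION = "configuration"


@dataclass
class IncidentRecord:
    title: str
    impact: Impact
    cause: Optional[Cause] = None  # filled in once the investigation finds it


def count_by_cause(records: List[IncidentRecord]) -> Dict[str, int]:
    """Tally incidents per cause; records without a cause count as 'unknown'."""
    labels = [r.cause.value if r.cause else "unknown" for r in records]
    return dict(Counter(labels))


# Hypothetical records for illustration only
records = [
    IncidentRecord("Checkout errors", Impact.CUSTOMER_FACING, Cause.DEPLOYMENT),
    IncidentRecord("Build queue stalled", Impact.INTERNAL, Cause.CAPACITY),
    IncidentRecord("API latency spike", Impact.CUSTOMER_FACING, Cause.CAPACITY),
    IncidentRecord("Login failures", Impact.CUSTOMER_FACING),
]
print(count_by_cause(records))
```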

What Happens After Declaration

Once declared, an incident enters a formal lifecycle:

Acknowledgment: Someone takes ownership
Assessment: Determine severity and impact
Communication: Notify stakeholders
Investigation: Find the root cause
Mitigation: Stop the bleeding by reducing user impact, for example with a rollback or failover
Resolution: Fix the underlying issue
Post-mortem: Learn and improve

For more details, see our guide on the incident lifecycle.
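
The lifecycle stages listed above can be sketched as a small state machine. This is a simplified model under stated assumptions: the `Stage` enum and `Incident` class are hypothetical, and the strictly linear progression is a simplification, since in practice stages such as communication usually continue throughout the response.

```python
from enum import IntEnum
from typing import List


class Stage(IntEnum):
    """Lifecycle stages in the order given in this guide."""
    ACKNOWLEDGMENT = 1
    ASSESSMENT = 2
    COMMUNICATION = 3
    INVESTIGATION = 4
    MITIGATION = 5
    RESOLUTION = 6
    POST_MORTEM = 7


class Incident:
    def __init__(self, title: str) -> None:
        self.title = title
        self.stage = Stage.ACKNOWLEDGMENT
        self.history: List[Stage] = [self.stage]

    def advance(self) -> Stage:
        """Move to the next stage; raise once the post-mortem is reached."""
        if self.stage is Stage.POST_MORTEM:
            raise ValueError("incident is already at the final stage")
        self.stage = Stage(self.stage + 1)
        self.history.append(self.stage)
        return self.stage


incident = Incident("Checkout flow failing for 15% of users")
incident.advance()
print(incident.stage.name)  # ASSESSMENT
```

Keeping a history of stages on the record gives the post-mortem a simple view of how the response progressed.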

Best Practices

Document your incident definition: Everyone should know what qualifies as an incident
Make declaration easy: One command or button to start an incident
Use severity levels: Not all incidents need the same response (see the guide on severity levels)
Track everything: Every incident should be documented, even minor ones
Review regularly: Analyze incident trends to find systemic issues
