Escalation Policies
Escalation policies define a multi-stage notification workflow. When an alert fires and is not acknowledged within a configurable delay, the next stage is triggered. This ensures critical incidents reach the right people.
Endpoints
GET     /api/v1/escalation-policies        List all escalation policies for the tenant
POST    /api/v1/escalation-policies        Create a new escalation policy
GET     /api/v1/escalation-policies/{id}   Get a specific policy with all stages
PUT     /api/v1/escalation-policies/{id}   Replace a policy and its stages
DELETE  /api/v1/escalation-policies/{id}   Delete an escalation policy
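As a rough illustration of how a client might address these endpoints, the sketch below builds (but does not send) requests with Python's standard library. The host name, the bearer-token header, and the `build_request` helper are assumptions made for the example; they are not part of the documented API.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://alerts.example.com"  # hypothetical host, not from the docs
POLICIES_PATH = "/api/v1/escalation-policies"


def build_request(method: str,
                  policy_id: Optional[str] = None,
                  body: Optional[dict] = None,
                  token: str = "YOUR_TOKEN") -> urllib.request.Request:
    """Build a request for one of the five policy endpoints without sending it."""
    url = BASE_URL + POLICIES_PATH
    if policy_id is not None:
        url += "/" + policy_id
    # Only POST and PUT carry a JSON body.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {
        "Accept": "application/json",
        "Authorization": "Bearer " + token,  # auth scheme is an assumption
    }
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


list_req = build_request("GET")
get_req = build_request("GET", policy_id="POLICY_ID")
create_req = build_request("POST", body={"name": "x", "stages": []})
delete_req = build_request("DELETE", policy_id="POLICY_ID")
# To actually send one: urllib.request.urlopen(list_req)
```

Keeping request construction separate from sending makes the URL and method mapping easy to check without a live server.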
Policy Schema
EscalationPolicy Model
    from typing import Optional

    from pydantic import BaseModel

    class EscalationTarget(BaseModel):
        type: str            # "user" | "group" | "schedule" | "channel"
        id: str              # UUID of target entity
        channels: list[str]  # ["email", "sms", "slack"]

    class EscalationStage(BaseModel):
        stage_order: int                 # 1, 2, 3... (ascending)
        delay_minutes: int               # Minutes after the alert fires before this stage triggers (0 = immediately)
        targets: list[EscalationTarget]  # Who to notify

    class EscalationPolicyCreate(BaseModel):
        name: str
        description: Optional[str] = None
        stages: list[EscalationStage]     # Ordered list of stages
        repeat_interval_minutes: int = 0  # 0 = no repeat after last stage
Escalation Timing Logic
APScheduler manages escalation timing, with one dedicated job per active alert incident. The escalation engine works as follows:
Escalation State Machine
    # When alert fires:
    #   t=0:  Stage 1 fires immediately (delay_minutes=0)
    #         Notification sent to stage 1 targets
    #   t=15: If still unacknowledged after 15 min:
    #         Stage 2 fires
    #         Notification sent to stage 2 targets
    #   t=30: If still unacknowledged after 30 min:
    #         Stage 3 fires
    #         Notification sent to stage 3 targets
    #   t=90: If repeat_interval_minutes=60 and 60 min since stage 3:
    #         Cycle repeats from stage 1
    # On acknowledgement: Escalation job is cancelled
    # On resolution: Escalation job is cancelled + notify resolution
Target Resolution
At notification time, the service resolves dynamic targets:
- User target — Directly maps to a user's contact info (email, phone)
- Group target — Expands to all active members of the group at notification time
- Schedule target — Resolves to the currently on-call user based on the schedule configuration
- Channel target — Sends to a configured notification channel (Slack webhook, email list, etc.)
Dynamic Resolution
Schedule targets are resolved at the moment of notification, not when the policy is created. This ensures rotations work correctly even if the on-call person changes between when the alert fires and when an escalation stage triggers.
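The four target types and the resolve-at-notification-time rule can be sketched as follows. This is a minimal illustration only: the in-memory `GROUP_MEMBERS` and `SCHEDULES` tables, the `Shift` type, and the `resolve_target` function are hypothetical stand-ins for whatever stores the service actually uses.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Shift:
    user_id: str
    start: datetime  # inclusive
    end: datetime    # exclusive


UTC = timezone.utc

# Hypothetical data: group id -> (user id, is_active) pairs.
GROUP_MEMBERS = {"dba-team": [("alice", True), ("bob", False), ("carol", True)]}

# Hypothetical data: schedule id -> on-call shifts.
SCHEDULES = {
    "dba-on-call": [
        Shift("alice", datetime(2024, 1, 1, 0, 0, tzinfo=UTC),
              datetime(2024, 1, 1, 12, 0, tzinfo=UTC)),
        Shift("carol", datetime(2024, 1, 1, 12, 0, tzinfo=UTC),
              datetime(2024, 1, 2, 0, 0, tzinfo=UTC)),
    ]
}


def resolve_target(target_type: str, target_id: str, now: datetime) -> list[str]:
    """Return the recipients for one target as of `now` (the notification time)."""
    if target_type == "user":
        return [f"user:{target_id}"]
    if target_type == "group":
        # Only members who are active at notification time are included.
        return [f"user:{uid}" for uid, active in GROUP_MEMBERS.get(target_id, []) if active]
    if target_type == "schedule":
        # Whoever holds the shift covering `now` is on call.
        return [f"user:{s.user_id}" for s in SCHEDULES.get(target_id, [])
                if s.start <= now < s.end]
    if target_type == "channel":
        return [f"channel:{target_id}"]
    raise ValueError(f"unknown target type: {target_type!r}")


# The same schedule target resolves differently before and after a handover.
before = resolve_target("schedule", "dba-on-call", datetime(2024, 1, 1, 11, 50, tzinfo=UTC))
after = resolve_target("schedule", "dba-on-call", datetime(2024, 1, 1, 12, 5, tzinfo=UTC))
```

Because `now` is passed in at call time, an alert that fires at 11:50 and escalates at 12:05 reaches the incoming on-call user rather than the one who was on call when the alert fired.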
Example: Create Policy
POST /api/v1/escalation-policies
{
  "name": "Critical Redshift Alerts",
  "description": "3-tier escalation for critical Redshift issues",
  "stages": [
    {
      "stage_order": 1,
      "delay_minutes": 0,
      "targets": [
        {
          "type": "schedule",
          "id": "on-call-dba-schedule-id",
          "channels": ["sms", "email"]
        }
      ]
    },
    {
      "stage_order": 2,
      "delay_minutes": 15,
      "targets": [
        {
          "type": "group",
          "id": "dba-team-group-id",
          "channels": ["slack"]
        }
      ]
    },
    {
      "stage_order": 3,
      "delay_minutes": 30,
      "targets": [
        {
          "type": "user",
          "id": "engineering-director-user-id",
          "channels": ["email", "sms"]
        }
      ]
    }
  ],
  "repeat_interval_minutes": 60
}
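To see when each stage of the example policy above would fire if nobody acknowledges the alert, the timing rules can be sketched as a small generator. This is an illustration under stated assumptions, not the service's scheduler code: `escalation_timeline` and `horizon_minutes` are hypothetical names, `delay_minutes` is taken as minutes since the alert fired, and a repeated cycle is assumed to reuse the same per-stage delays, starting `repeat_interval_minutes` after the last stage.

```python
from typing import Iterator


def escalation_timeline(stages: list[dict],
                        repeat_interval_minutes: int = 0,
                        horizon_minutes: int = 240) -> Iterator[tuple[int, int]]:
    """Yield (minutes_since_alert, stage_order) for an unacknowledged alert."""
    ordered = sorted(stages, key=lambda s: s["stage_order"])
    if not ordered:
        return
    # Offset of the latest stage within a single cycle.
    last_delay = max(s["delay_minutes"] for s in ordered)
    cycle_start = 0
    while True:
        for stage in ordered:
            t = cycle_start + stage["delay_minutes"]
            if t > horizon_minutes:
                return
            yield t, stage["stage_order"]
        if repeat_interval_minutes <= 0:
            return  # 0 = no repeat after the last stage
        # Assumption: the next cycle begins repeat_interval_minutes after the last stage.
        cycle_start += last_delay + repeat_interval_minutes


# Stage timings taken from the example policy (targets omitted).
example_stages = [
    {"stage_order": 1, "delay_minutes": 0},
    {"stage_order": 2, "delay_minutes": 15},
    {"stage_order": 3, "delay_minutes": 30},
]

timeline = list(escalation_timeline(example_stages, repeat_interval_minutes=60,
                                    horizon_minutes=120))
# timeline == [(0, 1), (15, 2), (30, 3), (90, 1), (105, 2), (120, 3)]
```

In a real deployment each of these times would correspond to a scheduled job that is cancelled on acknowledgement or resolution; the generator only models the unacknowledged case.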