Integration reliability depends on quick triage and predictable ownership. This runbook defines the daily checks and the escalation path that keep response times consistent.
Core checks every morning
- Failed job count by connector and region.
- Average sync delay compared with the SLA target.
- Payload validation failures grouped by source system.
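
The three morning checks can be sketched as a single aggregation over job records. This is a minimal illustration, not the actual monitoring implementation: the `JobRecord` fields, the status values ("ok", "failed", "validation_error"), and the `sla_minutes` parameter are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean


@dataclass
class JobRecord:
    # Hypothetical record shape; real connector logs will differ.
    connector: str
    region: str
    source_system: str
    status: str  # assumed values: "ok", "failed", "validation_error"
    sync_delay_minutes: float


def morning_checks(jobs, sla_minutes):
    """Summarize the three core checks for a list of JobRecord items."""
    # Check 1: failed job count by (connector, region).
    failed = Counter(
        (j.connector, j.region) for j in jobs if j.status == "failed"
    )
    # Check 2: average sync delay, compared with the assumed SLA target.
    avg_delay = mean(j.sync_delay_minutes for j in jobs) if jobs else 0.0
    # Check 3: payload validation failures grouped by source system.
    validation = Counter(
        j.source_system for j in jobs if j.status == "validation_error"
    )
    return {
        "failed_by_connector_region": dict(failed),
        "avg_sync_delay_minutes": avg_delay,
        "within_sla": avg_delay <= sla_minutes,
        "validation_failures_by_source": dict(validation),
    }
```

Returning one dictionary keeps the morning review to a single call; each key maps to one bullet in the list above.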
Escalation path
- Level 1: retry the failed job and run schema validation.
- Level 2: connector owner and business ops review.
- Level 3: executive escalation when an SLA breach exceeds 4 hours.
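
The escalation path above could be encoded as a small decision function so that routing stays consistent between responders. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and the rule that an incident stays at Level 1 only when both the retry and the schema validation succeed is an interpretation, not something the runbook states. The 4-hour threshold comes from the Level 3 rule.

```python
SLA_BREACH_EXEC_HOURS = 4  # threshold taken from the Level 3 rule


def escalation_level(retry_succeeded, schema_valid, sla_breach_hours):
    """Return the escalation level (1, 2, or 3) for an incident.

    Assumed mapping (hypothetical, adjust to the team's actual policy):
    - Level 3 when the SLA breach is above 4 hours.
    - Level 1 when the retry succeeded and the payload schema is valid.
    - Level 2 otherwise (connector owner and business ops review).
    """
    if sla_breach_hours > SLA_BREACH_EXEC_HOURS:
        return 3
    if retry_succeeded and schema_valid:
        return 1
    return 2
```

The breach check comes first so that a long-running breach reaches executive escalation even if a late retry happens to succeed.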