
Integrate Your Monitoring System With PagerDuty Using Events API V2

· 2 min read
Hrishikesh Barua
Founder @IncidentHub.cloud

PagerDuty's Events API V2 lets you push events from your monitoring systems to PagerDuty. You can push such events when there is a triggered, updated, or resolved incident.

An incident typically goes through these states during its lifecycle:

| State | Triggered By | Source |
| --- | --- | --- |
| Triggered | Automatic | Monitoring system |
| Acknowledged | On-call Engineer | PagerDuty app/Phone call |
| Updated | Automatic | Monitoring system |
| Resolved | On-call Engineer | PagerDuty app/Phone call |

You can either use one of the PagerDuty client SDKs to send events, or roll your own.
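If you roll your own, the core of it is a single HTTP POST. Here is a minimal sketch, assuming Node.js 18+ (for the built-in `fetch`) and the public Events API V2 endpoint `https://events.pagerduty.com/v2/enqueue`. `sendEvent` is a hypothetical helper name, not part of any PagerDuty SDK, and the `fetchImpl` parameter exists only so the function can be exercised without a network call.

```javascript
// Sketch of a hand-rolled Events API V2 sender (no SDK).
// Assumption: Node.js 18+ so that a global fetch is available.
const EVENTS_API_URL = "https://events.pagerduty.com/v2/enqueue";

// Hypothetical helper: POSTs one event and returns the parsed JSON response.
async function sendEvent(event, fetchImpl = fetch) {
  const response = await fetchImpl(EVENTS_API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(event),
  });
  if (!response.ok) {
    // Surface the HTTP status so the caller can decide whether to retry.
    throw new Error(`PagerDuty rejected the event: HTTP ${response.status}`);
  }
  return response.json();
}
```

A real integration would likely add retries and rate-limit handling on top of this, which is one reason to prefer an SDK if one exists for your language.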

Many self-hosted and SaaS monitoring tools have a built-in PagerDuty integration, where you only need to provide the API key.

A typical event push will look like this (example in NodeJS):

import { event } from "@pagerduty/pdjs";

// .....
event({
  "data": {
    "routing_key": "Your-Routing-Key-Here",
    "event_action": "trigger",
    // Must stay the same for every event that belongs to this incident
    "dedup_key": DEDUP_KEY,
    "payload": {
      "summary": "Event processor in us-east-1",
      "source": "rnmd-2398.xyzcloud.io",
      "severity": "critical",
      "timestamp": "2024-07-17T08:42:58.315+0000",
    },
    "links": [
      {
        "href": "https://incidenthub.cloud/dashboard",
        "text": "Go to dashboard",
      },
    ],
  },
})
  .then(console.log)
  .catch(console.error);
// .....

When your monitoring system sends this event to trigger an incident, it's important to have a unique DEDUP_KEY. PagerDuty uses this field to decide whether subsequent events are grouped with this incident. When your system sends an update or a resolve event, the DEDUP_KEY must match the one sent in the trigger call. In other words, the DEDUP_KEY must be unique per incident and stay the same for the lifetime of that incident.
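To make the DEDUP_KEY rule concrete, here is a small sketch of building the events for one incident. `buildEvent` is a hypothetical helper, and the routing key and dedup key values are placeholders. It assumes, as far as I understand the Events API V2, that the valid `event_action` values are `trigger`, `acknowledge`, and `resolve`, that an update is sent as another `trigger` with the same `dedup_key`, and that only `trigger` events need a `payload`.

```javascript
// Assumption: these are the event_action values accepted by Events API V2.
const EVENT_ACTIONS = ["trigger", "acknowledge", "resolve"];

// Hypothetical helper: builds an event body for one incident.
// Every event for the same incident must be built with the same dedupKey.
function buildEvent(routingKey, dedupKey, action, payload) {
  if (!EVENT_ACTIONS.includes(action)) {
    throw new Error(`Unknown event_action: ${action}`);
  }
  const event = {
    routing_key: routingKey,
    event_action: action,
    dedup_key: dedupKey,
  };
  // Only trigger events carry a payload in this sketch.
  if (action === "trigger") {
    event.payload = payload;
  }
  return event;
}

// Placeholder values for illustration only.
const incidentKey = "incident-2398";
const triggerEvent = buildEvent("Your-Routing-Key-Here", incidentKey, "trigger", {
  summary: "Event processor in us-east-1",
  source: "rnmd-2398.xyzcloud.io",
  severity: "critical",
});
const resolveEvent = buildEvent("Your-Routing-Key-Here", incidentKey, "resolve");
```

Because `triggerEvent` and `resolveEvent` share `incidentKey`, PagerDuty can match the resolve to the incident that the trigger opened.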

IncidentHub integrates with PagerDuty and uses the incident's public URL as the DEDUP_KEY, since the URL is globally unique and stays the same for the life of the incident. Every update event for an incident carries the same DEDUP_KEY.
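If you use a URL as the key in your own integration, a small guard can catch bad input early. This is only a sketch and not how IncidentHub itself is implemented: `dedupKeyFromIncidentUrl` is a hypothetical helper, and the 255-character limit is my understanding of PagerDuty's documented maximum for `dedup_key`, so verify it against the current API reference.

```javascript
// Assumption: PagerDuty limits dedup_key to 255 characters.
const MAX_DEDUP_KEY_LENGTH = 255;

// Hypothetical helper: turns an incident's public URL into a dedup_key.
function dedupKeyFromIncidentUrl(incidentUrl) {
  // new URL() throws on malformed input and normalizes the string form,
  // so the same URL always produces the same key.
  const key = new URL(incidentUrl).toString();
  if (key.length > MAX_DEDUP_KEY_LENGTH) {
    throw new Error(
      `dedup_key is ${key.length} characters; the assumed limit is ${MAX_DEDUP_KEY_LENGTH}`
    );
  }
  return key;
}
```

Calling it with the same incident URL for the trigger, each update, and the resolve gives you the same key every time.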

Let us look at a Google Cloud example. An incident affecting Anthos Service Mesh in Nov 2023 went through four updates, including the trigger and the resolve. The incident's URL remained the same throughout its lifecycle.

References