PagerDuty Integration

Introduction

StatusPage.io integrates with PagerDuty by parsing the webhooks that are sent when PagerDuty incidents are triggered, acknowledged, resolved, or otherwise updated.

General features of the integration include:

  • Ability to degrade one or more components when a PagerDuty incident is triggered
  • Ability to create a StatusPage.io incident from an Incident Template when a PagerDuty incident is triggered
  • Ability to wait for an Acknowledge signal from PagerDuty before creating the incident within StatusPage.io
  • Ability to update/resolve the StatusPage.io incident as the PagerDuty incident is updated/resolved, with support for a different Incident Template for each
  • Ability to coalesce many concurrent PagerDuty incidents into a single StatusPage.io incident
  • Ability to ignore any PagerDuty incidents during an active StatusPage.io scheduled maintenance period
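
To make the webhook parsing described above a little more concrete, here's a minimal sketch in Python. It is not StatusPage.io's actual implementation. The payload shape (a messages list whose entries carry a type such as incident.trigger and a data.incident object with id and service.name) is an assumption based on PagerDuty's legacy webhook format, so check the field names against PagerDuty's documentation before relying on them. The parse_webhook function and the LIFECYCLE table are hypothetical names used only for illustration.

```python
import json

# Assumed event-type strings; anything not listed here is treated as a
# generic update (escalations, reassignments, and so on).
LIFECYCLE = {
    "incident.trigger": "triggered",
    "incident.acknowledge": "acknowledged",
    "incident.resolve": "resolved",
}


def parse_webhook(body: str) -> list:
    """Reduce a webhook body to one small dict per PagerDuty event."""
    payload = json.loads(body)
    events = []
    for message in payload.get("messages", []):
        incident = message.get("data", {}).get("incident", {})
        events.append({
            "state": LIFECYCLE.get(message.get("type"), "updated"),
            "incident_id": incident.get("id"),
            "service": incident.get("service", {}).get("name"),
        })
    return events
```

Each resulting event carries the three things an automation rule needs to act on: which PagerDuty service fired, which incident it belongs to, and where that incident is in its lifecycle.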

Step 1

Go to the Free Add-ons section in the StatusPage.io management portal, and enable the PagerDuty Add-on.

Step 2

Once enabled, you'll need to enter your PagerDuty subdomain and a full-access "v1 Legacy" API key (a read-only key won't work).

Step 3

Review the list of services pulled in from PagerDuty, along with the Automation Rules you can put in place to make sure your page shows exactly what you want.

Step 4

If you wish, you can set up a few Incident Templates that will be used to automatically open up incidents on your page when a PagerDuty incident webhook is received on our end.

You can even embed data from the webhook directly into the incident title or the message body! See below for more information.

Step 5

Once you're done setting up the optional Incident Templates, it's time to link up your PagerDuty services with a few rules that will translate into StatusPage.io actions. This can include degrading components, and opening/updating/resolving incidents. See below for a few sample configurations.

Step 6

Automation Rules will depend heavily on how you use your status page. See below for more notes on common configurations.

Public-facing Status Pages

For public-facing status pages (most of our customers), you may want to err on the conservative side of communication: only ever allow a single PagerDuty-related incident to be open at a time, and only open it after the PagerDuty incident has been acknowledged and confirmed as a real issue.
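
The conservative setup described above can be sketched as a small state machine. This is only an illustration of the idea, not how StatusPage.io implements its Automation Rules. The PublicPagePolicy class, its handle method, and the returned action names are all hypothetical, and the choice to resolve the page incident only once every acknowledged PagerDuty incident has resolved is an assumption.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class PublicPagePolicy:
    """At most one page incident, opened only after an acknowledge."""
    page_incident_open: bool = False
    confirmed: Set[str] = field(default_factory=set)  # acknowledged PagerDuty incident ids

    def handle(self, pd_incident_id: str, state: str) -> str:
        if state == "triggered":
            # Not yet confirmed by a human, so nothing is posted publicly.
            return "wait"
        if state == "acknowledged":
            self.confirmed.add(pd_incident_id)
            if self.page_incident_open:
                # Fold this into the incident that is already open.
                return "coalesce"
            self.page_incident_open = True
            return "create"
        if state == "resolved":
            self.confirmed.discard(pd_incident_id)
            if self.page_incident_open and not self.confirmed:
                self.page_incident_open = False
                return "resolve"
        return "none"
```

With this policy, two PagerDuty incidents that are acknowledged back to back produce one "create" followed by one "coalesce", and the page incident is resolved only when the last of them resolves.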

Private Status Pages

Private status page users may be more interested in getting their PagerDuty stream included in the incident workflow, and want to include updates for actions like escalations, reassignments, acknowledges, etc. Different incident templates can be configured for the different stages in the incident lifecycle, and can be configured to embed data from the webhook directly into the message update body.

Embedding PagerDuty Webhook Information

Starting with this integration, we've upgraded our Incident Templates to support the Mustache templating syntax, and the optional usage of PagerDuty webhook information directly in the incident name and message body.

The image below shows the data we make available to you under the top-level pagerduty key.

PagerDuty Mustache Data

For example, to inject the email of the user assigned to the incident, you would use {{pagerduty.incident.assigned_to_user.email}} in the Incident Template.
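
If it helps to see how a dotted tag like this resolves against nested data, here's a rough sketch in Python. It handles plain variable tags only and is not a full Mustache implementation (real Mustache also HTML-escapes values and supports sections). The render function is hypothetical, and the sample context contains made-up values: only the pagerduty.incident.assigned_to_user.email path comes from this article.

```python
import re


def render(template: str, context: dict) -> str:
    """Replace {{dotted.path}} tags with values looked up in nested dicts."""
    def lookup(match):
        value = context
        for key in match.group(1).split("."):
            if not isinstance(value, dict) or key not in value:
                return ""  # Mustache renders missing variables as empty text
            value = value[key]
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


# Made-up sample data, for illustration only.
context = {
    "pagerduty": {
        "incident": {"assigned_to_user": {"email": "oncall@example.com"}}
    }
}

print(render("{{pagerduty.incident.assigned_to_user.email}} is investigating.", context))
# prints "oncall@example.com is investigating."
```

A tag that points at a key missing from the webhook simply renders as empty text, so it's worth checking a template against the data shown above before using it on a live page.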

Seeing it in action

Below is a full scenario for a public-facing status page: multiple services, multiple incidents, component downgrades, and an incident opening up.

Get the account linked up, and automation settings configured

Configure a few Incident Templates to use

Map each of our PagerDuty services to StatusPage.io

Our pristine status page

Uh oh, EBS troubles again! Things slowing down.

Components degraded, incident created

EBS has completely choked, ThousandEyes says we're down

Notice we didn't create another incident, and the components were further degraded

FAQ

What exactly can be automated using the integration?

Using the integration, you can automate just component status changes, just incident creation/updating, or both component and incident changes. We frequently see customers using the integration to only automate components for public status pages, while incident automation is typically used for alerting a wider group of internal stakeholders or the whole company.

I have a large list of PagerDuty services. Which ones should I use for the integration?

Typically, we see a subset of services used to integrate with StatusPage. As a starting point, you'll want to integrate any services that are marked with high severity levels or that signify customer-facing incidents.

My services are fairly broad. As an example, all of our API and Dashboard alerts get piped into one 'Pingdom' service within PagerDuty. How can I map this to StatusPage?

Since StatusPage lets you map PagerDuty services to StatusPage components, you may need to make your PagerDuty services more granular. As an example, you could split out your 'Pingdom' service into 'Pingdom - API' and 'Pingdom - Dashboard'. This will give you greater control over your StatusPage automation and will also give you better reporting and analytics within PagerDuty.
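
As a rough illustration of why the split matters, the sketch below models the mapping as a plain lookup table in Python. The table names and the components_to_degrade function are hypothetical. In practice the mapping is configured in the StatusPage.io management portal, not in code.

```python
from typing import Dict, List

# One broad service: any alert has to degrade both components.
BROAD: Dict[str, List[str]] = {
    "Pingdom": ["API", "Dashboard"],
}

# Split services: each alert degrades only the component it concerns.
GRANULAR: Dict[str, List[str]] = {
    "Pingdom - API": ["API"],
    "Pingdom - Dashboard": ["Dashboard"],
}


def components_to_degrade(mapping: Dict[str, List[str]], service_name: str) -> List[str]:
    """Return the components affected by an incident on the given service."""
    return mapping.get(service_name, [])
```

With the broad mapping, an API-only alert would degrade the Dashboard component as well. With the granular mapping, it touches only the API component.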

I'd like to use the integration to notify a wider set of employees in my organization. What does pricing look like?

StatusPage has pricing specific to private, internal pages, which can be found here. Get in touch with us at hi@statuspage.io if you'd like to learn more.

How are components associated with incidents using PagerDuty/Incident Templates?

We do not associate components with an incident based on a component change from a PagerDuty rule. Components are only associated with an incident based on the selections made under "alert users subscribed to" in the incident template.