Alertmanager

The Alertmanager handles alerts sent by client applications such as the Prometheus server. It takes care of deduplicating, grouping, and routing them to the correct receiver integration such as email, PagerDuty, or OpsGenie. It also takes care of silencing and inhibition of alerts.

The following describes the core concepts the Alertmanager implements. Consult the configuration documentation to learn how to use them in more detail.

Grouping

Grouping categorizes alerts of similar nature into a single notification. This is especially useful during larger outages when many systems fail at once and hundreds to thousands of alerts may be firing simultaneously.

Example: Dozens or hundreds of instances of a service are running in your cluster when a network partition occurs. Half of your service instances can no longer reach the database. Alerting rules in Prometheus were configured to send an alert for each service instance if it cannot communicate with the database. As a result, hundreds of alerts are sent to Alertmanager.

As a user, you only want to get a single page while still being able to see exactly which service instances were affected. You can therefore configure Alertmanager to group alerts by their cluster and alertname so that it sends a single compact notification.

Grouping of alerts, timing for the grouped notifications, and the receivers of those notifications are configured by a routing tree in the configuration file.
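The grouping idea from the example above can be sketched in a few lines of Python. This is only an illustrative model, not Alertmanager's implementation: the function name group_alerts, the alert dictionaries, and the label values (DatabaseUnreachable, eu-1, us-1, app-N) are all hypothetical, and treating a missing label as an empty string is an assumption of the sketch.

```python
from collections import defaultdict


def group_alerts(alerts, group_by):
    """Bucket alerts by the values of the labels named in group_by."""
    groups = defaultdict(list)
    for alert in alerts:
        # Assumption of this sketch: a missing label counts as an empty string.
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical alerts: three instances in one cluster cannot reach the
# database, plus one instance in a different cluster.
alerts = [
    {"labels": {"alertname": "DatabaseUnreachable",
                "cluster": "eu-1",
                "instance": f"app-{i}"}}
    for i in range(3)
]
alerts.append(
    {"labels": {"alertname": "DatabaseUnreachable",
                "cluster": "us-1",
                "instance": "app-9"}}
)

groups = group_alerts(alerts, group_by=["cluster", "alertname"])

# One notification per group, each still listing the affected instances.
for key, members in groups.items():
    instances = [a["labels"]["instance"] for a in members]
    print(key, instances)
```

Four alerts collapse into two groups here, one per cluster, while each group keeps the full list of affected instances.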

Inhibition

Inhibition suppresses notifications for certain alerts if certain other alerts are already firing.

Example: An alert is firing that reports that an entire cluster is not reachable. Alertmanager can be configured to mute all other alerts concerning this cluster if that particular alert is firing. This prevents notifications for hundreds or thousands of firing alerts that are unrelated to the actual issue.

Inhibitions are configured through the Alertmanager's configuration file.
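As a rough illustration of the inhibition logic described above, the following Python sketch mutes a target alert when a matching source alert is firing for the same cluster. It is a simplified model under stated assumptions, not Alertmanager's code: the rule layout (source, target, equal), the helper names, and all label values are hypothetical, and only equality matching is modeled.

```python
def matches(labels, matchers):
    """True if every matcher label has exactly the given value."""
    return all(labels.get(name) == value for name, value in matchers.items())


def is_inhibited(alert, firing, rule):
    """True if some other firing alert inhibits `alert` under `rule`."""
    if not matches(alert["labels"], rule["target"]):
        return False
    for source in firing:
        if source is alert:
            continue  # an alert does not inhibit itself in this sketch
        if not matches(source["labels"], rule["source"]):
            continue
        # Source and target must agree on every label listed in "equal".
        if all(source["labels"].get(name) == alert["labels"].get(name)
               for name in rule["equal"]):
            return True
    return False


# Hypothetical rule: a cluster-wide outage mutes warnings in that cluster.
rule = {
    "source": {"alertname": "ClusterUnreachable"},
    "target": {"severity": "warning"},
    "equal": ["cluster"],
}

cluster_down = {"labels": {"alertname": "ClusterUnreachable",
                           "cluster": "eu-1", "severity": "critical"}}
instance_down_eu = {"labels": {"alertname": "InstanceDown",
                               "cluster": "eu-1", "severity": "warning"}}
instance_down_us = {"labels": {"alertname": "InstanceDown",
                               "cluster": "us-1", "severity": "warning"}}

firing = [cluster_down, instance_down_eu, instance_down_us]
to_notify = [a for a in firing if not is_inhibited(a, firing, rule)]
print([a["labels"]["alertname"] + "@" + a["labels"]["cluster"] for a in to_notify])
```

Only the warning in the unreachable cluster is suppressed; the cluster-level alert and the warning from the other cluster still produce notifications.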

Silences

Silences are a straightforward way to mute alerts for a given time. A silence is configured based on matchers, just like the routing tree. Incoming alerts are checked against all the equality or regular expression matchers of each active silence. If an alert matches all of them, no notifications will be sent out for that alert.

Silences are configured in the web interface of the Alertmanager.
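The matching rule for silences can be sketched as follows. This is a minimal model and not Alertmanager's implementation: the dictionary layout for silences and matchers, the helper names, and the example labels are hypothetical, and the sketch assumes that a regular expression matcher must match the whole label value.

```python
import re
from datetime import datetime, timedelta, timezone


def matcher_matches(matcher, labels):
    """Check one equality or regular expression matcher against the labels."""
    value = labels.get(matcher["name"], "")
    if matcher["is_regex"]:
        # Assumption: the pattern has to match the entire label value.
        return re.fullmatch(matcher["value"], value) is not None
    return value == matcher["value"]


def is_silenced(labels, silences, now):
    """True if any silence active at `now` matches all of its matchers."""
    for silence in silences:
        active = silence["starts_at"] <= now < silence["ends_at"]
        if active and all(matcher_matches(m, labels) for m in silence["matchers"]):
            return True
    return False


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# Hypothetical silence: mute InstanceDown for app-* instances for two hours.
silences = [{
    "starts_at": now - timedelta(hours=1),
    "ends_at": now + timedelta(hours=1),
    "matchers": [
        {"name": "alertname", "value": "InstanceDown", "is_regex": False},
        {"name": "instance", "value": "app-[0-9]+", "is_regex": True},
    ],
}]

app_alert = {"alertname": "InstanceDown", "instance": "app-3"}
db_alert = {"alertname": "InstanceDown", "instance": "db-1"}

print(is_silenced(app_alert, silences, now))
print(is_silenced(db_alert, silences, now))
print(is_silenced(app_alert, silences, now + timedelta(hours=2)))
```

The first alert matches every matcher of the active silence and is muted; the second fails the regular expression matcher; the third check falls outside the silence's time window.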

Client behavior

The Alertmanager has special requirements for the behavior of its clients. Those are only relevant for advanced use cases where Prometheus is not used to send alerts.

High Availability

Alertmanager supports configuration to create a cluster for high availability. This can be configured using the --cluster.* flags (for example, --cluster.listen-address and --cluster.peer).

It's important not to load balance traffic between Prometheus and its Alertmanagers. Instead, point Prometheus to a list of all Alertmanagers.