The Alertmanager Config Secret contains the configuration of an Alertmanager instance that sends out notifications based on alerts it receives from Prometheus.

Overview

By default, Rancher Monitoring deploys a single Alertmanager onto a cluster, which uses a default Alertmanager Config Secret. As part of the chart deployment options, you can increase the number of Alertmanager replicas deployed onto your cluster. All replicas are managed using the same underlying Alertmanager Config Secret.

This Secret should be updated or modified any time you want to:

  • Add in new notifiers or receivers
  • Change the alerts that should be sent to specific notifiers or receivers
  • Change the group of alerts that are sent out

You can either supply an existing Alertmanager Config Secret (i.e. any Secret in the cattle-monitoring-system namespace) or allow Rancher Monitoring to deploy a default Alertmanager Config Secret onto your cluster. By default, the Alertmanager Config Secret created by Rancher is never modified or deleted when the rancher-monitoring chart is upgraded or uninstalled, so that users do not lose or overwrite their alerting configuration when executing operations on the chart.
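As a rough illustration of what a user-supplied Secret might look like, the sketch below defines a Secret in the cattle-monitoring-system namespace that carries a minimal configuration under the alertmanager.yaml key. The Secret name my-alertmanager-config and the receiver name are hypothetical placeholders, not chart defaults.

```yaml
# Minimal sketch of an Alertmanager Config Secret.
# The metadata.name and the receiver name are hypothetical.
apiVersion: v1
kind: Secret
metadata:
  name: my-alertmanager-config
  namespace: cattle-monitoring-system
type: Opaque
stringData:
  # The Alertmanager configuration lives under the alertmanager.yaml key.
  alertmanager.yaml: |
    route:
      receiver: 'null'
    receivers:
    # A receiver with no integrations discards the alerts routed to it.
    - name: 'null'
```

Using stringData lets you write the configuration as plain text; Kubernetes stores it base64-encoded under data.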

For more information on the fields that can be specified in this Secret, including the full spec of the Alertmanager configuration file, refer to the Prometheus Alertmanager documentation.

For more information, refer to the official Prometheus documentation about configuring routes.

Connecting Routes and PrometheusRules

When you define a Rule (which is declared within a RuleGroup in a PrometheusRule resource), the spec of the Rule itself contains labels that are used by Prometheus to figure out which Route should receive this Alert. For example, an Alert with the label team: front-end will be sent to all Routes that match on that label.
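The relationship described above can be sketched as follows. The first YAML document is a PrometheusRule whose Rule carries the label team: front-end; the second is a fragment of alertmanager.yaml with a Route that matches on that label. All resource names, the alert expression, and the receiver names are hypothetical and only serve to show where the label appears on each side.

```yaml
# Sketch of a PrometheusRule; names, expression, and threshold are hypothetical.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: front-end-rules
  namespace: cattle-monitoring-system
spec:
  groups:
  - name: front-end.rules          # the RuleGroup
    rules:
    - alert: FrontEndHighLatency   # the Rule
      expr: http_request_duration_seconds{job="front-end"} > 0.5
      for: 10m
      labels:
        team: front-end            # label that the Route matches on
---
# Fragment of alertmanager.yaml (a separate file, shown here for comparison).
route:
  receiver: default-receiver
  routes:
  - receiver: front-end-receiver
    match:
      team: front-end              # same label as on the Rule above
receivers:
- name: default-receiver
- name: front-end-receiver
```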

Creating Receivers in the Rancher UI

Available as of v2.5.4

Prerequisites:

  • The monitoring application needs to be installed.
  • If you configured monitoring with an existing Alertmanager Secret, it must have a format that is supported by Rancher’s UI. Otherwise you will only be able to make changes based on modifying the Alertmanager Secret directly. Note: We are continuing to make enhancements to what kinds of Alertmanager Configurations we can support using the Routes and Receivers UI, so please file an issue if you have a request for a feature enhancement.

To create notification receivers in the Rancher UI:

  1. Click Cluster Explorer > Monitoring and click Receiver.
  2. Enter a name for the receiver.
  3. Configure one or more providers for the receiver. For help filling out the forms, refer to the configuration options below.
  4. Click Create.

Result: Alerts can be configured to send notifications to the receiver(s).

Receiver Configuration

The notification integrations are configured with the receiver, which is explained in the Prometheus documentation.

Native vs. Non-native Receivers

By default, Alertmanager provides native integration with some receivers, which are listed in the Alertmanager receiver documentation. All natively supported receivers are configurable through the Rancher UI.

For notification mechanisms not natively supported by Alertmanager, integration is achieved using the webhook receiver. A list of third-party drivers providing such integrations is maintained in the Prometheus documentation. Access to these drivers, and their associated integrations, is provided through the Alerting Drivers app. Once the app is enabled, non-native receivers can also be configured through the Rancher UI.

Currently the Rancher Alerting Drivers app provides access to the following integrations:

  • Microsoft Teams, based on the prom2teams driver
  • SMS, based on the Sachet driver

Changes in Rancher v2.5.8

Rancher v2.5.8 added Microsoft Teams and SMS as configurable receivers in the Rancher UI.

Changes in Rancher v2.5.4

Rancher v2.5.4 introduced the capability to configure receivers by filling out forms in the Rancher UI.

The following types of receivers can be configured in the Rancher UI:

  • Slack
  • Email
  • PagerDuty
  • Opsgenie
  • Webhook
  • Custom
  • Teams
  • SMS

The custom receiver option can be used to configure any receiver in YAML that cannot be configured by filling out the other forms in the Rancher UI.

Slack

| Field | Type | Description |
|---|---|---|
| URL | String | Enter your Slack webhook URL. For instructions to create a Slack webhook, see the Slack documentation. |
| Default Channel | String | Enter the name of the channel that you want to send alert notifications in the following format: #<channelname>. |
| Proxy URL | String | Proxy for the webhook notifications. |
| Enable Send Resolved Alerts | Bool | Whether to send a follow-up notification if an alert has been resolved (e.g. [Resolved] High CPU Usage). |
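For orientation, the form fields above roughly correspond to the following keys of an Alertmanager slack_configs entry. This is a sketch under the assumption that the UI maps each field to the like-named Alertmanager option; the receiver name, webhook URL, channel, and proxy address are placeholders.

```yaml
receivers:
- name: slack-example                # hypothetical receiver name
  slack_configs:
  - api_url: https://hooks.slack.com/services/REPLACE/WITH/YOURS  # URL
    channel: '#alerts'               # Default Channel
    send_resolved: true              # Enable Send Resolved Alerts
    http_config:
      proxy_url: http://proxy.example.com:3128                    # Proxy URL
```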

Email

| Field | Type | Description |
|---|---|---|
| Default Recipient Address | String | The email address that will receive notifications. |
| Enable Send Resolved Alerts | Bool | Whether to send a follow-up notification if an alert has been resolved (e.g. [Resolved] High CPU Usage). |

SMTP options:

| Field | Type | Description |
|---|---|---|
| Sender | String | Enter an email address available on your SMTP mail server that you want to send the notification from. |
| Host | String | Enter the IP address or hostname for your SMTP server. Example: smtp.email.com. |
| Use TLS | Bool | Use TLS for encryption. |
| Username | String | Enter a username to authenticate with the SMTP server. |
| Password | String | Enter a password to authenticate with the SMTP server. |
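A comparable receiver written directly in YAML might look like the sketch below. It assumes the form fields map to the Alertmanager email_configs options named in the comments; all addresses, the port, and the credentials are placeholders.

```yaml
receivers:
- name: email-example                # hypothetical receiver name
  email_configs:
  - to: oncall@example.com           # Default Recipient Address
    send_resolved: true              # Enable Send Resolved Alerts
    from: alertmanager@example.com   # Sender
    smarthost: smtp.email.com:587    # Host (Alertmanager expects host:port)
    require_tls: true                # Use TLS
    auth_username: alertmanager      # Username
    auth_password: REPLACE_ME        # Password
```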

PagerDuty

| Field | Type | Description |
|---|---|---|
| Integration Type | String | Events API v2 or Prometheus. |
| Default Integration Key | String | For instructions to get an integration key, see the PagerDuty documentation. |
| Proxy URL | String | Proxy for the PagerDuty notifications. |
| Enable Send Resolved Alerts | Bool | Whether to send a follow-up notification if an alert has been resolved (e.g. [Resolved] High CPU Usage). |
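In Alertmanager's own configuration, the integration type determines which key field is used: routing_key for Events API v2, service_key for the Prometheus integration type. The sketch below assumes the form fields map onto those options; the receiver name, key value, and proxy address are placeholders.

```yaml
receivers:
- name: pagerduty-example            # hypothetical receiver name
  pagerduty_configs:
  - routing_key: REPLACE_WITH_INTEGRATION_KEY    # Integration Type: Events API v2
    # service_key: REPLACE_WITH_INTEGRATION_KEY  # Integration Type: Prometheus
    send_resolved: true              # Enable Send Resolved Alerts
    http_config:
      proxy_url: http://proxy.example.com:3128   # Proxy URL
```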

Opsgenie

| Field | Description |
|---|---|
| API Key | For instructions to get an API key, refer to the Opsgenie documentation. |
| Proxy URL | Proxy for the Opsgenie notifications. |
| Enable Send Resolved Alerts | Whether to send a follow-up notification if an alert has been resolved (e.g. [Resolved] High CPU Usage). |

Opsgenie Responders:

| Field | Type | Description |
|---|---|---|
| Type | String | Schedule, Team, User, or Escalation. For more information on alert responders, refer to the Opsgenie documentation. |
| Send To | String | Id, Name, or Username of the Opsgenie recipient. |
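As a sketch of how these fields could appear in YAML, the fragment below uses Alertmanager's opsgenie_configs options, where each responder is identified by exactly one of id, name, or username. The receiver name, API key, team name, and proxy address are placeholders, and the field-to-option mapping is an assumption.

```yaml
receivers:
- name: opsgenie-example             # hypothetical receiver name
  opsgenie_configs:
  - api_key: REPLACE_WITH_API_KEY    # API Key
    send_resolved: true              # Enable Send Resolved Alerts
    http_config:
      proxy_url: http://proxy.example.com:3128   # Proxy URL
    responders:
    - type: team                     # Type: schedule, team, user, or escalation
      name: platform-team            # Send To (id, name, or username)
```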

Webhook

| Field | Description |
|---|---|
| URL | Webhook URL for the app of your choice. |
| Proxy URL | Proxy for the webhook notification. |
| Enable Send Resolved Alerts | Whether to send a follow-up notification if an alert has been resolved (e.g. [Resolved] High CPU Usage). |
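Expressed as YAML, a webhook receiver with these settings might resemble the following sketch. It assumes the form fields map to Alertmanager's webhook_configs options; the receiver name and both URLs are placeholders.

```yaml
receivers:
- name: webhook-example              # hypothetical receiver name
  webhook_configs:
  - url: http://alert-handler.example.com/hook   # URL
    send_resolved: true              # Enable Send Resolved Alerts
    http_config:
      proxy_url: http://proxy.example.com:3128   # Proxy URL
```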

Custom

The YAML provided here will be directly appended to your receiver within the Alertmanager Config Secret.

Teams

Enabling the Teams Receiver for Rancher Managed Clusters

The Teams receiver is not a native receiver and must be enabled before it can be used. You can enable the Teams receiver for a Rancher managed cluster by going to the Apps page and installing the rancher-alerting-drivers app with the Teams option selected.

  1. In the Rancher UI, go to the cluster where you want to install rancher-alerting-drivers and click Cluster Explorer.
  2. Click Apps.
  3. Click the Alerting Drivers app.
  4. Click the Helm Deploy Options tab.
  5. Select the Teams option and click Install.
  6. Take note of the namespace used as it will be required in a later step.

Configure the Teams Receiver

The Teams receiver can be configured by updating its ConfigMap. For example, the following is a minimal Teams receiver configuration.

    [Microsoft Teams]
    teams-instance-1: https://your-teams-webhook-url

When configuration is complete, add the receiver using the steps in Creating Receivers in the Rancher UI.

Use the example below as the URL, where ns-1 is replaced with the namespace where the rancher-alerting-drivers app is installed:

    url: http://rancher-alerting-drivers-prom2teams.ns-1.svc:8089/v2/teams-instance-1

SMS

Enabling the SMS Receiver for Rancher Managed Clusters

The SMS receiver is not a native receiver and must be enabled before it can be used. You can enable the SMS receiver for a Rancher managed cluster by going to the Apps page and installing the rancher-alerting-drivers app with the SMS option selected.

  1. In the Rancher UI, go to the cluster where you want to install rancher-alerting-drivers and click Cluster Explorer.
  2. Click Apps.
  3. Click the Alerting Drivers app.
  4. Click the Helm Deploy Options tab.
  5. Select the SMS option and click Install.
  6. Take note of the namespace used as it will be required in a later step.

Configure the SMS Receiver

The SMS receiver can be configured by updating its ConfigMap. For example, the following is a minimal SMS receiver configuration.

    providers:
      telegram:
        token: 'your-token-from-telegram'

    receivers:
      - name: 'telegram-receiver-1'
        provider: 'telegram'
        to:
          - '123456789'

When configuration is complete, add the receiver using the steps in Creating Receivers in the Rancher UI.

Use the example below as the name and URL, where:

  • the name assigned to the receiver, e.g. telegram-receiver-1, must match the name in the receivers.name field in the ConfigMap
  • ns-1 in the URL is replaced with the namespace where the rancher-alerting-drivers app is installed

    name: telegram-receiver-1
    url: http://rancher-alerting-drivers-sachet.ns-1.svc:9876/alert

In Rancher versions before v2.5.4, the Alertmanager must be configured in YAML, as shown in the Example Alertmanager Configs section.

Route Configuration

Receiver

The route needs to refer to a receiver that has already been configured.

Grouping

| Field | Default | Description |
|---|---|---|
| Group By | N/A | The labels by which incoming alerts are grouped together, in the format group_by: [ <labelname>, ... ]. For example, multiple alerts coming in for labels such as cluster=A and alertname=LatencyHigh can be batched into a single group. To aggregate by all possible labels, use the special value '...' as the sole label name, for example: group_by: ['...']. Grouping by '...' effectively disables aggregation entirely, passing through all alerts as-is. This is unlikely to be what you want, unless you have a very low alert volume or your upstream notification system performs its own grouping. |
| Group Wait | 30s | How long to wait to buffer alerts of the same group before sending the initial notification. |
| Group Interval | 5m | How long to wait before sending an alert that has been added to a group of alerts for which an initial notification has already been sent. |
| Repeat Interval | 4h | How long to wait before re-sending a given alert that has already been sent. |
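To make the grouping fields concrete, the sketch below shows a route that sets each of them explicitly to the defaults listed above and groups by the cluster and alertname labels from the example. The receiver name is hypothetical.

```yaml
route:
  receiver: default-receiver           # hypothetical; must exist under receivers
  group_by: ['cluster', 'alertname']   # Group By
  group_wait: 30s                      # Group Wait
  group_interval: 5m                   # Group Interval
  repeat_interval: 4h                  # Repeat Interval
receivers:
- name: default-receiver
```

With these settings, alerts that share the same cluster and alertname values are batched into one notification rather than being sent individually.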

Matching

The Match field refers to a set of equality matchers used to identify which alerts to send to a given Route based on labels defined on that alert. When you add key-value pairs to the Rancher UI, they correspond to the YAML in this format:

    match:
      [ <labelname>: <labelvalue>, ... ]

The Match Regex field refers to a set of regex-matchers used to identify which alerts to send to a given Route based on labels defined on that alert. When you add key-value pairs in the Rancher UI, they correspond to the YAML in this format:

    match_re:
      [ <labelname>: <regex>, ... ]
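As an illustration of both matcher types together, the following sketch defines a child route that only applies to alerts whose team label equals front-end and whose severity label matches a regular expression. The label names, values, and receiver names are hypothetical.

```yaml
route:
  receiver: default-receiver           # fallback for alerts no child route matches
  routes:
  - receiver: front-end-receiver
    match:
      team: front-end                  # equality matcher
    match_re:
      severity: warning|critical       # regex matcher
receivers:
- name: default-receiver
- name: front-end-receiver
```

An alert must satisfy every matcher on a route to be sent to it, so an alert labeled team: front-end with severity: info would fall through to default-receiver in this sketch.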

Example Alertmanager Configs

Slack

To set up notifications via Slack, the following Alertmanager Config YAML can be placed into the alertmanager.yaml key of the Alertmanager Config Secret, where the api_url should be updated to use your Webhook URL from Slack:

    route:
      group_by: ['job']
      group_wait: 30s
      group_interval: 5m
      repeat_interval: 3h
      receiver: 'slack-notifications'
    receivers:
    - name: 'slack-notifications'
      slack_configs:
      - send_resolved: true
        text: '{{ template "slack.rancher.text" . }}'
        api_url: <user-provided slack webhook url here>
    templates:
    - /etc/alertmanager/config/*.tmpl

PagerDuty

To set up notifications via PagerDuty, use the example below from the PagerDuty documentation as a guideline. This example sets up a route that captures alerts for a database service and sends them to a receiver linked to a service that will directly notify the DBAs in PagerDuty, while all other alerts will be directed to a default receiver with a different PagerDuty integration key.

The following Alertmanager Config YAML can be placed into the alertmanager.yaml key of the Alertmanager Config Secret. The service_key should be updated to use your PagerDuty integration key and can be found as per the “Integrating with Global Event Routing” section of the PagerDuty documentation. For the full list of configuration options, refer to the Prometheus documentation.

    route:
      group_by: [cluster]
      receiver: 'pagerduty-notifications'
      group_interval: 5m
      routes:
      - match:
          service: database
        receiver: 'database-notifications'

    receivers:
    - name: 'pagerduty-notifications'
      pagerduty_configs:
      - service_key: 'primary-integration-key'

    - name: 'database-notifications'
      pagerduty_configs:
      - service_key: 'database-integration-key'

Example Route Config for CIS Scan Alerts

While configuring the routes for rancher-cis-benchmark alerts, you can specify the matching using the key-value pair job: rancher-cis-scan.

For example, the following route configuration could be used with a Slack receiver named test-cis:

    spec:
      receiver: test-cis
      group_by:
    #    - string
      group_wait: 30s
      group_interval: 30s
      repeat_interval: 30s
      match:
        job: rancher-cis-scan
    #    key: string
      match_re:
        {}
    #    key: string

For more information on enabling alerting for rancher-cis-benchmark, see this section.