1 - Alerts


Notifiers and alerts are two features that work together to inform you of events in the Rancher system. Notifiers are objects that you configure to leverage popular IT services, which send you notifications of Rancher events. Alerts are rule sets that determine when those notifications are sent.

Notifiers and alerts are built on top of the Prometheus Alertmanager. Leveraging these tools, Rancher can notify cluster owners and project owners of events they need to address.

Notifiers

Before you can receive alerts, you must configure one or more notifiers in Rancher.

Notifiers are services that inform you of alert events. You can configure notifiers to send alert notifications to staff best suited to take corrective action.

Notifiers are configured at the cluster level. This model ensures that only cluster owners need to configure notifiers, leaving project owners to simply configure alerts in the scope of their projects. You don’t need to dispense privileges like SMTP server access or cloud account access.

Rancher integrates with a variety of popular IT services, including:

  • Slack: Send alert notifications to your Slack channels.
  • Email: Choose email recipients for alert notifications.
  • PagerDuty: Route notifications to staff by phone, SMS, or personal email.
  • WebHooks: Update a webpage with alert notifications.

Adding Notifiers

Set up a notifier so that you can begin configuring and sending alerts.

  • From the Global View, open the cluster that you want to add a notifier to.

  • From the main menu, select Tools > Notifiers. Then click Add Notifier.

  • Select the service you want to use as your notifier, and then fill out the form.

Slack

  • Enter a Name for the notifier.
  • From Slack, create a webhook. For instructions, see the Slack Documentation.
  • From Rancher, enter your Slack webhook URL.
  • Enter the name of the channel that you want to send alert notifications to, in the following format: #<channelname>.

Both public and private channels are supported.

  • Click Test. If the test is successful, the Slack channel you’re configuring for the notifier outputs Slack setting validated.
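
To check a Slack webhook URL outside of Rancher, a request like the one below can be posted to it. This is a minimal sketch, not Rancher's own implementation: the function name `build_slack_request` and the example URL are hypothetical, and the `channel` field is only honored by some kinds of Slack webhooks (others always post to the channel chosen when the webhook was created).

```python
import json
import urllib.request


def build_slack_request(webhook_url, channel, text):
    """Build (but do not send) a JSON POST for a Slack incoming webhook."""
    if not channel.startswith("#"):
        raise ValueError("channel must use the #<channelname> format")
    payload = {"channel": channel, "text": text}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical URL; replace it with the webhook URL that Slack generates.
request = build_slack_request(
    "https://hooks.slack.com/services/T000/B000/XXXX",
    "#ops-alerts",
    "Test message",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

Building the request separately from sending it makes it easy to inspect the payload before anything reaches the channel.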

Email

  • Enter a Name for the notifier.
  • In the Sender field, enter an email address on your mail server that the notification is sent from.
  • In the Host field, enter the IP address or host name for your SMTP server. Example: smtp.email.com
  • In the Port field, enter the port used for email. Typically, TLS uses 587 and SSL uses 465. If you’re using TLS, make sure Use TLS is selected.
  • Enter a Username and Password that authenticate with the SMTP server.
  • In the Default Recipient field, enter the email address that you want to receive the notification.
  • Click Test. If the test is successful, Rancher prints settings validated and you receive a test notification email.
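
The port and TLS choice above follows a common SMTP convention, which can be sketched with Python's standard `smtplib`. This illustrates the convention only, not how Rancher connects internally; the helper names `smtp_mode` and `open_smtp` are hypothetical.

```python
import smtplib
import ssl


def smtp_mode(port, use_tls):
    """Return the connection style a port/TLS combination usually implies."""
    if port == 465:
        return "implicit-tls"  # encrypted from the first byte (often labeled SSL)
    if use_tls:
        return "starttls"      # plain connection upgraded to TLS (typical on 587)
    return "plain"


def open_smtp(host, port, use_tls, timeout=10.0):
    """Open a connection to the SMTP server using the matching style."""
    mode = smtp_mode(port, use_tls)
    context = ssl.create_default_context()
    if mode == "implicit-tls":
        return smtplib.SMTP_SSL(host, port, timeout=timeout, context=context)
    server = smtplib.SMTP(host, port, timeout=timeout)
    if mode == "starttls":
        server.starttls(context=context)
    return server


print(smtp_mode(587, True))   # starttls
print(smtp_mode(465, False))  # implicit-tls
```

If the Test button fails, trying `open_smtp` with the same host, port, and TLS setting from a machine on the same network can help separate connectivity problems from credential problems.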

PagerDuty

  • Enter a Name for the notifier.
  • From PagerDuty, create a webhook. For instructions, see the PagerDuty Documentation.
  • From PagerDuty, copy the webhook’s Integration Key.
  • From Rancher, enter the key in the Service Key field.
  • Click Test. If the test is successful, your PagerDuty endpoint outputs PagerDuty setting validated.

WebHook

  • Enter a Name for the notifier.
  • Using the app of your choice, create a webhook URL.
  • Enter your webhook URL.
  • Click Test. If the test is successful, the URL you’re configuring as a notifier outputs Webhook setting validated.
  • Click Add to complete adding the notifier.

Result: Your notifier is added to Rancher.
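
For the WebHook notifier, the receiving application can be as small as the sketch below. It assumes that the request body follows the Prometheus Alertmanager webhook format (a JSON object with an `alerts` list whose entries carry `status` and `labels`). That is an assumption based on Rancher building on Alertmanager, so inspect a real test notification before relying on these field names. `summarize`, `AlertHandler`, and `serve` are hypothetical names.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize(payload):
    """Return one readable line per alert in an Alertmanager-style payload."""
    if not isinstance(payload, dict):
        return []
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        name = labels.get("alertname", "unknown")
        severity = labels.get("severity", "unknown")
        status = alert.get("status", payload.get("status", "unknown"))
        lines.append(f"[{status}] {name} (severity={severity})")
    return lines


class AlertHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_response(400)  # body was not valid JSON
            self.end_headers()
            return
        for line in summarize(payload):
            print(line)
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Listen for webhook notifications until interrupted."""
    HTTPServer(("0.0.0.0", port), AlertHandler).serve_forever()
```

Calling `serve()` starts the listener; the URL of that host and port is then what you would enter as the webhook URL.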

What’s Next?

After creating a notifier, set up alerts to receive notifications of Rancher system events.

Managing Notifiers

After you set up notifiers, you can manage them by selecting Tools > Notifiers from the Global view. You can:

  • Edit the settings that you configured during their initial setup.
  • Clone them, to quickly set up slightly different notifiers.
  • Delete them when they’re no longer necessary.

Alerts

To keep your clusters and applications healthy and driving your organizational productivity forward, you need to stay informed of events occurring in your clusters and projects, both planned and unplanned. To help you stay informed of these events, you can configure alerts.

Alerts are sets of rules, chosen by you, to monitor for specific events. The scope for alerts can be set at either the cluster or project level.

Cluster Alerts vs. Project Alerts

At the cluster level, Rancher monitors components in your Kubernetes cluster, and sends you alerts related to:

  • The state of your nodes.
  • The system services that manage your Kubernetes cluster.
  • The resource events from specific system services.

At the project level, Rancher monitors specific deployments and sends alerts for:

  • Deployment availability
  • Workload status
  • Pod status

Adding Cluster Alerts

As a cluster owner, you can configure Rancher to send you alerts for cluster events.

Prerequisite: Before you can receive cluster alerts, you must add a notifier.

  • From the Global view, open the cluster that you want to configure alerts for.

  • From the main menu, select Tools > Alerts. Then click Add Alert.

  • Enter a Name for the alert that describes its purpose.

  • Based on the type of alert you want to create, complete one of the instruction subsets below.

System Service Alerts

This alert type monitors for events that affect one of the Kubernetes master components, regardless of the node it occurs on.

  • Select the System Services option, and then select an option from the drop-down.

  • Select the urgency level of the alert. The options are:

    • Critical: Most urgent
    • Warning: Normal urgency
    • Info: Least urgent

Select the urgency level based on the importance of the service and how many nodes fill the role within your cluster. For example, if you’re making an alert for the etcd service, select Critical. If you’re making an alert for redundant schedulers, Warning is more appropriate.

Resource Event Alerts

This alert type monitors for specific events that are thrown from a resource type.

  • Choose the type of resource event that triggers an alert. The options are:

    • Normal: triggers an alert when any standard resource event occurs.
    • Warning: triggers an alert when unexpected resource events occur.
  • From the Choose a Resource drop-down, select the resource type that you want to trigger an alert.

  • Select the urgency level of the alert.

    • Critical: Most urgent
    • Warning: Normal urgency
    • Info: Least urgent

Select the urgency level of the alert by considering factors such as how often the event occurs or its importance. For example:

    • If you set a normal alert for pods, you’re likely to receive alerts often, and individual pods usually self-heal, so select an urgency of Info.

    • If you set a warning alert for StatefulSets, it’s very likely to impact operations, so select an urgency of Critical.

Node Alerts

This alert type monitors for events that occur on a specific node.

  • Select the Node option, and then make a selection from the Choose a Node drop-down.

  • Choose an event to trigger the alert.

    • Not Ready: Sends you an alert when the node is unresponsive.
    • CPU usage over: Sends you an alert when the node rises above an entered percentage of its processing allocation.
    • Mem usage over: Sends you an alert when the node rises above an entered percentage of its memory allocation.
  • Select the urgency level of the alert.

    • Critical: Most urgent
    • Warning: Normal urgency
    • Info: Least urgent

Select the urgency level of the alert based on its impact on operations. For example, an alert triggered when a node’s CPU rises above 60% warrants an urgency of Info, but a node that is Not Ready warrants an urgency of Critical.

Node Selector Alerts

This alert type monitors for events that occur on any node marked with a label. For more information, see the Kubernetes documentation for Labels.

  • Select the Node Selector option, and then click Add Selector to enter a key value pair for a label. This label should be applied to one or more of your nodes. Add as many selectors as you’d like.

  • Choose an event to trigger the alert.

    • Not Ready: Sends you an alert when selected nodes are unresponsive.
    • CPU usage over: Sends you an alert when selected nodes rise above an entered percentage of processing allocation.
    • Mem usage over: Sends you an alert when selected nodes rise above an entered percentage of memory allocation.
  • Select the urgency level of the alert.

    • Critical: Most urgent
    • Warning: Normal urgency
    • Info: Least urgent

Select the urgency level of the alert based on its impact on operations. For example, an alert triggered when a node’s CPU rises above 60% warrants an urgency of Info, but a node that is Not Ready warrants an urgency of Critical.

  • Finally, choose the notifiers that send you alerts.

    • You can set up multiple notifiers.
    • You can change notifier recipients on the fly.

Result: Your alert is configured. A notification is sent when the alert is triggered.
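
For Node Selector alerts, the way key value pairs pick out nodes can be pictured with the sketch below. It assumes that multiple selectors combine with AND, as Kubernetes equality-based label selectors do. The source does not state how Rancher combines them, and the node names and labels here are hypothetical.

```python
def matches_selector(node_labels, selector):
    """True when every key/value pair in the selector is present on the node."""
    return all(node_labels.get(key) == value for key, value in selector.items())


def select_nodes(nodes, selector):
    """Return the names of nodes whose labels satisfy the selector."""
    return [name for name, labels in nodes.items() if matches_selector(labels, selector)]


# Hypothetical node names and labels, for illustration only.
nodes = {
    "node-a": {"role": "worker", "zone": "east"},
    "node-b": {"role": "worker", "zone": "west"},
    "node-c": {"role": "etcd", "zone": "east"},
}
print(select_nodes(nodes, {"role": "worker", "zone": "east"}))  # ['node-a']
```

Under this reading, each selector you add narrows the set of monitored nodes, so a selector that matches no node yields an alert that never triggers.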

Managing Cluster Alerts

After you set up cluster alerts, you can manage each alert object. To manage alerts, browse to the cluster containing the alerts that you want to manage, and then select Tools > Alerts. You can:

  • Deactivate/Reactivate alerts
  • Edit alert settings
  • Delete unnecessary alerts

Adding Project Alerts

Prerequisite: Before you can receive project alerts, you must add a notifier.

  • From the Global view, open the project that you want to configure alerts for.

  • From the main menu, select Resources > Alerts. Then click Add Alert.

  • Enter a Name for the alert that describes its purpose.

  • Based on the type of alert you want to create, complete one of the instruction subsets below.

Pod Alerts

This alert type monitors for the status of a specific pod.

  • Select the Pod option, and then select a pod from the drop-down.
  • Select a pod status that triggers an alert:

    • Not Running
    • Not Scheduled
    • Restarted <x> times within the last <x> Minutes
  • Select the urgency level of the alert. The options are:

    • Critical: Most urgent
    • Warning: Normal urgency
    • Info: Least urgent

Select the urgency level of the alert based on pod state and expendability. For example, a stateless pod that’s not running can be easily replaced, so select Info. However, if an important pod isn’t scheduled, it may affect operations, so choose Critical.
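
The restart condition for pod alerts amounts to counting restarts inside a sliding time window. The sketch below shows that idea only. The function names are hypothetical, and whether Rancher compares with "at least" or "more than" the entered count is not stated in the source, so the `>=` here is an assumption.

```python
from datetime import datetime, timedelta


def restarts_in_window(restart_times, now, window_minutes):
    """Count restarts that happened within the last `window_minutes`."""
    cutoff = now - timedelta(minutes=window_minutes)
    return sum(1 for moment in restart_times if cutoff <= moment <= now)


def should_alert(restart_times, now, max_restarts, window_minutes):
    """Assumed rule: alert once the count reaches the configured number."""
    return restarts_in_window(restart_times, now, window_minutes) >= max_restarts


# Hypothetical restart history for one pod.
now = datetime(2024, 1, 1, 12, 0)
history = [
    datetime(2024, 1, 1, 10, 0),
    datetime(2024, 1, 1, 11, 57),
    datetime(2024, 1, 1, 11, 59),
]
print(should_alert(history, now, max_restarts=2, window_minutes=5))  # True
```

A short window with a low count catches crash loops quickly, while a long window with a higher count catches slow, recurring failures.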

Workload Alerts

This alert type monitors for the availability of a workload.

  • Choose the Workload option. Then choose a workload from the drop-down.

  • Choose an availability percentage using the slider. The alert is triggered when the workload’s availability on your cluster nodes drops below the set percentage.

  • Select the urgency level of the alert.

    • Critical: Most urgent
    • Warning: Normal urgency
    • Info: Least urgent

Select the urgency level of the alert based on the percentage you choose and the importance of the workload.
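
The availability threshold for workload alerts can be read as a simple ratio check. The sketch below assumes availability means available replicas divided by desired replicas. The source does not define the calculation, and the function names are hypothetical.

```python
def availability_percent(available, desired):
    """Share of desired replicas that are currently available, as a percentage."""
    if desired <= 0:
        return 100.0  # nothing is expected to run, so nothing is missing
    return 100.0 * available / desired


def should_alert(available, desired, threshold_percent):
    """Alert when availability drops below the percentage set on the slider."""
    return availability_percent(available, desired) < threshold_percent


print(should_alert(available=2, desired=4, threshold_percent=70))  # True
print(should_alert(available=3, desired=4, threshold_percent=70))  # False
```

With few replicas, each lost pod moves the percentage a long way (one of four is 25 points), which is worth keeping in mind when choosing the threshold and the urgency.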

Workload Selector Alerts

This alert type monitors for the availability of all workloads marked with labels that you’ve specified.

  • Select the Workload Selector option, and then click Add Selector to enter the key value pair for a label. If one of the workloads drops below your specifications, an alert is triggered. This label should be applied to one or more of your workloads.

  • Select the urgency level of the alert.

    • Critical: Most urgent
    • Warning: Normal urgency
    • Info: Least urgent

Select the urgency level of the alert based on the percentage you choose and the importance of the workload.

  • Finally, choose the notifiers that send you alerts.

    • You can set up multiple notifiers.
    • You can change notifier recipients on the fly.

Result: Your alert is configured. A notification is sent when the alert is triggered.

Managing Project Alerts

To manage project alerts, browse to the project whose alerts you want to manage. Then select Resources > Alerts. You can:

  • Deactivate/Reactivate alerts
  • Edit alert settings
  • Delete unnecessary alerts