Overview

What is Prometheus?

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open source project and maintained independently of any company. To emphasize this, and to clarify the project's governance structure, Prometheus joined the Cloud Native Computing Foundation in 2016 as the second hosted project, after Kubernetes.

For more elaborate overviews of Prometheus, see the resources linked from the media section.

Features

Prometheus's main features are:

  • a multi-dimensional data model with time series data identified by metric name and key/value pairs
  • PromQL, a flexible query language to leverage this dimensionality
  • no reliance on distributed storage; single server nodes are autonomous
  • time series collection happens via a pull model over HTTP
  • pushing time series is supported via an intermediary gateway
  • targets are discovered via service discovery or static configuration
  • multiple modes of graphing and dashboarding support
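The first feature above, series identified by a metric name plus key/value pairs, can be illustrated with a small sketch. This is not how the Prometheus server stores data internally; the `series_key` and `render` helpers, the `store` dictionary, and the sample metric and label names are all hypothetical and only show that two label sets with the same pairs name the same series, while changing any label value names a different one.

```python
from typing import Dict, List, Tuple

# A series identity: metric name plus a sorted tuple of (label, value) pairs.
SeriesKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def series_key(name: str, labels: Dict[str, str]) -> SeriesKey:
    """Build a hashable identity; the order in which labels are given must not matter."""
    return (name, tuple(sorted(labels.items())))


def render(key: SeriesKey) -> str:
    """Render the identity in name{label="value"} notation (no escaping, for brevity)."""
    name, labels = key
    if not labels:
        return name
    body = ",".join(f'{k}="{v}"' for k, v in labels)
    return f"{name}{{{body}}}"


# Hypothetical store: each series maps to a list of (timestamp, value) samples.
store: Dict[SeriesKey, List[Tuple[float, float]]] = {}

a = series_key("http_requests_total", {"method": "POST", "handler": "/api"})
b = series_key("http_requests_total", {"handler": "/api", "method": "POST"})
c = series_key("http_requests_total", {"method": "GET", "handler": "/api"})

store.setdefault(a, []).append((1700000000.0, 3.0))
store.setdefault(b, []).append((1700000015.0, 5.0))  # same series as `a`
store.setdefault(c, []).append((1700000000.0, 9.0))  # different label value, new series

print(render(a))   # http_requests_total{handler="/api",method="POST"}
print(len(store))  # 2
```

Here `a` and `b` land in the same series because only the set of label pairs matters, not the order in which they were supplied.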

Components

The Prometheus ecosystem consists of multiple components, many of which are optional:

  • the main Prometheus server which scrapes and stores time series data
  • client libraries for instrumenting application code
  • a push gateway for supporting short-lived jobs
  • special-purpose exporters for services like HAProxy, StatsD, Graphite, etc.
  • an alertmanager to handle alerts
  • various support tools

Most Prometheus components are written in Go, making them easy to build and deploy as static binaries.

Architecture

This diagram illustrates the architecture of Prometheus and some of its ecosystem components:

[Figure: Prometheus architecture]

Prometheus scrapes metrics from instrumented jobs, either directly or via an intermediary push gateway for short-lived jobs. It stores all scraped samples locally and runs rules over this data to either aggregate and record new time series from existing data or generate alerts. Grafana or other API consumers can be used to visualize the collected data.
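The scrape loop described above can be imitated with the Python standard library alone. The following is a toy sketch, not the Prometheus server or an official client library: a stand-in "instrumented job" serves one counter at `/metrics` in a simple `name value` text line, and a stand-in scraper pulls it over HTTP and appends timestamped samples to a local list. The metric name `demo_requests_total`, the `STATE` dictionary, and the `scrape` and `demo` functions are all assumptions made for illustration.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

STATE = {"requests": 0}  # hypothetical counter kept by the instrumented job


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = f"demo_requests_total {STATE['requests']}\n".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


# Ignore any proxy settings so the local demo always talks to 127.0.0.1 directly.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def scrape(url):
    """Pull one set of samples from a target and stamp it with the scrape time."""
    with OPENER.open(url, timeout=5) as resp:
        text = resp.read().decode()
    now = time.time()
    samples = {}
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        name, value = line.rsplit(" ", 1)
        samples[name] = (now, float(value))
    return samples


def demo():
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/metrics"
    local_storage = []
    try:
        for _ in range(3):
            STATE["requests"] += 7  # the job does some work between scrapes
            local_storage.append(scrape(url))
    finally:
        server.shutdown()
        server.server_close()
    return [s["demo_requests_total"][1] for s in local_storage]


if __name__ == "__main__":
    print(demo())  # [7.0, 14.0, 21.0]
```

The point of the sketch is the direction of the connection: the job only exposes its current state, and the collector decides when to fetch it and what timestamp to record.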

When does it fit?

Prometheus works well for recording any purely numeric time series. It fits both machine-centric monitoring and monitoring of highly dynamic service-oriented architectures. In a world of microservices, its support for multi-dimensional data collection and querying is a particular strength.
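To make the value of multi-dimensional querying concrete, here is a minimal sketch of selecting by label and aggregating across labels. It is plain Python, not PromQL, and the sample data and the `select` and `sum_by` helpers are hypothetical. The grouping it performs is roughly what a PromQL expression such as `sum by (service) (http_requests_total)` expresses.

```python
from collections import defaultdict

# Hypothetical latest samples: (metric name, labels, value).
samples = [
    ("http_requests_total", {"service": "cart", "instance": "a", "code": "200"}, 120.0),
    ("http_requests_total", {"service": "cart", "instance": "b", "code": "200"}, 80.0),
    ("http_requests_total", {"service": "cart", "instance": "b", "code": "500"}, 4.0),
    ("http_requests_total", {"service": "auth", "instance": "c", "code": "200"}, 50.0),
]


def select(samples, name, **matchers):
    """Keep samples of one metric whose labels equal every given matcher."""
    return [
        s for s in samples
        if s[0] == name and all(s[1].get(k) == v for k, v in matchers.items())
    ]


def sum_by(selected, *keep):
    """Sum values, grouping only by the labels listed in `keep`."""
    totals = defaultdict(float)
    for _, labels, value in selected:
        group = tuple(labels.get(k, "") for k in keep)
        totals[group] += value
    return dict(totals)


per_service = sum_by(select(samples, "http_requests_total"), "service")
errors_per_instance = sum_by(select(samples, "http_requests_total", code="500"), "instance")

print(per_service)          # {('cart',): 204.0, ('auth',): 50.0}
print(errors_per_instance)  # {('b',): 4.0}
```

Because instances come and go in a dynamic architecture, grouping by a stable label such as `service` lets the same question be asked regardless of which instances happen to exist.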

Prometheus is designed for reliability, to be the system you go to during an outage so that you can quickly diagnose problems. Each Prometheus server is standalone, not depending on network storage or other remote services. You can rely on it when other parts of your infrastructure are broken, and you do not need to set up extensive infrastructure to use it.

When does it not fit?

Prometheus values reliability. You can always view what statistics are available about your system, even under failure conditions. If you need 100% accuracy, such as for per-request billing, Prometheus is not a good choice as the collected data will likely not be detailed and complete enough. In such a case you would be best off using some other system to collect and analyze the data for billing, and Prometheus for the rest of your monitoring.
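One way to see why periodically scraped data may fall short of exact accounting is a small simulation. The scenario is an assumption chosen for illustration, not a description of every source of inaccuracy: a job keeps an in-memory counter, is scraped at fixed times, and restarts once between two scrapes. The `simulate` and `observed_increase` functions and all numbers are hypothetical. A drop in the counter is treated as a reset, which is the usual way to interpret counters.

```python
def observed_increase(scraped):
    """Sum increases between consecutive scrapes, treating a drop as a counter reset."""
    total = 0.0
    for prev, cur in zip(scraped, scraped[1:]):
        total += cur if cur < prev else cur - prev
    return total


def simulate(event_times, restart_at, scrape_times):
    """Return (true event count, count reconstructed from the scraped values)."""
    scraped = []
    for t in scrape_times:
        # The in-memory counter only knows about events since the last restart.
        since = restart_at if t >= restart_at else float("-inf")
        scraped.append(float(sum(1 for e in event_times if since <= e <= t)))
    return len(event_times), observed_increase(scraped)


# Nine events; the job restarts at t=20; scrapes happen every 15 time units.
events = [1, 2, 3, 16, 17, 18, 19, 22, 31]
true_count, seen = simulate(events, restart_at=20, scrape_times=[0, 15, 30, 45])

print(true_count, seen)  # 9 5.0
```

The four events between the scrape at t=15 and the restart at t=20 were never observed, so no later query can recover them. That is acceptable for monitoring trends, but not for billing each request.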