Frequently Asked Questions

General

What is Prometheus?

Prometheus is an open-source systems monitoring and alerting toolkit with an active ecosystem. See the overview.

How does Prometheus compare against other monitoring systems?

See the comparison page.

What dependencies does Prometheus have?

The main Prometheus server runs standalone and has no external dependencies.

Can Prometheus be made highly available?

Yes, run identical Prometheus servers on two or more separate machines. Identical alerts will be deduplicated by the Alertmanager.

For high availability of the Alertmanager, you can run multiple instances in a Mesh cluster and configure the Prometheus servers to send notifications to each of them.

I was told Prometheus “doesn't scale”.

There are in fact various ways to scale and federate Prometheus. Read Scaling and Federating Prometheus on the Robust Perception blog to get started.

What language is Prometheus written in?

Most Prometheus components are written in Go. Some are also written in Java, Python, and Ruby.

How stable are Prometheus features, storage formats, and APIs?

All repositories in the Prometheus GitHub organization that have reached version 1.0.0 broadly follow semantic versioning. Breaking changes are indicated by increments of the major version. Exceptions are possible for experimental components, which are clearly marked as such in announcements.

Even repositories that have not yet reached version 1.0.0 are, in general, quite stable. We aim for a proper release process and an eventual 1.0.0 release for each repository. In any case, breaking changes will be pointed out in release notes (marked by [CHANGE]) or communicated clearly for components that do not have formal releases yet.

Why do you pull rather than push?

Pulling over HTTP offers a number of advantages:

  • You can run your monitoring on your laptop when developing changes.
  • You can more easily tell if a target is down.
  • You can manually go to a target and inspect its health with a web browser.

Overall, we believe that pulling is slightly better than pushing, but it should not be considered a major point when considering a monitoring system.
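
To make the "inspect a target with a web browser" point concrete, here is a minimal sketch of a target that serves metrics as plain text over HTTP. It uses only the Python standard library rather than an official client library. The metric names, the values, and port 8000 are assumptions made up for illustration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical metric values, for illustration only.
METRICS = {"demo_requests_total": 42.0, "demo_up": 1.0}


def render_metrics(metrics):
    """Render a dict of metric name -> value as one 'name value' line each."""
    lines = [f"{name} {value}" for name, value in sorted(metrics.items())]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answer GET /metrics with the current metric values as plain text."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Serve metrics until interrupted (the port is an arbitrary choice)."""
    HTTPServer(("localhost", port), MetricsHandler).serve_forever()
```

Calling `serve()` and opening `http://localhost:8000/metrics` in a browser shows the same text a scrape of this target would receive.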

For cases where you must push, we offer the Pushgateway.

How do I feed logs into Prometheus?

Short answer: Don't! Use something like the ELK stack instead.

Longer answer: Prometheus is a system to collect and process metrics, not an event logging system. The Raintank blog post Logs and Metrics and Graphs, Oh My! provides more details about the differences between logs and metrics.

If you want to extract Prometheus metrics from application logs, Google's mtail might be helpful.

Who wrote Prometheus?

Prometheus was initially started privately by Matt T. Proud and Julius Volz. The majority of its initial development was sponsored by SoundCloud.

It's now maintained and extended by a wide range of companies and individuals.

What license is Prometheus released under?

Prometheus is released under the Apache 2.0 license.

What is the plural of Prometheus?

After extensive research, it has been determined that the correct plural of 'Prometheus' is 'Prometheis'.

Can I reload Prometheus's configuration?

Yes, sending SIGHUP to the Prometheus process or an HTTP POST request to the /-/reload endpoint will reload and apply the configuration file. The various components attempt to handle failing changes gracefully.
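
Both reload paths can be sketched in a few lines of Python. This is only an illustration: the base URL `http://localhost:9090` and the helper names are assumptions, and on some versions the HTTP endpoint may have to be enabled explicitly before it accepts requests.

```python
import os
import signal
import urllib.request


def build_reload_request(base_url="http://localhost:9090"):
    """Build (without sending) the POST request for the /-/reload endpoint."""
    url = base_url.rstrip("/") + "/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_via_http(base_url="http://localhost:9090", timeout=10):
    """Send the reload request and return the HTTP status code."""
    request = build_reload_request(base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def reload_via_signal(pid):
    """Send SIGHUP to the Prometheus process with the given PID (Unix only)."""
    os.kill(pid, signal.SIGHUP)
```

The HTTP variant is convenient when the server runs on another machine or in a container. The signal variant needs access to the process itself.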

Can I send alerts?

Yes, with the Alertmanager.

Currently, the following external systems are supported:

Can I create dashboards?

Yes, we recommend Grafana for production usage. There are also Console templates.

Can I change the timezone? Why is everything in UTC?

To avoid any kind of timezone confusion, especially when the so-called daylight saving time is involved, we decided to exclusively use Unix time internally and UTC for display purposes in all components of Prometheus. A carefully done timezone selection could be introduced into the UI. Contributions are welcome. See issue #500 for the current state of this effort.
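
Until then, converting on the client side is straightforward, because a Unix timestamp identifies an instant unambiguously. The sketch below is not part of Prometheus. The function names and the fixed UTC+2 offset are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_utc(unix_seconds):
    """Interpret a Unix timestamp as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


def to_fixed_offset(unix_seconds, offset_hours):
    """Show the same instant in a zone with a fixed offset from UTC."""
    zone = timezone(timedelta(hours=offset_hours))
    return to_utc(unix_seconds).astimezone(zone)


print(to_utc(0).isoformat())              # 1970-01-01T00:00:00+00:00
print(to_fixed_offset(0, 2).isoformat())  # 1970-01-01T02:00:00+02:00
```

A fixed offset sidesteps daylight saving time entirely. A real timezone selector would need a timezone database to handle those transitions.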

Instrumentation

Which languages have instrumentation libraries?

There are a number of client libraries for instrumenting your services with Prometheus metrics. See the client libraries documentation for details.

If you are interested in contributing a client library for a new language, see the exposition formats.

Can I monitor machines?

Yes, the Node Exporter exposes an extensive set of machine-level metrics on Linux and other Unix systems such as CPU usage, memory, disk utilization, filesystem fullness, and network bandwidth.

Can I monitor network devices?

Yes, the SNMP Exporter allows monitoring of devices that support SNMP.

Can I monitor batch jobs?

Yes, using the Pushgateway. See also the best practices for monitoring batch jobs.
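
As a rough sketch of what a batch job does at the end of its run, the snippet below builds an HTTP PUT request that pushes metrics in the text format to a Pushgateway under a job name. The gateway address `http://localhost:9091`, the job name, and the metric name are assumptions for illustration. An official client library would normally handle this for you.

```python
import urllib.request
from urllib.parse import quote


def build_push_request(job, metrics, gateway="http://localhost:9091"):
    """Build (without sending) a PUT request that pushes metrics for a job."""
    # One 'name value' line per metric, each terminated by a newline.
    body = "".join(f"{name} {value}\n" for name, value in sorted(metrics.items()))
    url = f"{gateway.rstrip('/')}/metrics/job/{quote(job, safe='')}"
    return urllib.request.Request(url, data=body.encode("utf-8"), method="PUT")


def push(job, metrics, gateway="http://localhost:9091", timeout=10):
    """Send the push request and return the HTTP status code."""
    request = build_push_request(job, metrics, gateway)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `push("nightly_backup", {"backup_last_success_timestamp_seconds": 1700000000})` would record when a hypothetical backup job last succeeded, so that Prometheus can scrape that value from the Pushgateway afterwards.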

What applications can Prometheus monitor out of the box?

See the list of exporters and integrations.

Can I monitor JVM applications via JMX?

Yes, for applications that you cannot instrument directly with the Java client, you can use the JMX Exporter either standalone or as a Java Agent.

What is the performance impact of instrumentation?

Performance across client libraries and languages may vary. For Java, benchmarks indicate that incrementing a counter/gauge with the Java client will take 12-17ns, depending on contention. This is negligible for all but the most latency-critical code.

Troubleshooting

My Prometheus 1.x server takes a long time to start up and spams the log with copious information about crash recovery.

You are suffering from an unclean shutdown. Prometheus has to shut down cleanly after a SIGTERM, which might take a while for heavily used servers. If the server crashes or is killed hard (e.g. OOM kill by the kernel or your runlevel system got impatient while waiting for Prometheus to shut down), a crash recovery has to be performed, which should take less than a minute under normal circumstances, but can take quite long under certain circumstances. See crash recovery for details.

My Prometheus 1.x server runs out of memory.

See the section about memory usage to configure Prometheus for the amount of memory you have available.

My Prometheus 1.x server reports to be in “rushed mode” or that “storage needs throttling”.

Your storage is under heavy load. Read the section about configuring the local storage to find out how you can tweak settings for better performance.

Implementation

Why are all sample values 64-bit floats? I want integers.

We restricted ourselves to 64-bit floats to simplify the design. The IEEE 754 double-precision binary floating-point format supports integer precision for values up to 2^53. Supporting native 64-bit integers would (only) help if you need integer precision above 2^53 but below 2^63. In principle, support for different sample value types (including some kind of big integer, supporting even more than 64 bits) could be implemented, but it is not a priority right now. A counter, even if incremented one million times per second, will only run into precision issues after over 285 years.
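
The numbers above can be checked with a few lines of Python, whose `float` is an IEEE 754 double on common platforms. The function name and the 365.25-day year are assumptions of this sketch.

```python
LIMIT = 2 ** 53  # largest range in which every integer is exactly representable

# Below the limit, adding 1 still gives the exact next integer.
assert float(LIMIT - 1) + 1.0 == float(LIMIT)

# At the limit, adding 1 is lost to rounding: 2^53 + 1 has no exact double.
assert float(LIMIT) + 1.0 == float(LIMIT)


def years_until_precision_loss(increments_per_second):
    """Years a counter can run at the given rate before it reaches 2^53."""
    seconds = LIMIT / increments_per_second
    return seconds / (365.25 * 24 * 60 * 60)


print(round(years_until_precision_loss(1_000_000), 1))  # 285.4
```

The same function shows how the margin shrinks at higher rates: a counter incremented one billion times per second would reach the limit in roughly 0.29 years.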

Why don't the Prometheus server components support TLS or authentication? Can I add those?

Note: The Prometheus team changed its stance on this during its development summit on August 11, 2018, and support for TLS and authentication in serving endpoints is now on the project's roadmap. This document will be updated once code changes have been made.

While TLS and authentication are frequently requested features, we have intentionally not implemented them in any of Prometheus's server-side components. There are so many different options and parameters for both (10+ options for TLS alone) that we have decided to focus on building the best monitoring system possible rather than supporting fully generic TLS and authentication solutions in every server component.

If you need TLS or authentication, we recommend putting a reverse proxy in front of Prometheus. See, for example, Adding Basic Auth to Prometheus with Nginx.

This applies only to inbound connections. Prometheus does support scraping TLS- and auth-enabled targets, and other Prometheus components that create outbound connections have similar support.
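
For the inbound case, a client that talks to Prometheus through such a reverse proxy has to present credentials itself. A minimal sketch, assuming the proxy enforces HTTP Basic authentication over TLS; the host name, user name, and password below are hypothetical placeholders.

```python
import base64
import urllib.request


def build_authenticated_request(url, username, password):
    """Build (without sending) a GET request with a Basic auth header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


# Hypothetical proxy address and credentials, for illustration only.
request = build_authenticated_request(
    "https://prometheus.example.test/api/v1/query?query=up", "alice", "secret"
)
print(request.get_header("Authorization"))  # Basic YWxpY2U6c2VjcmV0
```

Basic authentication only encodes the credentials and does not encrypt them, which is why the proxy should terminate TLS as well.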