Writing client libraries

This document covers what functionality and API Prometheus client libraries should offer, with the aim of consistency across libraries, making the easy use cases easy and avoiding offering functionality that may lead users down the wrong path.

There are 10 languages already supported at the time of writing, so we’ve gotten a good sense by now of how to write a client. These guidelines aim to help authors of new client libraries produce good libraries.

Conventions

MUST/MUST NOT/SHOULD/SHOULD NOT/MAY have the meanings given in https://www.ietf.org/rfc/rfc2119.txt

In addition, ENCOURAGED means that a feature is desirable for a library to have, but it’s okay if it’s not present. In other words, a nice to have.

Things to keep in mind:

  • Take advantage of each language’s features.

  • The common use cases should be easy.

  • The correct way to do something should be the easy way.

  • More complex use cases should be possible.

The common use cases are (in order):

  • Counters without labels spread liberally around libraries/applications.

  • Timing functions/blocks of code in Summaries/Histograms.

  • Gauges to track current states of things (and their limits).

  • Monitoring of batch jobs.

Overall structure

Clients MUST be written to be callback based internally. Clients SHOULD generally follow the structure described here.

The key class is the Collector. This has a method (typically called ‘collect’) that returns zero or more metrics and their samples. Collectors get registered with a CollectorRegistry. Data is exposed by passing a CollectorRegistry to a class/method/function "bridge", which returns the metrics in a format Prometheus supports. Every time the CollectorRegistry is scraped it must call back to the collect method of each of its Collectors.

The interfaces most users interact with are the Counter, Gauge, Summary, and Histogram Collectors. Each of these represents a single metric, and together they should cover the vast majority of use cases where a user is instrumenting their own code.

More advanced use cases (such as proxying from another monitoring/instrumentation system) require writing a custom Collector. Someone may also want to write a "bridge" that takes a CollectorRegistry and produces data in a format a different monitoring/instrumentation system understands, allowing users to only have to think about one instrumentation system.

CollectorRegistry SHOULD offer register()/unregister() functions, and a Collector SHOULD be allowed to be registered to multiple CollectorRegistrys.
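The callback structure described above can be sketched roughly as follows. This is a minimal illustration, not any library's actual API: the Sample tuple, the CallbackCollector class, and the queue_depth metric are hypothetical names chosen for the example.

```python
import threading
from typing import Callable, Iterable, List, NamedTuple


class Sample(NamedTuple):
    name: str
    labels: dict
    value: float


class Collector:
    """Anything with a collect() method that returns zero or more samples."""

    def collect(self) -> Iterable[Sample]:
        raise NotImplementedError


class CallbackCollector(Collector):
    """Hypothetical collector that reads its value only when scraped."""

    def __init__(self, name: str, fn: Callable[[], float]):
        self._name = name
        self._fn = fn

    def collect(self) -> Iterable[Sample]:
        yield Sample(self._name, {}, float(self._fn()))


class CollectorRegistry:
    def __init__(self):
        self._lock = threading.Lock()
        self._collectors: List[Collector] = []

    def register(self, collector: Collector) -> None:
        with self._lock:
            if collector not in self._collectors:
                self._collectors.append(collector)

    def unregister(self, collector: Collector) -> None:
        with self._lock:
            self._collectors.remove(collector)

    def collect(self) -> List[Sample]:
        # Copy the list under the lock, then call back outside it so a slow
        # collector does not block register()/unregister().
        with self._lock:
            collectors = list(self._collectors)
        samples: List[Sample] = []
        for collector in collectors:
            samples.extend(collector.collect())
        return samples


queue = [1, 2, 3]
depth = CallbackCollector("queue_depth", lambda: len(queue))
registry_a, registry_b = CollectorRegistry(), CollectorRegistry()
registry_a.register(depth)
registry_b.register(depth)  # one collector registered to two registries
```

Because the value is read inside collect(), each scrape sees the state at scrape time rather than a value pushed earlier.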

Client libraries MUST be thread safe.

For non-OO languages such as C, client libraries should follow the spirit of this structure as much as is practical.

Naming

Client libraries SHOULD follow function/method/class names mentioned in this document, keeping in mind the naming conventions of the language they’re working in. For example, set_to_current_time() is good for a method name in Python, but SetToCurrentTime() is better in Go and setToCurrentTime() is the convention in Java. Where names differ for technical reasons (e.g. not allowing function overloading), documentation/help strings SHOULD point users towards the other names.

Libraries MUST NOT offer functions/methods/classes with the same or similar names to ones given here, but with different semantics.

Metrics

The Counter, Gauge, Summary and Histogram metric types are the primary interface for users.

Counter and Gauge MUST be part of the client library. At least one of Summary and Histogram MUST be offered.

These should be primarily used as file-static variables, that is, global variables defined in the same file as the code they’re instrumenting. The client library SHOULD enable this. The common use case is instrumenting a piece of code overall, not a piece of code in the context of one instance of an object. Users shouldn’t have to worry about plumbing their metrics throughout their code; the client library should do that for them (and if it doesn’t, users will write a wrapper around the library to make it "easier" - which rarely tends to go well).

There MUST be a default CollectorRegistry, and the standard metrics MUST by default implicitly register into it with no special work required by the user. There MUST be a way to have metrics not register to the default CollectorRegistry, for use in batch jobs and unit tests. Custom collectors SHOULD also follow this.

Exactly how the metrics should be created varies by language. For some (Java, Go) a builder approach is best, whereas for others (Python) function arguments are rich enough to do it in one call.

For example in the Java Simpleclient we have:

    import io.prometheus.client.Counter;

    class YourClass {
      // File-static metric, registered with the default CollectorRegistry.
      static final Counter requests = Counter.build()
          .name("requests_total")
          .help("Requests.").register();
    }

This will register requests with the default CollectorRegistry. By calling build() rather than register() the metric won’t be registered (handy for unit tests). You can also pass in a CollectorRegistry to register() (handy for batch jobs).
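For a language where function arguments are rich enough, the same three cases (default registry, no registry, explicit registry) might look like the sketch below. The registry keyword argument and the DEFAULT_REGISTRY name are assumptions made for this illustration, and the classes are deliberately stripped down.

```python
class CollectorRegistry:
    def __init__(self):
        self.collectors = []

    def register(self, collector):
        self.collectors.append(collector)


# Module-level default, shared by every metric unless told otherwise.
DEFAULT_REGISTRY = CollectorRegistry()


class Counter:
    def __init__(self, name, help_text, registry=DEFAULT_REGISTRY):
        self.name = name
        self.help_text = help_text
        self.value = 0.0
        # registry=None means "do not register anywhere".
        if registry is not None:
            registry.register(self)


# File-static metric: registered implicitly with the default registry.
REQUESTS = Counter("requests_total", "Requests.")

# Unit test: no registration at all.
unregistered = Counter("test_total", "Only used in a test.", registry=None)

# Batch job: a private registry holding only the job's own metrics.
batch_registry = CollectorRegistry()
job_runs = Counter("job_runs_total", "Batch job runs.", registry=batch_registry)
```

The common case needs no extra arguments, while the less common cases remain possible with one keyword.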

Counter

Counter is a monotonically increasing counter. It MUST NOT allow the value to decrease, however it MAY be reset to 0 (such as by server restart).

A counter MUST have the following methods:

  • inc(): Increment the counter by 1
  • inc(double v): Increment the counter by the given amount. MUST check that v >= 0.

A counter is ENCOURAGED to have:

A way to count exceptions thrown/raised in a given piece of code, and optionally only certain types of exceptions. This is count_exceptions in Python.

Counters MUST start at 0.
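The counter requirements above can be sketched as follows. This is a minimal illustration assuming a single lock for thread safety; get() is a hypothetical helper added only so the example can be inspected, and count_exceptions is modelled here as a context manager.

```python
import threading
from contextlib import contextmanager


class Counter:
    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0.0  # counters start at 0

    def inc(self, v: float = 1.0) -> None:
        # Reject negative increments so the value can never decrease.
        if v < 0:
            raise ValueError("counters can only be incremented by v >= 0")
        with self._lock:
            self._value += v

    @contextmanager
    def count_exceptions(self, exception=Exception):
        # Count only the given exception type, then let it propagate.
        try:
            yield
        except exception:
            self.inc()
            raise

    def get(self) -> float:
        # Hypothetical read accessor, for illustration only.
        with self._lock:
            return self._value


failures = Counter()
try:
    with failures.count_exceptions(KeyError):
        {}["missing"]
except KeyError:
    pass
```

Re-raising inside count_exceptions keeps the instrumentation from changing the behaviour of the code it wraps.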

Gauge

Gauge represents a value that can go up and down.

A gauge MUST have the following methods:

  • inc(): Increment the gauge by 1
  • inc(double v): Increment the gauge by the given amount
  • dec(): Decrement the gauge by 1
  • dec(double v): Decrement the gauge by the given amount
  • set(double v): Set the gauge to the given value

Gauges MUST start at 0. You MAY offer a way for a given gauge to start at a different number.

A gauge SHOULD have the following methods:

  • set_to_current_time(): Set the gauge to the current unixtime in seconds.

A gauge is ENCOURAGED to have:

A way to track in-progress requests in some piece of code/function. This is track_inprogress in Python.

A way to time a piece of code and set the gauge to its duration in seconds. This is useful for batch jobs. This is startTimer/setDuration in Java and the time() decorator/context manager in Python. This SHOULD match the pattern in Summary/Histogram (though set() rather than observe()).

Summary

A summary samples observations (usually things like request durations) over sliding windows of time and provides instantaneous insight into their distributions, frequencies, and sums.

A summary MUST NOT allow the user to set "quantile" as a label name, as this is used internally to designate summary quantiles. A summary is ENCOURAGED to offer quantiles as exports, though these can’t be aggregated and tend to be slow. A summary MUST allow not having quantiles, as just _count/_sum is quite useful and this MUST be the default.

A summary MUST have the following methods:

  • observe(double v): Observe the given amount

A summary SHOULD have the following methods:

Some way to time code for users in seconds. In Python this is the time() decorator/context manager. In Java this is startTimer/observeDuration. Units other than seconds MUST NOT be offered (if a user wants something else, they can do it by hand). This should follow the same pattern as Gauge/Histogram.

Summary _count/_sum MUST start at 0.
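A quantile-free summary, which is the required default, reduces to a count and a sum. The sketch below assumes that simplification; the label_names argument exists only to show the "quantile" check, and snapshot() is a hypothetical accessor.

```python
import threading
import time as _time  # aliased because the class below defines a time() method
from contextlib import contextmanager


class Summary:
    def __init__(self, label_names=()):
        # "quantile" is reserved for exported quantiles.
        if "quantile" in label_names:
            raise ValueError('"quantile" is not allowed as a label name')
        self._lock = threading.Lock()
        self._count = 0    # exported as <name>_count
        self._sum = 0.0    # exported as <name>_sum

    def observe(self, v: float) -> None:
        with self._lock:
            self._count += 1
            self._sum += v

    @contextmanager
    def time(self):
        # Observes the duration of the block, always in seconds.
        start = _time.perf_counter()
        try:
            yield
        finally:
            self.observe(_time.perf_counter() - start)

    def snapshot(self):
        # Hypothetical accessor returning (count, sum) consistently.
        with self._lock:
            return self._count, self._sum


latency = Summary()
latency.observe(0.25)
latency.observe(0.5)
```

Dividing the rate of _sum by the rate of _count on the server side gives an average, which is why the pair is useful even without quantiles.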

Histogram

Histograms allow aggregatable distributions of events, such as request latencies. This is at its core a counter per bucket.

A histogram MUST NOT allow le as a user-set label, as le is used internally to designate buckets.

A histogram MUST offer a way to manually choose the buckets. Ways to set buckets in a linear(start, width, count) and exponential(start, factor, count) fashion SHOULD be offered. Count MUST exclude the +Inf bucket.

A histogram SHOULD have the same default buckets as other client libraries. Buckets MUST NOT be changeable once the metric is created.

A histogram MUST have the following methods:

  • observe(double v): Observe the given amount

A histogram SHOULD have the following methods:

Some way to time code for users in seconds. In Python this is the time() decorator/context manager. In Java this is startTimer/observeDuration. Units other than seconds MUST NOT be offered (if a user wants something else, they can do it by hand). This should follow the same pattern as Gauge/Summary.

Histogram _count/_sum and the buckets MUST start at 0.
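The bucket rules above can be illustrated with a short sketch. The helper names linear_buckets and exponential_buckets, and the snapshot() accessor, are assumptions for this example; bucket counts are stored per bucket and made cumulative only when read, which is one possible design among several.

```python
import bisect
import threading

INF = float("inf")


def linear_buckets(start, width, count):
    # count excludes the implicit +Inf bucket.
    return [start + i * width for i in range(count)]


def exponential_buckets(start, factor, count):
    # count excludes the implicit +Inf bucket.
    return [start * factor ** i for i in range(count)]


class Histogram:
    def __init__(self, buckets, label_names=()):
        if "le" in label_names:
            raise ValueError('"le" is not allowed as a label name')
        bounds = sorted(float(b) for b in buckets if b != INF)
        # A tuple, so the bounds cannot change after creation.
        self._bounds = tuple(bounds) + (INF,)
        self._lock = threading.Lock()
        self._counts = [0] * len(self._bounds)
        self._count = 0
        self._sum = 0.0

    def observe(self, v: float) -> None:
        # First bound >= v, i.e. the smallest bucket with v <= le.
        i = bisect.bisect_left(self._bounds, v)
        with self._lock:
            self._counts[i] += 1
            self._count += 1
            self._sum += v

    def snapshot(self):
        # Returns ([(le, cumulative_count), ...], count, sum).
        with self._lock:
            running, cumulative = 0, []
            for bound, n in zip(self._bounds, self._counts):
                running += n
                cumulative.append((bound, running))
            return cumulative, self._count, self._sum


h = Histogram(linear_buckets(1, 2, 3))
for value in (0.5, 1, 2, 6):
    h.observe(value)
buckets, count, total = h.snapshot()
```

With bounds 1, 3 and 5, the observations 0.5 and 1 land in the first bucket, 2 in the second, and 6 only in +Inf.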

Further metrics considerations

Providing additional functionality in metrics beyond what’s documented above as makes sense for a given language is ENCOURAGED.

If there’s a common use case you can make simpler then go for it, as long as it won’t encourage undesirable behaviours (such as suboptimal metric/label layouts, or doing computation in the client).

Labels

Labels are one of the most powerful aspects of Prometheus, but easily abused. Accordingly, client libraries must be very careful in how labels are offered to users.

Client libraries MUST NOT under any circumstances allow users to have different label names for the same metric for Gauge/Counter/Summary/Histogram or any other Collector offered by the library.

Metrics from custom collectors should almost always have consistent label names. As there are still rare but valid use cases where this is not the case, client libraries should not verify this.

While labels are powerful, the majority of metrics will not have labels. Accordingly, the API should allow for labels, but labels should not dominate it.

A client library MUST allow for optionally specifying a list of label names at Gauge/Counter/Summary/Histogram creation time. A client library SHOULD support any number of label names. A client library MUST validate that label names meet the documented requirements.

The general way to provide access to a labeled dimension of a metric is via a labels() method that takes either a list of the label values or a map from label name to label value and returns a "Child". The usual .inc()/.dec()/.observe() etc. methods can then be called on the Child.

The Child returned by labels() SHOULD be cacheable by the user, to avoid having to look it up again - this matters in latency-critical code.

Metrics with labels SHOULD support a remove() method with the same signature as labels() that will remove a Child from the metric so that it is no longer exported, and a clear() method that removes all Children from the metric. These invalidate caching of Children.

There SHOULD be a way to initialize a given Child with the default value, usually just calling labels(). Metrics without labels MUST always be initialized to avoid problems with missing metrics.
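The labels()/remove()/clear() pattern can be sketched as below. The class names are hypothetical, and the validation assumes the classic label-name pattern [a-zA-Z_][a-zA-Z0-9_]* with names starting with "__" reserved; check the documented requirements for the rules that apply to your target Prometheus version.

```python
import re
import threading

# Assumed classic label-name pattern; see the lead-in above.
LABEL_NAME_RE = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")


class Child:
    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0.0

    def inc(self, v: float = 1.0) -> None:
        if v < 0:
            raise ValueError("counters can only be incremented by v >= 0")
        with self._lock:
            self._value += v

    def get(self) -> float:
        with self._lock:
            return self._value


class LabeledCounter:
    def __init__(self, name, label_names):
        for n in label_names:
            if not LABEL_NAME_RE.match(n) or n.startswith("__"):
                raise ValueError(f"invalid label name: {n!r}")
        self._name = name
        self._label_names = tuple(label_names)  # fixed at creation time
        self._lock = threading.Lock()
        self._children = {}

    def _key(self, values, kv):
        if values and kv:
            raise ValueError("pass label values positionally or by name, not both")
        if kv:
            if sorted(kv) != sorted(self._label_names):
                raise ValueError("label names do not match those of the metric")
            values = tuple(kv[n] for n in self._label_names)
        if len(values) != len(self._label_names):
            raise ValueError("wrong number of label values")
        return tuple(str(v) for v in values)

    def labels(self, *values, **kv) -> Child:
        key = self._key(values, kv)
        with self._lock:
            child = self._children.get(key)
            if child is None:
                child = self._children[key] = Child()
            return child

    def remove(self, *values, **kv) -> None:
        key = self._key(values, kv)
        with self._lock:
            self._children.pop(key, None)

    def clear(self) -> None:
        with self._lock:
            self._children.clear()

    def collect(self):
        with self._lock:
            return {key: child.get() for key, child in self._children.items()}


requests = LabeledCounter("http_requests_total", ["method", "code"])
get_ok = requests.labels("GET", "200")   # cache the Child for hot paths
get_ok.inc()
requests.labels("POST", "500")           # initialise a Child at 0
```

Holding on to get_ok avoids the dictionary lookup on every increment, and calling labels() without incrementing exports the series at 0 before its first event.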

Metric names

Metric names must follow the specification. As with label names, this MUST be met for uses of Gauge/Counter/Summary/Histogram and in any other Collector offered with the library.

Many client libraries offer setting the name in three parts: namespace_subsystem_name, of which only the name is mandatory.

Dynamic/generated metric names or subparts of metric names MUST be discouraged, except when a custom Collector is proxying from other instrumentation/monitoring systems. Generated/dynamic metric names are a sign that you should be using labels instead.
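Joining and validating the three name parts might look like the sketch below. The function name full_name is hypothetical, and the pattern assumes the classic metric-name rule [a-zA-Z_:][a-zA-Z0-9_:]*; consult the specification for the rules that apply to your target version.

```python
import re

# Assumed classic metric-name pattern; see the lead-in above.
METRIC_NAME_RE = re.compile(r"^[a-zA-Z_:][a-zA-Z0-9_:]*$")


def full_name(name: str, namespace: str = "", subsystem: str = "") -> str:
    """Join namespace, subsystem and name with underscores and validate."""
    if not name:
        raise ValueError("the name part is mandatory")
    # Optional parts are skipped when empty, so no stray underscores appear.
    full = "_".join(part for part in (namespace, subsystem, name) if part)
    if not METRIC_NAME_RE.match(full):
        raise ValueError(f"invalid metric name: {full!r}")
    return full


qualified = full_name("requests_total", namespace="myapp", subsystem="http")
plain = full_name("requests_total")
```

Validating the joined result, rather than each part separately, catches cases such as a namespace that starts with a digit.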

Metric description and help

Gauge/Counter/Summary/Histogram MUST require metric descriptions/help to be provided.

Any custom Collectors provided with the client libraries MUST have descriptions/help on their metrics.

It is suggested to make the description a mandatory argument, but not to check that it’s of a certain length: if someone really doesn’t want to write docs, we’re not going to convince them otherwise. Collectors offered with the library (and indeed everywhere we can within the ecosystem) SHOULD have good metric descriptions, to lead by example.

Exposition

Clients MUST implement the text-based exposition format outlined in the exposition formats documentation.

Reproducible order of the exposed metrics is ENCOURAGED (especially for human readable formats) if it can be implemented without a significant resource cost.

Standard and runtime collectors

Client libraries SHOULD offer what they can of the Standard exports, documented below.

These SHOULD be implemented as custom Collectors, and registered by default on the default CollectorRegistry. There SHOULD be a way to disable these, as there are some very niche use cases where they get in the way.

Process metrics

These metrics have the prefix process_. If obtaining a necessary value is problematic or even impossible with the used language or runtime, client libraries SHOULD prefer leaving out the corresponding metric over exporting bogus, inaccurate, or special values (like NaN). All memory values are in bytes, all times in unixtime/seconds.

Metric name                       Help string                                             Unit
process_cpu_seconds_total         Total user and system CPU time spent in seconds.        seconds
process_open_fds                  Number of open file descriptors.                        file descriptors
process_max_fds                   Maximum number of open file descriptors.                file descriptors
process_virtual_memory_bytes      Virtual memory size in bytes.                           bytes
process_virtual_memory_max_bytes  Maximum amount of virtual memory available in bytes.    bytes
process_resident_memory_bytes     Resident memory size in bytes.                          bytes
process_heap_bytes                Process heap size in bytes.                             bytes
process_start_time_seconds        Start time of the process since unix epoch in seconds.  seconds
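The "leave it out rather than export a bogus value" rule can be sketched for two of these metrics. This is only an illustration: it assumes /proc/self/fd is available for the descriptor count (as on Linux), and the function name collect_process_metrics is hypothetical.

```python
import os


def collect_process_metrics() -> dict:
    """Return the process_ metrics that can be obtained on this platform."""
    samples = {}

    # User plus system CPU time of this process, in seconds.
    times = os.times()
    samples["process_cpu_seconds_total"] = times.user + times.system

    # Only exported where /proc is available; otherwise the metric is
    # omitted instead of being reported as NaN or -1.
    fd_dir = "/proc/self/fd"
    if os.path.isdir(fd_dir):
        try:
            # The count may include the descriptor used to read the directory.
            samples["process_open_fds"] = float(len(os.listdir(fd_dir)))
        except OSError:
            pass

    return samples


process_samples = collect_process_metrics()
```

A consumer can then tell "not supported here" (metric absent) apart from "supported and currently zero".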

Runtime metrics

In addition, client libraries are ENCOURAGED to also offer whatever makes sense in terms of metrics for their language’s runtime (e.g. garbage collection stats), with an appropriate prefix such as go_, hotspot_ etc.

Unit tests

Client libraries SHOULD have unit tests covering the core instrumentation library and exposition.

Client libraries are ENCOURAGED to offer ways that make it easy for users to unit-test their use of the instrumentation code. For example, CollectorRegistry.get_sample_value in Python.
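A lookup helper of this kind, and a test that uses it, might be sketched as follows. The classes are stripped-down stand-ins written for this example rather than a real library's implementation, and handle_request is a hypothetical piece of instrumented user code.

```python
class CollectorRegistry:
    def __init__(self):
        self._collectors = []

    def register(self, collector):
        self._collectors.append(collector)

    def get_sample_value(self, name, labels=None):
        """Return the value of the matching sample, or None if absent."""
        labels = labels or {}
        for collector in self._collectors:
            for sample_name, sample_labels, value in collector.collect():
                if sample_name == name and sample_labels == labels:
                    return value
        return None


class Counter:
    def __init__(self, name, registry):
        self._name = name
        self._value = 0.0
        registry.register(self)

    def inc(self, v=1.0):
        self._value += v

    def collect(self):
        return [(self._name, {}, self._value)]


def handle_request(counter):
    # Hypothetical instrumented code under test.
    counter.inc()


# The test uses its own registry so it does not touch global state.
registry = CollectorRegistry()
requests = Counter("requests_total", registry)
before = registry.get_sample_value("requests_total")
handle_request(requests)
after = registry.get_sample_value("requests_total")
```

Asserting on the difference between before and after, rather than on an absolute value, keeps such tests independent of what ran earlier.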

Packaging and dependencies

Ideally, a client library can be included in any application to add some instrumentation without breaking the application.

Accordingly, caution is advised when adding dependencies to the client library. For example, if you add a library that uses a Prometheus client that requires version x.y of a library but the application uses x.z elsewhere, will that have an adverse impact on the application?

It is suggested that, where this may arise, the core instrumentation is separated from the bridges/exposition of metrics in a given format. For example, in the Java simpleclient the simpleclient module has no dependencies, and simpleclient_servlet has the HTTP bits.

Performance considerations

As client libraries must be thread-safe, some form of concurrency control is required and consideration must be given to performance on multi-core machines and applications.

In our experience, mutexes are the least performant option.

Processor atomic instructions tend to be in the middle, and generally acceptable.

Approaches that avoid different CPUs mutating the same bit of RAM work best, such as the DoubleAdder in Java’s simpleclient. There is a memory cost though.
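The idea of giving each writer its own cell and summing on read can be sketched as below. This is a structural illustration only, using one cell per thread: it is not how DoubleAdder is implemented, and in CPython the interpreter lock means the sketch shows the shape of the technique rather than a measurable speed-up.

```python
import threading


class StripedCounter:
    def __init__(self):
        self._local = threading.local()
        self._cells = []                  # one single-element list per thread
        self._cells_lock = threading.Lock()

    def _cell(self):
        cell = getattr(self._local, "cell", None)
        if cell is None:
            # Slow path, taken once per thread: allocate and publish a cell.
            cell = [0.0]
            self._local.cell = cell
            with self._cells_lock:
                self._cells.append(cell)
        return cell

    def inc(self, v: float = 1.0) -> None:
        if v < 0:
            raise ValueError("counters can only be incremented by v >= 0")
        # Only the owning thread ever writes this cell, so no lock is taken.
        self._cell()[0] += v

    def get(self) -> float:
        # Reads sum every cell; cells of finished threads are kept so their
        # contribution is not lost. That retention is the memory cost.
        with self._cells_lock:
            return sum(cell[0] for cell in self._cells)


counter = StripedCounter()


def worker():
    for _ in range(1000):
        counter.inc()


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = counter.get()
```

The trade-off is the one noted above: increments touch only thread-private memory, while reads and memory use grow with the number of writers.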

As noted above, the result of labels() should be cacheable. The concurrent maps that tend to back metrics with labels tend to be relatively slow. Special-casing metrics without labels to avoid labels()-like lookups can help a lot.

Metrics SHOULD avoid blocking when they are being incremented/decremented/set etc. as it’s undesirable for the whole application to be held up while a scrape is ongoing.

Having benchmarks of the main instrumentation operations, including labels, is ENCOURAGED.

Resource consumption, particularly RAM, should be kept in mind when performing exposition. Consider reducing the memory footprint by streaming results, and potentially having a limit on the number of concurrent scrapes.