Federation

Federation allows a Prometheus server to scrape selected time series from another Prometheus server.

Use cases

There are different use cases for federation. Commonly, it is used either to achieve scalable Prometheus monitoring setups or to pull related metrics from one service's Prometheus into another.

Hierarchical federation

Hierarchical federation allows Prometheus to scale to environments with tens of data centers and millions of nodes. In this use case, the federation topology resembles a tree, with higher-level Prometheus servers collecting aggregated time series data from a larger number of subordinated servers.

For example, a setup might consist of many per-datacenter Prometheus servers that collect data in high detail (instance-level drill-down), and a set of global Prometheus servers which collect and store only aggregated data (job-level drill-down) from those local servers. This provides an aggregate global view and detailed local views.
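One way a per-datacenter server might produce the job-level series that a global server then federates is with a recording rule. The following is a minimal sketch, not a prescribed setup: the group name, the rule name job:http_requests:rate5m, and the metric http_requests_total are all hypothetical and chosen only for illustration.

```yaml
# Rule file loaded by a per-datacenter (local) Prometheus server.
# All names below are illustrative assumptions, not part of any real setup.
groups:
  - name: job_level_aggregates
    rules:
      # Sum the per-instance request rates into one series per job.
      # The "job:" prefix marks the aggregation level of the result.
      - record: job:http_requests:rate5m
        expr: sum by (job) (rate(http_requests_total[5m]))
```

With a rule like this in place, a global server could select only series whose names start with job: and leave the instance-level data on the local servers.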

Cross-service federation

In cross-service federation, a Prometheus server of one service is configured to scrape selected data from another service's Prometheus server to enable alerting and queries against both datasets within a single server.

For example, a cluster scheduler running multiple services might expose resource usage information (like memory and CPU usage) about service instances running on the cluster. On the other hand, a service running on that cluster will only expose application-specific service metrics. Often, these two sets of metrics are scraped by separate Prometheus servers. Using federation, the Prometheus server containing service-level metrics may pull in the cluster resource usage metrics about its specific service from the cluster Prometheus, so that both sets of metrics can be used within that server.
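As a rough sketch of this scenario, the service's Prometheus server could carry a scrape job like the one below. The target cluster-prometheus:9090, the container_ metric prefix, and the service="my-service" label are assumptions made for illustration; the actual metric names and labels depend on what the cluster scheduler exposes. The scrape options used here are described under "Configuring federation".

```yaml
# Scrape job on the service's own Prometheus server.
# Host, metric prefix, and label values are hypothetical.
scrape_configs:
  - job_name: 'federate-cluster-resources'
    honor_labels: true            # keep the labels set by the cluster Prometheus
    metrics_path: '/federate'
    params:
      'match[]':
        # Only resource usage series that belong to this one service.
        - '{__name__=~"container_.*", service="my-service"}'
    static_configs:
      - targets:
          - 'cluster-prometheus:9090'
```

Restricting the selector to a single service keeps the federated data small, which matters because the cluster Prometheus typically holds resource metrics for every service on the cluster.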

Configuring federation

On any given Prometheus server, the /federate endpoint allows retrieving the current value for a selected set of time series in that server. At least one match[] URL parameter must be specified to select the series to expose. Each match[] argument needs to specify an instant vector selector like up or {job="api-server"}. If multiple match[] parameters are provided, the union of all matched series is selected.

To federate metrics from one server to another, configure your destination Prometheus server to scrape from the /federate endpoint of a source server, while also enabling the honor_labels scrape option (to not overwrite any labels exposed by the source server) and passing in the desired match[] parameters. For example, the following scrape_configs federates any series with the label job="prometheus" or a metric name starting with job: from the Prometheus servers at source-prometheus-{1,2,3}:9090 into the scraping Prometheus:

    scrape_configs:
      - job_name: 'federate'
        scrape_interval: 15s
        honor_labels: true
        metrics_path: '/federate'
        params:
          'match[]':
            - '{job="prometheus"}'
            - '{__name__=~"job:.*"}'
        static_configs:
          - targets:
              - 'source-prometheus-1:9090'
              - 'source-prometheus-2:9090'
              - 'source-prometheus-3:9090'