Observability with Prometheus


You can monitor your local YugabyteDB cluster with a local instance of Prometheus, the de-facto standard for time-series monitoring of cloud native infrastructure. Every YugabyteDB service exposes metrics in the Prometheus format at the /prometheus-metrics endpoint.

If you haven’t installed YugabyteDB yet, do so first by following the Quick Start guide.
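
Once a local cluster is up (Step 1 below), you can quickly confirm that metrics are being served by fetching the endpoint directly. This is a minimal check, assuming a yb-tserver web UI on the default port 9000 (the same port used in the scrape configuration later in this guide).

# assumes the yb-tserver web UI is reachable on 127.0.0.1:9000
$ curl -s http://127.0.0.1:9000/prometheus-metrics | head -n 5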

Prerequisite

Prometheus is installed on your local machine. If you have not done so already, download and install Prometheus from the official Prometheus website before continuing.

1. Setup - create universe

If you have a previously running local universe, destroy it using the following.

$ ./bin/yb-ctl destroy

Start a new local cluster - by default, this will create a 3-node universe with a replication factor of 3.

$ ./bin/yb-ctl create
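
Before wiring in Prometheus, you can check that all three nodes came up. This is a quick sanity check, assuming the yb-ctl status subcommand and the same ./bin location used above.

# assumes yb-ctl is installed in ./bin as in the Quick Start
$ ./bin/yb-ctl status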

2. Run sample key-value app

Run a simple key-value workload in a separate shell.

$ java -jar java/yb-sample-apps.jar \
    --workload CassandraKeyValue \
    --nodes 127.0.0.1:9042 \
    --num_threads_read 1 \
    --num_threads_write 1

3. Prepare Prometheus config file

Copy the following into a file called yugabytedb.yml.

global:
  scrape_interval: 5s     # Set the scrape interval to every 5 seconds. Default is every 1 minute.
  evaluation_interval: 5s # Evaluate rules every 5 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# YugabyteDB configuration to scrape Prometheus time-series metrics
scrape_configs:
  - job_name: 'yugabytedb'
    metrics_path: /prometheus-metrics
    static_configs:
      - targets: ['127.0.0.1:7000', '127.0.0.2:7000', '127.0.0.3:7000']
        labels:
          group: 'yb-master'
      - targets: ['127.0.0.1:9000', '127.0.0.2:9000', '127.0.0.3:9000']
        labels:
          group: 'yb-tserver'
      - targets: ['127.0.0.1:11000', '127.0.0.2:11000', '127.0.0.3:11000']
        labels:
          group: 'yedis'
      - targets: ['127.0.0.1:12000', '127.0.0.2:12000', '127.0.0.3:12000']
        labels:
          group: 'ycql'
      - targets: ['127.0.0.1:13000', '127.0.0.2:13000', '127.0.0.3:13000']
        labels:
          group: 'ypostgresql'
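
Optionally, you can validate the file before starting the server. This assumes promtool, which ships alongside the prometheus binary in standard Prometheus releases.

# assumes promtool is present in the Prometheus installation directory
$ ./promtool check config yugabytedb.yml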

4. Start Prometheus server

Go to the directory where Prometheus is installed and start the Prometheus server as shown below.

$ ./prometheus --config.file=yugabytedb.yml

Open the Prometheus UI at http://localhost:9090 and then navigate to the Targets page under Status.
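
If you prefer the command line, the same target health information is available from the Prometheus HTTP API; this check assumes the server is listening on the default port 9090.

# assumes the Prometheus server is running on localhost:9090
$ curl -s http://localhost:9090/api/v1/targets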


5. Analyze key metrics

On the Prometheus Graph UI, you can now plot the read IOPS and write IOPS for the CassandraKeyValue sample app. As we can see from the source code of the app, it uses only SELECT statements for reads and INSERT statements for writes (aside from the initial CREATE TABLE). This means we can measure throughput and latency by simply using the metrics corresponding to the SELECT and INSERT statements.

Paste the following expressions into the Expression box and click Execute followed by Add Graph.

Throughput

Read IOPS

sum(irate(handler_latency_yb_cqlserver_SQLProcessor_SelectStmt_count[1m]))


Write IOPS

sum(irate(handler_latency_yb_cqlserver_SQLProcessor_InsertStmt_count[1m]))


Latency

Read Latency (in microseconds)

avg(irate(handler_latency_yb_cqlserver_SQLProcessor_SelectStmt_sum[1m])) / avg(irate(handler_latency_yb_cqlserver_SQLProcessor_SelectStmt_count[1m]))


Write Latency (in microseconds)

avg(irate(handler_latency_yb_cqlserver_SQLProcessor_InsertStmt_sum[1m])) / avg(irate(handler_latency_yb_cqlserver_SQLProcessor_InsertStmt_count[1m]))
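
You can also evaluate any of these expressions outside the UI through the Prometheus query API. A minimal sketch, assuming the default port 9090; curl URL-encodes the expression.

# assumes the Prometheus server is running on localhost:9090
$ curl -s -G http://localhost:9090/api/v1/query \
    --data-urlencode 'query=sum(irate(handler_latency_yb_cqlserver_SQLProcessor_SelectStmt_count[1m]))'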


6. Clean up (optional)

Optionally, you can shut down the local cluster created in Step 1.

$ ./bin/yb-ctl destroy

The following steps cover the same workflow for a local universe running in Docker, managed with yb-docker-ctl.

1. Setup - create universe

If you have a previously running local universe, destroy it using the following.

$ ./yb-docker-ctl destroy

Start a new local universe with default replication factor 3.

$ ./yb-docker-ctl create
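
To verify that the master and tserver containers started, you can list them. This assumes the yb-master-n*/yb-tserver-n* container names created by yb-docker-ctl and referenced in the scrape configuration below.

# assumes containers are named yb-master-n1..n3 and yb-tserver-n1..n3
$ docker ps --filter "name=yb-" --format "table {{.Names}}\t{{.Status}}"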

2. Run sample key-value app

Run a simple key-value workload in a separate shell.

$ docker cp yb-master-n1:/home/yugabyte/java/yb-sample-apps.jar .
$ java -jar ./yb-sample-apps.jar --workload CassandraKeyValue \
    --nodes localhost:9042 \
    --num_threads_write 1 \
    --num_threads_read 4 \
    --value_size 4096

3. Prepare Prometheus config file

Copy the following into a file called yugabytedb.yml. Move this file to the /tmp directory so that it can be bind-mounted into the Prometheus container later on.

global:
  scrape_interval: 5s     # Set the scrape interval to every 5 seconds. Default is every 1 minute.
  evaluation_interval: 5s # Evaluate rules every 5 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# YugabyteDB configuration to scrape Prometheus time-series metrics
scrape_configs:
  - job_name: 'yugabytedb'
    metrics_path: /prometheus-metrics
    static_configs:
      - targets: ['yb-master-n1:7000', 'yb-master-n2:7000', 'yb-master-n3:7000']
        labels:
          group: 'yb-master'
      - targets: ['yb-tserver-n1:9000', 'yb-tserver-n2:9000', 'yb-tserver-n3:9000']
        labels:
          group: 'yb-tserver'
      - targets: ['yb-tserver-n1:11000', 'yb-tserver-n2:11000', 'yb-tserver-n3:11000']
        labels:
          group: 'yedis'
      - targets: ['yb-tserver-n1:12000', 'yb-tserver-n2:12000', 'yb-tserver-n3:12000']
        labels:
          group: 'ycql'
      - targets: ['yb-tserver-n1:13000', 'yb-tserver-n2:13000', 'yb-tserver-n3:13000']
        labels:
          group: 'ypostgresql'
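
For example, to place the file where the docker run command in the next step expects it, and optionally validate it with the promtool binary bundled in the official prom/prometheus image:

$ mv yugabytedb.yml /tmp/yugabytedb.yml
# optional: validate the config using promtool from the prom/prometheus image
$ docker run --rm -v /tmp/yugabytedb.yml:/etc/prometheus/prometheus.yml \
    --entrypoint promtool prom/prometheus check config /etc/prometheus/prometheus.yml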

4. Start Prometheus server

Start the Prometheus server as shown below. The prom/prometheus container image will be pulled from the Docker registry if it is not already present locally.

$ docker run \
    -p 9090:9090 \
    -v /tmp/yugabytedb.yml:/etc/prometheus/prometheus.yml \
    --net yb-net \
    prom/prometheus

Open the Prometheus UI at http://localhost:9090 and then navigate to the Targets page under Status.


5. Analyze key metrics

On the Prometheus Graph UI, you can now plot the read/write throughput and latency for the CassandraKeyValue sample app. As we can see from the source code of the app, it uses only SELECT statements for reads and INSERT statements for writes (aside from the initial CREATE TABLE). This means we can measure throughput and latency by simply using the metrics corresponding to the SELECT and INSERT statements.

Paste the following expressions into the Expression box and click Execute followed by Add Graph.

Throughput

Read IOPS

sum(irate(handler_latency_yb_cqlserver_SQLProcessor_SelectStmt_count[1m]))


Write IOPS

sum(irate(handler_latency_yb_cqlserver_SQLProcessor_InsertStmt_count[1m]))


Latency

Read Latency (in microseconds)

avg(irate(handler_latency_yb_cqlserver_SQLProcessor_SelectStmt_sum[1m])) / avg(irate(handler_latency_yb_cqlserver_SQLProcessor_SelectStmt_count[1m]))


Write Latency (in microseconds)

avg(irate(handler_latency_yb_cqlserver_SQLProcessor_InsertStmt_sum[1m])) / avg(irate(handler_latency_yb_cqlserver_SQLProcessor_InsertStmt_count[1m]))


6. Clean up (optional)

Optionally, you can shut down the local cluster created in Step 1.

$ ./yb-docker-ctl destroy
