JMX monitoring

The JMX monitoring feature exposes query metrics via the JMX API.

Setup

Enable collecting stats

By default, stats collection is enabled. You can disable it via the CrateDB configuration file or by running this statement:

  cr> SET GLOBAL "stats.enabled" = FALSE;

Enable the JMX API

To monitor CrateDB using the JMX API, you must set the following system properties before you start CrateDB:

  com.sun.management.jmxremote
  com.sun.management.jmxremote.port=<JMX_PORT>
  com.sun.management.jmxremote.ssl=false
  com.sun.management.jmxremote.authenticate=false

Here, <JMX_PORT> sets the port number of your JMX server. JMX SSL and authentication are currently not supported.

More information about the JMX monitoring properties can be found in the JMX documentation.

You can set the Java system properties with the -D option:

  sh$ ./bin/crate -Dcom.sun.management.jmxremote \
  ... -Dcom.sun.management.jmxremote.port=7979 \
  ... -Dcom.sun.management.jmxremote.ssl=false \
  ... -Dcom.sun.management.jmxremote.authenticate=false

However, the recommended way to set system properties is via the CRATE_JAVA_OPTS environment variable, like so:

  sh$ export CRATE_JAVA_OPTS="$CRATE_JAVA_OPTS \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=7979 \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Dcom.sun.management.jmxremote.authenticate=false"
  sh$ ./bin/crate

If you’re using the CrateDB Debian or RPM packages, you can set this environment variable via the /etc/default/crate configuration file.
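With the properties above in place, a JMX client can connect to the node over RMI. The following Java sketch builds the conventional RMI-based JMX service URL and opens a connection. The class name JmxConnect, the host localhost and the port 7979 are illustrative assumptions that match the example values used above; they are not part of CrateDB.

```java
import java.io.IOException;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper for illustration, not part of CrateDB.
final class JmxConnect {

    // Builds the conventional RMI-based JMX service URL for a host and port.
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    public static void main(String[] args) throws IOException {
        // Assumes a node started with -Dcom.sun.management.jmxremote.port=7979.
        JMXServiceURL url = new JMXServiceURL(serviceUrl("localhost", 7979));
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            System.out.println("Registered MBeans: " + connection.getMBeanCount());
        }
    }
}
```

Running main requires a reachable node; serviceUrl on its own can be reused by any client code that needs the URL string.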

Using Docker

To enable JMX monitoring when running CrateDB in a Docker container, you have to set the following additional Java system properties:

  -Djava.rmi.server.hostname=<RMI_HOSTNAME>
  -Dcom.sun.management.jmxremote.rmi.port=<RMI_PORT>

Here, <RMI_HOSTNAME> is the IP address or hostname of the Docker host and <RMI_PORT> is the statically assigned port of the RMI server. For convenience, <RMI_PORT> can be set to the same port the JMX server listens on.

The <RMI_HOSTNAME> and <RMI_PORT> can be used by JMX clients (e.g. JConsole or VisualVM) to connect to the JMX server.

Here’s an example Docker command:

  sh$ docker run -d -e CRATE_JAVA_OPTS="\
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=7979 \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.rmi.port=7979 \
  -Djava.rmi.server.hostname=<RMI_HOSTNAME>" \
  -p 7979:7979 crate -Cnetwork.host=_site_

Here, again, <RMI_HOSTNAME> is the IP address or hostname of the Docker host.

JMX Beans

QueryStats MBean

The QueryStats MBean exposes, for each statement type, the total count, the failed count and the sum of durations (in milliseconds) of all statements executed since the node was started. The statement types are SELECT, UPDATE, DELETE, INSERT, MANAGEMENT, DDL, COPY and UNDEFINED.

Metrics can be accessed using the JMX MBean object name io.crate.monitoring:type=QueryStats and the following attributes:

Total number of statements executed since the node was started:

  • SelectQueryTotalCount

  • InsertQueryTotalCount

  • UpdateQueryTotalCount

  • DeleteQueryTotalCount

  • ManagementQueryTotalCount

  • DDLQueryTotalCount

  • CopyQueryTotalCount

  • UndefinedQueryTotalCount

Number of failed statements since the node was started:

  • SelectQueryFailedCount

  • InsertQueryFailedCount

  • UpdateQueryFailedCount

  • DeleteQueryFailedCount

  • ManagementQueryFailedCount

  • DDLQueryFailedCount

  • CopyQueryFailedCount

  • UndefinedQueryFailedCount

Sum of the durations, in milliseconds, of all statement executions since the node was started, grouped by type:

  • SelectQuerySumOfDurations

  • InsertQuerySumOfDurations

  • UpdateQuerySumOfDurations

  • DeleteQuerySumOfDurations

  • ManagementQuerySumOfDurations

  • DDLQuerySumOfDurations

  • CopyQuerySumOfDurations

  • UndefinedQuerySumOfDurations
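Because the counters are cumulative, a client typically derives rates or averages from them. The following Java sketch computes the mean duration per statement from a sum-of-durations and a total-count attribute. The class and method names are hypothetical, and treating the attribute values as Number is an assumption rather than a documented type.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper for illustration, not part of CrateDB.
final class QueryStatsAverages {

    // Mean duration in milliseconds per statement; 0.0 when nothing has run yet.
    static double averageMillis(long sumOfDurations, long totalCount) {
        return totalCount == 0 ? 0.0 : (double) sumOfDurations / totalCount;
    }

    // Reads both counters for one statement type prefix, e.g. "Select" or "Insert".
    // Assumption: the attribute values can be cast to Number.
    static double readAverageMillis(MBeanServerConnection connection, String type)
            throws Exception {
        ObjectName name = new ObjectName("io.crate.monitoring:type=QueryStats");
        Number sum = (Number) connection.getAttribute(name, type + "QuerySumOfDurations");
        Number total = (Number) connection.getAttribute(name, type + "QueryTotalCount");
        return averageMillis(sum.longValue(), total.longValue());
    }
}
```

For example, readAverageMillis(connection, "Select") combines SelectQuerySumOfDurations and SelectQueryTotalCount. The result is an average over the whole node uptime; to get an average for a recent interval, a client would subtract two successive samples of each counter first.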

NodeStatus MBean

The NodeStatus JMX MBean exposes the status of the current node as boolean values.

NodeStatus can be accessed using the JMX MBean object name io.crate.monitoring:type=NodeStatus and the following attributes:

  • Ready

    Indicates whether the node is able to process SQL statements.
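A readiness probe can be built on this attribute. The following Java sketch reads Ready through an MBeanServerConnection and reports false when the bean is missing or the call fails. The class name NodeReadiness and the choice to treat errors as "not ready" are assumptions made for illustration.

```java
import java.io.IOException;
import javax.management.JMException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper for illustration, not part of CrateDB.
final class NodeReadiness {

    // Returns true only if the NodeStatus bean exists and reports Ready = true.
    static boolean isReady(MBeanServerConnection connection) {
        try {
            ObjectName name = new ObjectName("io.crate.monitoring:type=NodeStatus");
            return Boolean.TRUE.equals(connection.getAttribute(name, "Ready"));
        } catch (JMException | IOException e) {
            // Design choice of this sketch: an unreachable or missing bean
            // counts as "not ready".
            return false;
        }
    }
}
```

Called against a JVM that has no NodeStatus bean registered (for example the platform MBean server of an unrelated process), isReady returns false rather than throwing.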

NodeInfo MXBean

The NodeInfo JMX MXBean exposes information about the current node.

NodeInfo can be accessed using the JMX MXBean object name io.crate.monitoring:type=NodeInfo and the following attributes:

  • NodeId

    Provides the unique identifier of the node in the cluster.

  • NodeName

    Provides the human-friendly name of the node.

  • ClusterStateVersion

    Provides the version of the currently applied cluster state.

  • ShardStats

    Statistics about the number of shards located on the node.

  • ShardInfo

    Detailed information about the shards located on the node.

ShardStats returns a CompositeData object containing statistics about the number of shards located on the node with the following attributes:

  • Total

    The number of shards located on the node.

  • Primaries

    The number of primary shards located on the node.

  • Replicas

    The number of replica shards located on the node.

  • Unassigned

    The number of unassigned shards in the cluster. If the node is the elected master node, this shows the total number of unassigned shards in the cluster; otherwise it is 0.

ShardInfo returns an array of CompositeData objects containing detailed information about the shards located on the node with the following attributes:

  • Id

    The shard ID. It is managed by the system and ranges from 0 up to the number of configured shards of the table.

  • Table

    The name of the table this shard belongs to.

  • PartitionIdent

    The partition ident of a partitioned table. Empty for non-partitioned tables.

  • RoutingState

    The current state of the shard in the routing table. Possible states are:

      • UNASSIGNED

      • INITIALIZING

      • STARTED

      • RELOCATING

  • State

    The current state of the shard. Possible states are:

      • CREATED

      • RECOVERING

      • POST_RECOVERY

      • STARTED

      • RELOCATED

      • CLOSED

      • INITIALIZING

      • UNASSIGNED

  • Size

    The estimated cumulative size in bytes of all files of this shard.
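On the client side, the ShardInfo attribute arrives as a CompositeData array that can be summarized with the standard open MBean API. The following Java sketch counts shards per routing state. The class name, the standIn helper and its data are hypothetical, and the key "RoutingState" is taken verbatim from the attribute list above; the key names delivered by a real node may differ in case and can be checked with getCompositeType().keySet().

```java
import java.util.Map;
import java.util.TreeMap;
import javax.management.openmbean.CompositeData;
import javax.management.openmbean.CompositeDataSupport;
import javax.management.openmbean.CompositeType;
import javax.management.openmbean.OpenDataException;
import javax.management.openmbean.OpenType;
import javax.management.openmbean.SimpleType;

// Hypothetical helper for illustration, not part of CrateDB.
final class ShardInfoSummary {

    // Counts shards per value of the given key, e.g. "RoutingState".
    static Map<String, Integer> countBy(CompositeData[] shards, String key) {
        Map<String, Integer> counts = new TreeMap<>();
        for (CompositeData shard : shards) {
            counts.merge(String.valueOf(shard.get(key)), 1, Integer::sum);
        }
        return counts;
    }

    // Builds stand-in data so the sketch runs without a CrateDB node.
    static CompositeData standIn(String routingState) {
        try {
            CompositeType type = new CompositeType(
                "StandInShard", "Stand-in for one ShardInfo entry",
                new String[] {"RoutingState"},
                new String[] {"Routing state of the shard"},
                new OpenType<?>[] {SimpleType.STRING});
            return new CompositeDataSupport(
                type, new String[] {"RoutingState"}, new Object[] {routingState});
        } catch (OpenDataException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CompositeData[] shards = {
            standIn("STARTED"), standIn("STARTED"), standIn("RELOCATING")};
        // prints {RELOCATING=1, STARTED=2}
        System.out.println(countBy(shards, "RoutingState"));
    }
}
```

Against a live node, the array would instead come from reading the ShardInfo attribute of io.crate.monitoring:type=NodeInfo and casting the result to CompositeData[].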

Connections MBean

The Connections MBean exposes information about any open connections to a CrateDB node.

It can be accessed using the io.crate.monitoring:type=Connections object name and has the following attributes:

  • HttpOpen

    The number of currently established connections via HTTP.

  • HttpTotal

    The total number of connections established via HTTP over the lifetime of a node.

  • PsqlOpen

    The number of currently established connections via the PostgreSQL protocol.

  • PsqlTotal

    The total number of connections established via the PostgreSQL protocol over the lifetime of a node.

  • TransportOpen

    The number of currently established connections via the transport protocol.

ThreadPools MXBean

The ThreadPools MXBean exposes statistical information about the thread pools used by a CrateDB node.

It can be accessed using the io.crate.monitoring:type=ThreadPools object name and has the following attributes:

  • Generic

    Thread pool statistics of the generic thread pool.

  • Search

    Thread pool statistics of the search thread pool used by read statements on user-generated tables.

  • Write

    Thread pool statistics of the write thread pool used for writing and deleting data.

  • Management

    Thread pool statistics of the management thread pool used by management tasks like stats collecting, repository information, shard allocations, etc.

  • Flush

    Thread pool statistics of the flush thread pool used for fsyncing to disk and merging segments in the storage engine.

  • Refresh

    Thread pool statistics of the refresh thread pool used for automatic and on-demand refreshing of tables.

  • Snapshot

    Thread pool statistics of the snapshot thread pool used for creating and restoring snapshots.

  • ForceMerge

    Thread pool statistics of the force_merge thread pool used when running an OPTIMIZE statement.

  • Listener

    Thread pool statistics of the listener thread pool used on client nodes for asynchronous result listeners.

  • Get

    Thread pool statistics of the get thread pool used when querying sys.nodes or sys.shards.

  • FetchShardStarted

    Thread pool statistics of the fetch_shard_started thread pool used during shard allocation.

  • FetchShardStore

    Thread pool statistics of the fetch_shard_store thread pool used during shard replication.

Each of them returns a CompositeData object containing detailed statistics of each thread pool with the following attributes:

  • poolSize

    The current number of threads in the pool.

  • largestPoolSize

    The largest number of threads that have ever simultaneously been in the pool.

  • queueSize

    The current number of tasks in the queue.

  • active

    The approximate number of threads that are actively executing tasks.

  • completed

    The approximate total number of tasks that have completed execution.

  • rejected

    The number of rejected executions.

CircuitBreakers MXBean

The CircuitBreakers MXBean exposes statistical information about all available circuit breakers of a CrateDB node.

It can be accessed using the io.crate.monitoring:type=CircuitBreakers object name and has the following attributes:

  • Parent

    Statistics of the parent circuit breaker containing summarized counters across all circuit breakers.

  • Query

    Statistics of the query circuit breaker used to account memory usage of SQL execution, including intermediate states (e.g. on aggregation) and resulting rows.

  • JobsLog

    Statistics of the jobs_log circuit breaker used to account memory usage of the sys.jobs_log table.

  • OperationsLog

    Statistics of the operations_log circuit breaker used to account memory usage of the sys.operations_log table.

  • FieldData

    Statistics of the field_data circuit breaker used for estimating the amount of memory a field will require to be loaded into memory.

  • InFlightRequests

    Statistics of the in_flight_requests circuit breaker used to account memory usage of all incoming requests on transport or HTTP level.

  • Request

    Statistics of the request circuit breaker used to account memory usage of per-request data structures.

Each of them returns a CompositeData object containing detailed statistics of each circuit breaker with the following attributes:

  • name

    The circuit breaker name this statistic belongs to.

  • used

    The estimated amount of memory currently accounted as used.

  • limit

    The configured limit at which the breaker trips.

  • overhead

    The configured overhead applied when accounting estimations.

  • trippedCount

    The total number of trips that have occurred.
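A common use of these statistics is to watch how close a breaker is to its limit. The following Java sketch derives a usage ratio from the used and limit items of one breaker's CompositeData. The class name BreakerUsage is hypothetical, casting the values to Number is an assumption, and returning NaN for a non-positive limit is a choice of this sketch rather than documented CrateDB behavior.

```java
import javax.management.openmbean.CompositeData;

// Hypothetical helper for illustration, not part of CrateDB.
final class BreakerUsage {

    // Fraction of the limit currently in use; NaN if the limit is not positive.
    static double usageRatio(long used, long limit) {
        return limit > 0 ? (double) used / limit : Double.NaN;
    }

    // Extracts "used" and "limit" from one breaker's CompositeData.
    // Assumption: both values can be cast to Number.
    static double usageRatio(CompositeData breaker) {
        long used = ((Number) breaker.get("used")).longValue();
        long limit = ((Number) breaker.get("limit")).longValue();
        return usageRatio(used, limit);
    }
}
```

The CompositeData argument would come from reading one of the attributes above, such as Query, from io.crate.monitoring:type=CircuitBreakers. A monitoring client could alert when the ratio stays near 1.0 or when trippedCount increases between two samples.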

Exposing JMX via HTTP

The JMX metrics and a readiness endpoint can be exposed via HTTP (e.g. to be used by Prometheus) by using the Crate JMX HTTP Exporter Java agent. See the README in the Crate JMX HTTP Exporter repository for more information.