Custom Chart Debug Page

The Custom Chart debug page in the Admin UI can be used to create one or more custom charts showing any combination of over 200 available metrics.

The definition of the customized dashboard is encoded in the URL. To share the dashboard with someone, send them the URL. Like any other URL, it can be bookmarked, kept in a pinned browser tab, and so on.

Accessing the Custom Chart page

To access the Custom Chart debug page, open the Admin UI and either:

  • Open http://localhost:8080/#/debug/chart in your browser (replacing localhost and 8080 with your node's host and port).

  • Click the gear icon on the left to access the Advanced Debugging Page. In the Reports section, click Custom TimeSeries Chart.
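
As a small illustration of the first option, the page URL can be built from a node's host and port. This is only a sketch: the function name custom_chart_url and the host node1.example.com are hypothetical, and the path is the one shown above.

```python
def custom_chart_url(host: str = "localhost", port: int = 8080) -> str:
    """Return the Custom Chart debug page URL for the given node."""
    return f"http://{host}:{port}/#/debug/chart"


# Default local node, then a hypothetical remote node.
print(custom_chart_url())
print(custom_chart_url("node1.example.com", 8081))
```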

Using the Custom Chart page

On the Custom Chart page, you can set the time span for all charts, add new custom charts, and customize each chart:

  • To set the time span for the page, use the dropdown menu above the charts and select the desired time span.

  • To add a chart, click Add Chart and customize the new chart.

  • To customize each chart, use the Units dropdown menu to set the units to display. Then use the table below the chart to select the metrics being queried, and how they'll be combined and displayed. Options include:

    Column: Description

    Metric Name: How the system refers to this metric, e.g., sql.bytesin.

    Downsampler: The "Downsampler" operation is used to combine the individual datapoints over the longer period into a single datapoint. We store one data point every ten seconds, but for queries over long time spans the backend lowers the resolution of the returned data, perhaps only returning one data point for every minute, five minutes, or even an entire hour in the case of the 30 day view.

    Options:

    • AVG: Returns the average value over the time period.
    • MIN: Returns the lowest value seen.
    • MAX: Returns the highest value seen.
    • SUM: Returns the sum of all values seen.

    Aggregator: Used to combine data points from different nodes. It has the same operations available as the Downsampler.

    Options:

    • AVG: Returns the average value over the time period.
    • MIN: Returns the lowest value seen.
    • MAX: Returns the highest value seen.
    • SUM: Returns the sum of all values seen.

    Rate: Determines how to display the rate of change during the selected time period.

    Options:

    • Normal: Returns the actual recorded value.
    • Rate: Returns the rate of change of the value per second.
    • Non-negative Rate: Returns the rate-of-change, but returns 0 instead of negative values. A large number of the stats we track are actually tracked as monotonically increasing counters so each sample is just the total value of that counter. The rate of change of that counter represents the rate of events being counted, which is usually what you want to graph. "Non-negative Rate" is needed because the counters are stored in memory, and thus if a node resets it goes back to zero (whereas normally they only increase).

    Source: The set of nodes being queried, which is either:

    • The entire cluster.
    • A single, named node.

    Per Node: If checked, the chart will show a line for each node's value of this metric.
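
The Downsampler and Aggregator operations described above can be sketched in a few lines. This is an illustration only, not the backend's implementation: the function names, the sample values, and the choice to downsample each node before aggregating across nodes are all assumptions made for the example.

```python
from statistics import mean

# The four operations offered by both the Downsampler and the Aggregator.
OPS = {"AVG": mean, "MIN": min, "MAX": max, "SUM": sum}


def downsample(samples, bucket_size, op):
    """Combine each run of `bucket_size` consecutive samples into one value."""
    combine = OPS[op]
    return [combine(samples[i:i + bucket_size])
            for i in range(0, len(samples), bucket_size)]


def aggregate(per_node, op):
    """Combine the values from different nodes at each point in time."""
    combine = OPS[op]
    return [combine(values) for values in zip(*per_node.values())]


# Hypothetical data: twelve 10-second samples (two minutes) from two nodes.
node1 = [1, 2, 3, 4, 5, 6, 6, 5, 4, 3, 2, 1]
node2 = [2, 2, 2, 2, 2, 2, 4, 4, 4, 4, 4, 4]

# Six 10-second samples make up one 1-minute data point.
per_node = {
    "n1": downsample(node1, 6, "MAX"),  # [6, 6]
    "n2": downsample(node2, 6, "MAX"),  # [2, 4]
}
cluster = aggregate(per_node, "SUM")    # [8, 10]
print(cluster)
```

Choosing MAX as the Downsampler keeps short spikes visible at low resolution, while AVG smooths them out; the Aggregator choice then decides whether the chart shows a cluster total (SUM) or a typical node (AVG).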

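The difference between Rate and Non-negative Rate is easiest to see on a counter that resets. The sketch below assumes one sample every ten seconds, as stated above; the function names and the counter values are hypothetical.

```python
SAMPLE_PERIOD_S = 10  # one data point every ten seconds


def rate(samples, period_s=SAMPLE_PERIOD_S):
    """Per-second rate of change between consecutive samples."""
    return [(b - a) / period_s for a, b in zip(samples, samples[1:])]


def non_negative_rate(samples, period_s=SAMPLE_PERIOD_S):
    """Same as rate(), but negative values are reported as 0."""
    return [max(0.0, r) for r in rate(samples, period_s)]


# A monotonically increasing counter; the node restarts before the
# fourth sample, so the in-memory counter goes back to zero.
counter = [100, 150, 220, 0, 40]

print(rate(counter))               # [5.0, 7.0, -22.0, 4.0]
print(non_negative_rate(counter))  # [5.0, 7.0, 0.0, 4.0]
```

The negative value in the plain rate is an artifact of the restart, not a real drop in events, which is why Non-negative Rate is usually the better choice for counters.
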
Examples

Query user and system CPU usage

To compare system vs. userspace CPU usage, select the following values under Metric Name:

  • sys.cpu.sys.percent
  • sys.cpu.user.percent

The Y-axis label is Count. A count of 1 represents 100% utilization. Because the Aggregator combines values from different nodes, an Aggregator of Sum can show a count above 1, which means the combined CPU utilization is greater than 100%.

Checking Per Node displays statistics for each node, which could show whether an individual node's CPU usage was higher or lower than the average.
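
To make the arithmetic concrete, the sketch below sums hypothetical per-node values of sys.cpu.user.percent and compares each node with the average. The node names, the values, and the function name summarize_cpu are made up for illustration.

```python
def summarize_cpu(per_node):
    """Return (sum, average, nodes above average) for per-node CPU counts."""
    total = sum(per_node.values())
    average = total / len(per_node)
    above = sorted(node for node, value in per_node.items() if value > average)
    return total, average, above


# Hypothetical readings; a count of 1 represents 100% utilization.
cpu_user = {"n1": 0.62, "n2": 0.35, "n3": 0.41}

total, average, above = summarize_cpu(cpu_user)
print(f"Sum: {total:.2f} ({total:.0%} combined)")  # exceeds 1 across three nodes
print(f"Avg: {average:.2f}")
print(f"Nodes above average: {above}")
```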

Available metrics

Note:

This list is taken directly from the source code and is subject to change. Some of the metrics listed below are already visible in other areas of the Admin UI.

Name: Help
addsstable.applications: Number of SSTable ingestions applied (i.e., applied by Replicas)
addsstable.copies: Number of SSTable ingestions that required copying files during application
addsstable.proposals: Number of SSTable ingestions proposed (i.e., sent to Raft by lease holders)
build.timestamp: Build information
capacity.available: Available storage capacity
capacity.reserved: Capacity reserved for snapshots
capacity.used: Used storage capacity
capacity: Total storage capacity
clock-offset.meannanos: Mean clock offset with other nodes in nanoseconds
clock-offset.stddevnanos: Std dev clock offset with other nodes in nanoseconds
compactor.compactingnanos: Number of nanoseconds spent compacting ranges
compactor.compactions.failure: Number of failed compaction requests sent to the storage engine
compactor.compactions.success: Number of successful compaction requests sent to the storage engine
compactor.suggestionbytes.compacted: Number of logical bytes compacted from suggested compactions
compactor.suggestionbytes.queued: Number of logical bytes in suggested compactions in the queue
compactor.suggestionbytes.skipped: Number of logical bytes in suggested compactions which were not compacted
distsender.batches.partial: Number of partial batches processed
distsender.batches: Number of batches processed
distsender.errors.notleaseholder: Number of NotLeaseHolderErrors encountered
distsender.rpc.sent.local: Number of local RPCs sent
distsender.rpc.sent.nextreplicaerror: Number of RPCs sent due to per-replica errors
distsender.rpc.sent: Number of RPCs sent
exec.error: Number of batch KV requests that failed to execute on this node
exec.latency: Latency in nanoseconds of batch KV requests executed on this node
exec.success: Number of batch KV requests executed successfully on this node
gcbytesage: Cumulative age of non-live data in seconds
gossip.bytes.received: Number of received gossip bytes
gossip.bytes.sent: Number of sent gossip bytes
gossip.connections.incoming: Number of active incoming gossip connections
gossip.connections.outgoing: Number of active outgoing gossip connections
gossip.connections.refused: Number of refused incoming gossip connections
gossip.infos.received: Number of received gossip Info objects
gossip.infos.sent: Number of sent gossip Info objects
intentage: Cumulative age of intents in seconds
intentbytes: Number of bytes in intent KV pairs
intentcount: Count of intent keys
keybytes: Number of bytes taken up by keys
keycount: Count of all keys
lastupdatenanos: Time in nanoseconds since Unix epoch at which bytes/keys/intents metrics were last updated
leases.epoch: Number of replica leaseholders using epoch-based leases
leases.error: Number of failed lease requests
leases.expiration: Number of replica leaseholders using expiration-based leases
leases.success: Number of successful lease requests
leases.transfers.error: Number of failed lease transfers
leases.transfers.success: Number of successful lease transfers
livebytes: Number of bytes of live data (keys plus values)
livecount: Count of live keys
liveness.epochincrements: Number of times this node has incremented its liveness epoch
liveness.heartbeatfailures: Number of failed node liveness heartbeats from this node
liveness.heartbeatlatency: Node liveness heartbeat latency in nanoseconds
liveness.heartbeatsuccesses: Number of successful node liveness heartbeats from this node
liveness.livenodes: Number of live nodes in the cluster (will be 0 if this node is not itself live)
node-id: node ID with labels for advertised RPC and HTTP addresses
queue.consistency.pending: Number of pending replicas in the consistency checker queue
queue.consistency.process.failure: Number of replicas which failed processing in the consistency checker queue
queue.consistency.process.success: Number of replicas successfully processed by the consistency checker queue
queue.consistency.processingnanos: Nanoseconds spent processing replicas in the consistency checker queue
queue.gc.info.abortspanconsidered: Number of AbortSpan entries old enough to be considered for removal
queue.gc.info.abortspangcnum: Number of AbortSpan entries fit for removal
queue.gc.info.abortspanscanned: Number of transactions present in the AbortSpan scanned from the engine
queue.gc.info.intentsconsidered: Number of 'old' intents
queue.gc.info.intenttxns: Number of associated distinct transactions
queue.gc.info.numkeysaffected: Number of keys with GC'able data
queue.gc.info.pushtxn: Number of attempted pushes
queue.gc.info.resolvesuccess: Number of successful intent resolutions
queue.gc.info.resolvetotal: Number of attempted intent resolutions
queue.gc.info.transactionspangcaborted: Number of GC'able entries corresponding to aborted txns
queue.gc.info.transactionspangccommitted: Number of GC'able entries corresponding to committed txns
queue.gc.info.transactionspangcpending: Number of GC'able entries corresponding to pending txns
queue.gc.info.transactionspanscanned: Number of entries in transaction spans scanned from the engine
queue.gc.pending: Number of pending replicas in the GC queue
queue.gc.process.failure: Number of replicas which failed processing in the GC queue
queue.gc.process.success: Number of replicas successfully processed by the GC queue
queue.gc.processingnanos: Nanoseconds spent processing replicas in the GC queue
queue.raftlog.pending: Number of pending replicas in the Raft log queue
queue.raftlog.process.failure: Number of replicas which failed processing in the Raft log queue
queue.raftlog.process.success: Number of replicas successfully processed by the Raft log queue
queue.raftlog.processingnanos: Nanoseconds spent processing replicas in the Raft log queue
queue.raftsnapshot.pending: Number of pending replicas in the Raft repair queue
queue.raftsnapshot.process.failure: Number of replicas which failed processing in the Raft repair queue
queue.raftsnapshot.process.success: Number of replicas successfully processed by the Raft repair queue
queue.raftsnapshot.processingnanos: Nanoseconds spent processing replicas in the Raft repair queue
queue.replicagc.pending: Number of pending replicas in the replica GC queue
queue.replicagc.process.failure: Number of replicas which failed processing in the replica GC queue
queue.replicagc.process.success: Number of replicas successfully processed by the replica GC queue
queue.replicagc.processingnanos: Nanoseconds spent processing replicas in the replica GC queue
queue.replicagc.removereplica: Number of replica removals attempted by the replica gc queue
queue.replicate.addreplica: Number of replica additions attempted by the replicate queue
queue.replicate.pending: Number of pending replicas in the replicate queue
queue.replicate.process.failure: Number of replicas which failed processing in the replicate queue
queue.replicate.process.success: Number of replicas successfully processed by the replicate queue
queue.replicate.processingnanos: Nanoseconds spent processing replicas in the replicate queue
queue.replicate.purgatory: Number of replicas in the replicate queue's purgatory, awaiting allocation options
queue.replicate.rebalancereplica: Number of replica rebalancer-initiated additions attempted by the replicate queue
queue.replicate.removedeadreplica: Number of dead replica removals attempted by the replicate queue (typically in response to a node outage)
queue.replicate.removereplica: Number of replica removals attempted by the replicate queue (typically in response to a rebalancer-initiated addition)
queue.replicate.transferlease: Number of range lease transfers attempted by the replicate queue
queue.split.pending: Number of pending replicas in the split queue
queue.split.process.failure: Number of replicas which failed processing in the split queue
queue.split.process.success: Number of replicas successfully processed by the split queue
queue.split.processingnanos: Nanoseconds spent processing replicas in the split queue
queue.tsmaintenance.pending: Number of pending replicas in the time series maintenance queue
queue.tsmaintenance.process.failure: Number of replicas which failed processing in the time series maintenance queue
queue.tsmaintenance.process.success: Number of replicas successfully processed by the time series maintenance queue
queue.tsmaintenance.processingnanos: Nanoseconds spent processing replicas in the time series maintenance queue
raft.commandsapplied: Count of Raft commands applied
raft.enqueued.pending: Number of pending outgoing messages in the Raft Transport queue
raft.heartbeats.pending: Number of pending heartbeats and responses waiting to be coalesced
raft.process.commandcommit.latency: Latency histogram in nanoseconds for committing Raft commands
raft.process.logcommit.latency: Latency histogram in nanoseconds for committing Raft log entries
raft.process.tickingnanos: Nanoseconds spent in store.processRaft() processing replica.Tick()
raft.process.workingnanos: Nanoseconds spent in store.processRaft() working
raft.rcvd.app: Number of MsgApp messages received by this store
raft.rcvd.appresp: Number of MsgAppResp messages received by this store
raft.rcvd.dropped: Number of dropped incoming Raft messages
raft.rcvd.heartbeat: Number of (coalesced, if enabled) MsgHeartbeat messages received by this store
raft.rcvd.heartbeatresp: Number of (coalesced, if enabled) MsgHeartbeatResp messages received by this store
raft.rcvd.prevote: Number of MsgPreVote messages received by this store
raft.rcvd.prevoteresp: Number of MsgPreVoteResp messages received by this store
raft.rcvd.prop: Number of MsgProp messages received by this store
raft.rcvd.snap: Number of MsgSnap messages received by this store
raft.rcvd.timeoutnow: Number of MsgTimeoutNow messages received by this store
raft.rcvd.transferleader: Number of MsgTransferLeader messages received by this store
raft.rcvd.vote: Number of MsgVote messages received by this store
raft.rcvd.voteresp: Number of MsgVoteResp messages received by this store
raft.ticks: Number of Raft ticks queued
raftlog.behind: Number of Raft log entries followers on other stores are behind
raftlog.truncated: Number of Raft log entries truncated
range.adds: Number of range additions
range.raftleadertransfers: Number of raft leader transfers
range.removes: Number of range removals
range.snapshots.generated: Number of generated snapshots
range.snapshots.normal-applied: Number of applied snapshots
range.snapshots.preemptive-applied: Number of applied pre-emptive snapshots
range.splits: Number of range splits
ranges.unavailable: Number of ranges with fewer live replicas than needed for quorum
ranges.underreplicated: Number of ranges with fewer live replicas than the replication target
ranges: Number of ranges
rebalancing.writespersecond: Number of keys written (i.e., applied by raft) per second to the store, averaged over a large time period as used in rebalancing decisions
replicas.commandqueue.combinedqueuesize: Number of commands in all CommandQueues combined
replicas.commandqueue.combinedreadcount: Number of read-only commands in all CommandQueues combined
replicas.commandqueue.combinedwritecount: Number of read-write commands in all CommandQueues combined
replicas.commandqueue.maxoverlaps: Largest number of overlapping commands seen when adding to any CommandQueue
replicas.commandqueue.maxreadcount: Largest number of read-only commands in any CommandQueue
replicas.commandqueue.maxsize: Largest number of commands in any CommandQueue
replicas.commandqueue.maxtreesize: Largest number of intervals in any CommandQueue's interval tree
replicas.commandqueue.maxwritecount: Largest number of read-write commands in any CommandQueue
replicas.leaders_not_leaseholders: Number of replicas that are Raft leaders whose range lease is held by another store
replicas.leaders: Number of raft leaders
replicas.leaseholders: Number of lease holders
replicas.quiescent: Number of quiesced replicas
replicas.reserved: Number of replicas reserved for snapshots
replicas: Number of replicas
requests.backpressure.split: Number of backpressured writes waiting on a Range split
requests.slow.commandqueue: Number of requests that have been stuck for a long time in the command queue
requests.slow.distsender: Number of requests that have been stuck for a long time in the dist sender
requests.slow.lease: Number of requests that have been stuck for a long time acquiring a lease
requests.slow.raft: Number of requests that have been stuck for a long time in raft
rocksdb.block.cache.hits: Count of block cache hits
rocksdb.block.cache.misses: Count of block cache misses
rocksdb.block.cache.pinned-usage: Bytes pinned by the block cache
rocksdb.block.cache.usage: Bytes used by the block cache
rocksdb.bloom.filter.prefix.checked: Number of times the bloom filter was checked
rocksdb.bloom.filter.prefix.useful: Number of times the bloom filter helped avoid iterator creation
rocksdb.compactions: Number of table compactions
rocksdb.flushes: Number of table flushes
rocksdb.memtable.total-size: Current size of memtable in bytes
rocksdb.num-sstables: Number of rocksdb SSTables
rocksdb.read-amplification: Number of disk reads per query
rocksdb.table-readers-mem-estimate: Memory used by index and filter blocks
round-trip-latency: Distribution of round-trip latencies with other nodes in nanoseconds
security.certificate.expiration.ca: Expiration timestamp in seconds since Unix epoch for the CA certificate. 0 means no certificate or error.
security.certificate.expiration.node: Expiration timestamp in seconds since Unix epoch for the node certificate. 0 means no certificate or error.
sql.bytesin: Number of sql bytes received
sql.bytesout: Number of sql bytes sent
sql.conns: Number of active sql connections
sql.ddl.count: Number of SQL DDL statements
sql.delete.count: Number of SQL DELETE statements
sql.distsql.exec.latency: Latency in nanoseconds of DistSQL statement execution
sql.distsql.flows.active: Number of distributed SQL flows currently active
sql.distsql.flows.total: Number of distributed SQL flows executed
sql.distsql.queries.active: Number of distributed SQL queries currently active
sql.distsql.queries.total: Number of distributed SQL queries executed
sql.distsql.select.count: Number of DistSQL SELECT statements
sql.distsql.service.latency: Latency in nanoseconds of DistSQL request execution
sql.exec.latency: Latency in nanoseconds of SQL statement execution
sql.insert.count: Number of SQL INSERT statements
sql.mem.current: Current sql statement memory usage
sql.mem.distsql.current: Current sql statement memory usage for distsql
sql.mem.distsql.max: Memory usage per sql statement for distsql
sql.mem.max: Memory usage per sql statement
sql.mem.session.current: Current sql session memory usage
sql.mem.session.max: Memory usage per sql session
sql.mem.txn.current: Current sql transaction memory usage
sql.mem.txn.max: Memory usage per sql transaction
sql.misc.count: Number of other SQL statements
sql.query.count: Number of SQL queries
sql.select.count: Number of SQL SELECT statements
sql.service.latency: Latency in nanoseconds of SQL request execution
sql.txn.abort.count: Number of SQL transaction ABORT statements
sql.txn.begin.count: Number of SQL transaction BEGIN statements
sql.txn.commit.count: Number of SQL transaction COMMIT statements
sql.txn.rollback.count: Number of SQL transaction ROLLBACK statements
sql.update.count: Number of SQL UPDATE statements
sys.cgo.allocbytes: Current bytes of memory allocated by cgo
sys.cgo.totalbytes: Total bytes of memory allocated by cgo, but not released
sys.cgocalls: Total number of cgo call
sys.cpu.sys.ns: Total system cpu time in nanoseconds
sys.cpu.sys.percent: Current system cpu percentage
sys.cpu.user.ns: Total user cpu time in nanoseconds
sys.cpu.user.percent: Current user cpu percentage
sys.fd.open: Process open file descriptors
sys.fd.softlimit: Process open FD soft limit
sys.gc.count: Total number of GC runs
sys.gc.pause.ns: Total GC pause in nanoseconds
sys.gc.pause.percent: Current GC pause percentage
sys.go.allocbytes: Current bytes of memory allocated by go
sys.go.totalbytes: Total bytes of memory allocated by go, but not released
sys.goroutines: Current number of goroutines
sys.rss: Current process RSS
sys.uptime: Process uptime in seconds
sysbytes: Number of bytes in system KV pairs
syscount: Count of system KV pairs
timeseries.write.bytes: Total size in bytes of metric samples written to disk
timeseries.write.errors: Total errors encountered while attempting to write metrics to disk
timeseries.write.samples: Total number of metric samples written to disk
totalbytes: Total number of bytes taken up by keys and values including non-live data
tscache.skl.read.pages: Number of pages in the read timestamp cache
tscache.skl.read.rotations: Number of page rotations in the read timestamp cache
tscache.skl.write.pages: Number of pages in the write timestamp cache
tscache.skl.write.rotations: Number of page rotations in the write timestamp cache
txn.abandons: Number of abandoned KV transactions
txn.aborts: Number of aborted KV transactions
txn.autoretries: Number of automatic retries to avoid serializable restarts
txn.commits1PC: Number of committed one-phase KV transactions
txn.commits: Number of committed KV transactions (including 1PC)
txn.durations: KV transaction durations in nanoseconds
txn.restarts.deleterange: Number of restarts due to a forwarded commit timestamp and a DeleteRange command
txn.restarts.possiblereplay: Number of restarts due to possible replays of command batches at the storage layer
txn.restarts.serializable: Number of restarts due to a forwarded commit timestamp and isolation=SERIALIZABLE
txn.restarts.writetooold: Number of restarts due to a concurrent writer committing first
txn.restarts: Number of restarted KV transactions
valbytes: Number of bytes taken up by values
valcount: Count of all values
