Metrics

Metrics related to vreplication functionality

VReplication exports several metrics using the expvars interface. These are available at the /debug/vars endpoint of vttablet’s http status pages. More details can be found here.

Target Tablet Metrics

VReplicationCopyLoopCount, VReplicationCopyLoopCountTotal

During the copy phase we run one loop of bulk copy for approximately an hour at a time (by default) before running catchup. VReplicationCopyLoopCount counts the number of times this loop has run for each stream and VReplicationCopyLoopCountTotal the total across all streams.

VReplicationCopyRowCount, VReplicationCopyRowCountTotal

VReplicationCopyRowCount counts the number of rows copied during the copy phase per stream and VReplicationCopyRowCountTotal the total across all streams.

VReplicationErrors

VReplicationErrors counts the number of times errors occurred during vreplication. Errors are keyed by the type of error.

VReplicationHeartbeat

VReplicationHeartbeat records, for each stream, the timestamp sent by the last heartbeat event for that stream.

VReplicationMessages

VReplicationMessages contains a stack of the last N (currently 3) messages of a vreplication stream.

VReplicationPhaseTimings, VReplicationPhaseTimingsCounts, VReplicationPhaseTimingsTotal

This metric relates to the times each phase is run during the lifetime of a stream. VReplicationPhaseTimings counts the total time taken by the runs, VReplicationPhaseTimingsCounts the number of runs and VReplicationPhaseTimingsTotals the total runs across all streams.

VReplicationTableCopyRowCounts

VReplicationTableCopyRowCounts counts the number of rows copied during the copy phase per table per stream.

VReplicationTableCopyTimings

VReplicationTableCopyTimings counts the time taken per table per stream during the copy phase of the stream. Unlike VReplicationPhaseTimings, this metric updates continuously, rather than being set once at the end of the copy phase.

VReplicationQPS

VReplicationQPS is a list of QPS values for each loop of each phase of the workflow.

VReplicationQueryCount, VReplicationQueryCountTotal

VReplicationQueryCount is the total number of queries in each phase of a workflow. VReplicationQueryCountTotal is the total queries across all phases and workflows.

VReplicationLagSeconds, VReplicationLagSecondsMax, VReplicationLagSecondsTotal

These metrics show the replication lag of the target stream with respect to the source stream. VReplicationLagSeconds shows the current replication lag and VReplicationLagSecondsMax has the maximum lag in this stream. Note that these values are only valid during the replication phase of a workflow.

VReplicationSource

Shows the keyspace and shard of the source from which this target stream is replicating

VReplicationSourceTablet

Shows the tablet from which this stream is currently replicating

VReplicationStreamCount

The number of streams running on this target

VReplicationStreamState

This shows the state of each stream.

Source Tablet Metrics

VStreamPacketSize

The value of the vstream_packet_size flag specified for this tablet

VStreamerCount

The current number of running vstreamers

VStreamerErrors

The number of errors per category across workflows

VStreamersEndedWithErrors

The total number of errors that caused a stream to stall

VStreamerEventsStreamed

The total number of events streamed by this vttablet across all workflows

VStreamerCompressedTransactionsDecoded

The total number of compressed transactions (MySQL’s binlog_transaction_compression=ON) decoded by this vttablet across all workflows

VStreamerNumPackets

The total number of packets sent by this vttablet across all workflows

VStreamersCreated

The total number of vstreamers created during the lifetime of this tablet


Example

A snippet from tablet 200 from the local example after running the MoveTables step

  1. "VReplicationCopyLoopCount": {"commerce.0.commerce2customer.1": 2},
  2. "VReplicationCopyLoopCountTotal": 2,
  3. "VReplicationCopyRowCount": {"commerce.0.commerce2customer.1": 10},
  4. "VReplicationCopyRowCountTotal": 10,
  5. "VReplicationErrors": {},
  6. "VReplicationHeartbeat": {"commerce.0.commerce2customer.1": 1618681048},
  7. "VReplicationMessages": {"1": "2021-04-17T19:36:13.003858838+02:00:Picked source tablet: cell:\"zone1\" uid:100 "},
  8. "VReplicationPhaseTimings": {"commerce.0.commerce2customer.1.catchup": 1000935083, "commerce.0.commerce2customer.1.fastforward": 15349583, "commerce.0.commerce2customer.1.copy": 63353125},
  9. "VReplicationPhaseTimingsCounts": {"commerce.0.commerce2customer.1.copy": 2, "commerce.0.commerce2customer.1.All": 6, "commerce.0.commerce2customer.1.catchup": 2, "commerce.0.commerce2customer.1.fastforward": 2},
  10. "VReplicationPhaseTimingsTotal": 1079637791,
  11. "VReplicationQPS": {"All":[11.8,1,1.2,1.2,1,1.2,1,1.2,1,1.2,1,1.2,1.2,1,1.2,1,1.2],"Query":[11.2,1,1.2,1.2,1,1.2,1,1.2,1,1.2,1,1.2,1.2,1,1.2,1,1.2],"Transaction":[0.6,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]},
  12. "VReplicationQueryCount": {"commerce.0.commerce2customer.1.copy": 2},
  13. "VReplicationQueryCountTotal": 2,
  14. "VReplicationLagSeconds": {"commerce.0.commerce2customer.1": 0},
  15. "VReplicationLagSecondsMax": 0,
  16. "VReplicationLagSecondsTotal": 0,
  17. "VReplicationSource": {"1": "commerce/0"},
  18. "VReplicationSourceTablet": {"1": "cell:\"zone1\" uid:100 "},
  19. "VReplicationStreamCount": 1,
  20. "VReplicationStreamState": {"commerce2customer.1": "Running"},
  21. "VReplicationTableCopyRowCounts": {"commerce.0.commerce2customer.1.corder": 4, "commerce.0.commerce2customer.1.customer": 2},
  22. "VReplicationTableCopyTimings": {"commerce.0.commerce2customer.1.customer": 6707583, "commerce.0.commerce2customer.1.corder": 13254250},
  23. "VStreamPacketSize": 250000,
  24. "VStreamerCount": 0,
  25. "VStreamerErrors": {"Catchup": 0, "Copy": 0, "Send": 0, "TablePlan": 0},
  26. "VStreamerEventsStreamed": 0,
  27. "VStreamerNumPackets": 0,
  28. "VStreamerPhaseTiming": {"TotalCount":0,"TotalTime":0,"Histograms":{}},
  29. "VStreamersCreated": 0,
  30. "VStreamersEndedWithErrors": 0,

VTGate Metrics

VStreamsCreated

The total number of vstreams created during the lifetime of this vtgate.

VStreamsLag

The difference in seconds between the current time when the vstream event was sent and the time when the binlog event occurred.