heartbeat_agg() functions

Introduction

Given a series of timestamped heartbeats and a liveness interval, determine the overall liveness of a system. This aggregate can be used to report total uptime or downtime as well as report the time ranges where the system was live or dead.

It’s also possible to combine multiple heartbeat aggregates to determine the overall health of a service. For example, the heartbeat aggregates from a primary and standby server could be combine to see if there was ever a window where both machines were down at the same time.

Two-step aggregation

Hide content

This group of functions uses the two-step aggregation pattern.

Rather than calculating the final result in one step, you first create an intermediate aggregate by using the aggregate function.

Then, use any of the accessors on the intermediate aggregate to calculate a final result. You can also roll up multiple intermediate aggregates with the rollup functions.

The two-step aggregation pattern has several advantages:

  1. More efficient because multiple accessors can reuse the same aggregate
  2. Easier to reason about performance, because aggregation is separate from final computation
  3. Easier to understand when calculations can be rolled up into larger intervals, especially in window functions and continuous aggregates
  4. Can perform retrospective analysis even when underlying data is dropped, because the intermediate aggregate stores extra information not available in the final result

To learn more, see the blog post on two-step aggregates.

Functions in this group

warning

This function group includes some experimental functions. Experimental functions might change or be removed in future releases. We do not recommend using them in production. Experimental functions are marked with an Experimental tag.

Aggregate

heartbeat_agg

ExperimentalCreate a liveness aggregate from a set of heartbeats

Accessor

dead_ranges

ExperimentalGet the down intervals from a heartbeat_agg

downtime

ExperimentalGet the total time dead during a heartbeat aggregate

interpolate

ExperimentalAdjust a heartbeat aggregate with predecessor information

interpolated_downtime

ExperimentalGet the total time dead from a heartbeat aggregate and predecessor

interpolated_uptime

ExperimentalGet the total time live from a heartbeat aggregate and predecessor

live_at

ExperimentalTest if the aggregate has a heartbeat covering a given time

live_ranges

ExperimentalGet the live intervals from a heartbeat_agg

uptime

ExperimentalGet the total time live during a heartbeat aggregate

Rollup

rollup

ExperimentalCombine multiple heartbeat aggregates

Function details

heartbeat_agg()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. heartbeat_agg(

`

  1. heartbeat TIMESTAMPTZ,
  2. agg_start TIMESTAMPTZ,
  3. agg_duration INTERVAL,
  4. heartbeat_liveness INTERVAL

`

  1. ) RETURNS HeartbeatAgg

`

This takes a set of heartbeat timestamps and aggregates the liveness state of the underlying system for the specified time range.

Required arguments

NameTypeDescription
heartbeatTIMESTAMPTZThe column containing the timestamps of the heartbeats.
agg_startTIMESTAMPTZThe start of the time range over which this aggregate is tracking liveness.
agg_durationINTERVALThe length of the time range over which this aggregate is tracking liveness. Any point in this range that doesn’t closely follow a heartbeat is considered to be dead.
heartbeat_livenessINTERVALHow long the system is considered to be live after each heartbeat.

Returns

ColumnTypeDescription
heartbeat_aggHeartbeatAggThe liveness data for the heartbeated system over the provided interval.

Examples

Given a table called system_health with a ping_time column, construct an aggregate of system liveness for 10 days starting from Jan 1, 2022. This assumes a system is unhealthy if it hasn’t been heard from in a 5 minute window:

  1. SELECT heartbeat_agg(
  2. ping_time,
  3. '01-01-2022 UTC',
  4. '10 days',
  5. '5 min')
  6. FROM system_health;

dead_ranges()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. dead_ranges(

`

  1. agg HEARTBEATAGG

`

  1. ) RETURNS TABLE (

`

  1. start TIMESTAMPTZ,
  2. end TIMESTAMPTZ

`

  1. )

`

Given a heartbeat aggregate, this will return a set of (starttime, endtime) pairs representing when the underlying system did not have a valid heartbeat during the interval of the aggregate.

Required arguments

NameTypeDescription
aggHeartbeatAggA heartbeat aggregate to get the liveness data from.

Returns

ColumnTypeDescription
dead_rangesTABLE (start TIMESTAMPTZ, end TIMESTAMPTZ)The (start, end) pairs of when the system was down.

Examples

Given a table called liveness containing weekly heartbeat aggregates in column health with timestamp column date, we can use the following to get the intervals where the system was down during the week of Jan 9, 2022:

  1. SELECT dead_ranges(health)
  2. FROM liveness
  3. WHERE date = '01-9-2022 UTC'
  4. dead_ranges
  5. -----------------------------------------------------
  6. ("2022-01-09 00:00:00+00","2022-01-09 00:00:30+00")
  7. ("2022-01-12 15:27:22+00","2022-01-12 15:31:17+00")

downtime()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. downtime(

`

  1. agg HEARTBEATAGG

`

  1. ) RETURNS INTERVAL

`

Given a heartbeat aggregate, this will sum all the ranges where the system did not have a recent enough heartbeat.

There may appear to be some downtime between the start of the aggregate and the first heartbeat. If there is a heartbeat aggregage covering the previous period, you can use its last heartbeat to correct for this using interpolated_downtime().

Required arguments

NameTypeDescription
aggHeartbeatAggA heartbeat aggregate to get the liveness data from.

Returns

ColumnTypeDescription
downtimeINTERVALThe sum of all the dead ranges in the aggregate.

Examples

Given a table called liveness containing weekly heartbeat aggregates in column health with timestamp column date, we can use the following to get the total downtime of the system during the week of Jan 9, 2022:

  1. SELECT downtime(health)
  2. FROM liveness
  3. WHERE date = '01-9-2022 UTC'
  4. downtime
  5. --------
  6. 00:04:25

interpolate()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. interpolate(

`

  1. agg HEARTBEATAGG,
  2. pred HEARTBEATAGG

`

  1. ) RETURNS HEARTBEATAGG

`

This takes a heartbeat aggregate and the aggregate immediately preceding it. It updates the aggregate to include any live ranges that should have been carried over from the last heartbeat in the predecessor, even if there aren’t heartbeats for that range in the interval covered by this aggregate. It returns the updated aggregate, which can then be used with any of the heartbeat aggregate accessors.

Required arguments

NameTypeDescription
aggHeartbeatAggA heartbeat aggregate containing liveness data for a particular interval.

Optional arguments

NameTypeDescription
predHeartbeatAggThe heartbeat aggregate for the preceding interval, if one exists.

Returns

ColumnTypeDescription
interpolateHeartbeatAggA copy of agg which has been update to include any heartbeat intervals extending past the end of pred.

Examples

Given a table called liveness containing weekly heartbeat aggregates in column health with timestamp column date, we can use the following to get the intervals where the system was unhealthy during the week of Jan 9, 2022. This correctly excludes any ranges covered by a heartbeat at the end of the Jan 2 week:

  1. SELECT dead_ranges(
  2. interpolate(
  3. health,
  4. LAG(health) OVER (ORDER BY date)
  5. )
  6. )
  7. FROM liveness
  8. WHERE date = '01-9-2022 UTC'
  9. dead_ranges
  10. -----------------------------------------------------
  11. ("2022-01-12 15:27:22+00","2022-01-12 15:31:17+00")

interpolated_downtime()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. interpolated_downtime(

`

  1. agg HEARTBEATAGG,
  2. pred HEARTBEATAGG

`

  1. ) RETURNS INTERVAL

`

This behaves very similarly to downtime(), but it also takes the heartbeat aggregate from the preceding interval. It checks when the last heartbeat in the predecessor was received and makes sure not to consider the heartbeat interval after that time as unhealthy, even if it extends into the current aggregate prior to the first heartbeat.

Required arguments

NameTypeDescription
aggHeartbeatAggA heartbeat aggregate to get the liveness data from.

Optional arguments

NameTypeDescription
predHeartbeatAggThe heartbeat aggregate for the interval before the one being measured, if one exists.

Returns

ColumnTypeDescription
interpolated_downtimeINTERVALThe sum of all the unhealthy ranges in the aggregate, excluding those covered by the last heartbeat of the previous interval.

Examples

Given a table called liveness containing weekly heartbeat aggregates in column health with timestamp column date, you can use this command to get the total interpolated downtime of the system during the week of Jan 9, 2022:

  1. SELECT interpolated_downtime(
  2. health,
  3. LAG(health) OVER (ORDER BY date)
  4. )
  5. FROM liveness
  6. WHERE date = '01-9-2022 UTC'
  7. interpolated_downtime
  8. ---------------------
  9. 00:03:55

interpolated_uptime()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. interpolated_uptime(

`

  1. agg HEARTBEATAGG,
  2. pred HEARTBEATAGG

`

  1. ) RETURNS INTERVAL

`

This behaves very similarly to uptime(), but it also takes the heartbeat aggregate from the preceding interval. It checks when the last heartbeat in the predecessor was received and makes sure that the entire heartbeat interval after that is considered live. This addresses the issue where uptime would consider the interval between the start of the interval and the first heartbeat as dead.

Required arguments

NameTypeDescription
aggHeartbeatAggA heartbeat aggregate to get the liveness data from.

Optional arguments

NameTypeDescription
predHeartbeatAggThe heartbeat aggregate for the interval before the one being measured, if one exists.

Returns

ColumnTypeDescription
interpolated_uptimeINTERVALThe sum of all the live ranges in the aggregate, including those covered by the last heartbeat of the previous interval.

Examples

Given a table called liveness containing weekly heartbeat aggregates in column health with timestamp column date, you can use this command to get the total interpolated uptime of the system during the week of Jan 9, 2022:

  1. SELECT interpolated_uptime(
  2. health,
  3. LAG(health) OVER (ORDER BY date)
  4. )
  5. FROM liveness
  6. WHERE date = '01-9-2022 UTC'
  7. interpolated_uptime
  8. -------------------
  9. 6 days 23:56:05

live_at()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. live_at(

`

  1. agg HEARTBEATAGG,
  2. test TIMESTAMPTZ

`

  1. ) RETURNS BOOL

`

Given a heartbeat aggregate and a timestamp, this returns whether the aggregate has a heartbeat indicating the system was live at the given time.

Note that this returns false for any time not covered by the aggregate.

Required arguments

NameTypeDescription
aggHeartbeatAggA heartbeat aggregate to get the liveness data from.
testTimestampTzThe time to test the liveness of.

Returns

ColumnTypeDescription
live_atboolTrue if the heartbeat aggregate had a heartbeat close before the test time.

Examples

Given a table called liveness containing weekly heartbeat aggregates in column health with timestamp column date, we can use the following to see if the sytem was live at a particular time:

  1. SELECT live_at(health, '2022-01-12 15:30:00+00')
  2. FROM liveness
  3. WHERE date = '01-9-2022 UTC'
  4. live_at
  5. ---------
  6. f

live_ranges()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. live_ranges(

`

  1. agg HEARTBEATAGG

`

  1. ) RETURNS TABLE (

`

  1. start TIMESTAMPTZ,
  2. end TIMESTAMPTZ

`

  1. )

`

Given a heartbeat aggregate, this returns a set of (starttime, endtime) pairs representing when the underlying system was live during the interval of the aggregate.

Required arguments

NameTypeDescription
aggHeartbeatAggA heartbeat aggregate to get the liveness data from.

Returns

ColumnTypeDescription
live_rangesTABLE (start TIMESTAMPTZ, end TIMESTAMPTZ)The (start, end) pairs of when the system was live.

Examples

Given a table called liveness containing weekly heartbeat aggregates in column health with timestamp column date, we can use the following to get the intervals where the system was live during the week of Jan 9, 2022:

  1. SELECT live_ranges(health)
  2. FROM liveness
  3. WHERE date = '01-9-2022 UTC'
  4. live_ranges
  5. -----------------------------------------------------
  6. ("2022-01-09 00:00:30+00","2022-01-12 15:27:22+00")
  7. ("2022-01-12 15:31:17+00","2022-01-16 00:00:00+00")

uptime()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. uptime(

`

  1. agg HEARTBEATAGG

`

  1. ) RETURNS INTERVAL

`

Given a heartbeat aggregate, this sums all the ranges where the system was live and returns the total.

There may appear to be some downtime between the start of the aggregate and the first heartbeat. If there is a heartbeat aggregage covering the previous period, you can use its last heartbeat to correct for this using interpolated_uptime().

Required arguments

NameTypeDescription
aggHeartbeatAggA heartbeat aggregate to get the liveness data from.

Returns

ColumnTypeDescription
uptimeINTERVALThe sum of all the live ranges in the aggregate.

Examples

Given a table called liveness containing weekly heartbeat aggregates in column health with timestamp column date, you can use this command to get the total uptime of the system during the week of Jan 9, 2022:

  1. SELECT uptime(health)
  2. FROM liveness
  3. WHERE date = '01-9-2022 UTC'
  4. uptime
  5. -----------------
  6. 6 days 23:55:35

rollup()

ExperimentalExperimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss. Click to learn more.

Introduced in Toolkit v1.13.0

Hide content

`

  1. rollup(

`

  1. heartbeatagg HEARTBEATAGG

`

  1. ) RETURNS HEARTBEATAGG

`

This combines multiple heartbeat aggregates into one. This can be used to combine aggregates into adjacent intervals into one larger interval, such as rolling daily aggregates into a weekly or monthly aggregate.

Another use for this is to combine heartbeat aggregates for redundant systems to determine if there were any overlapping failures. For instance, a master and standby system can have their heartbeats combined to see if there were any intervals where both systems were down at the same time. The result of rolling overlapping heartbeats together like this is a heartbeat aggregate which considers a time live if any of its component aggregates were live.

Required arguments

NameTypeDescription
heartbeataggHeartbeatAggThe heartbeat aggregates to roll up.

Returns

ColumnTypeDescription
rollupHeartbeatAggA heartbeat aggregate covering the interval from the earliest start time of its component aggregates to the latest end time. It combines the live ranges of all the components.