max_n() functions

Introduction

Get the N largest values from a column.

The max_n() functions return the same values as the regular SQL query SELECT ... ORDER BY ... DESC LIMIT n. Unlike that query, they can be composed and combined like other aggregate hyperfunctions.

To get the N smallest values, use min_n(). To get the N largest values with accompanying data, use max_n_by().
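
To make the comparison concrete, the sketch below puts the plain SQL form next to the max_n() form. The readings table and its val column are hypothetical and exist only for illustration. The column is declared NOT NULL because PostgreSQL sorts NULLs first under ORDER BY ... DESC, which would otherwise change the plain SQL result.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE readings(
    val BIGINT NOT NULL
);

-- Plain SQL: the 5 largest values, one per row.
SELECT val
FROM readings
ORDER BY val DESC
LIMIT 5;

-- The same 5 values through the max_n() aggregate and the
-- into_values() accessor.
SELECT toolkit_experimental.into_values(
    toolkit_experimental.max_n(val, 5))
FROM readings;
```

Both queries should return the same five values. The difference is that the aggregate form produces an intermediate MaxN value that can be rolled up or read through a different accessor.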

Two-step aggregation

This group of functions uses the two-step aggregation pattern.

Rather than calculating the final result in one step, you first create an intermediate aggregate by using the aggregate function.

Then, use any of the accessors on the intermediate aggregate to calculate a final result. You can also roll up multiple intermediate aggregates with the rollup functions.

The two-step aggregation pattern has several advantages:

  1. More efficient because multiple accessors can reuse the same aggregate
  2. Easier to reason about performance, because aggregation is separate from final computation
  3. Easier to understand when calculations can be rolled up into larger intervals, especially in window functions and continuous aggregates
  4. Can perform retrospective analysis even when underlying data is dropped, because the intermediate aggregate stores extra information not available in the final result
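
The two steps described above can be sketched as follows. This is a minimal illustration, not a recommended schema: the sensor_readings table and its columns are hypothetical.

```sql
-- Hypothetical table, used only for this illustration.
CREATE TABLE sensor_readings(
    sensor_id INT,
    val DOUBLE PRECISION
);

-- Step 1: build one intermediate MaxN aggregate per sensor.
WITH agg AS (
    SELECT
        sensor_id,
        toolkit_experimental.max_n(val, 3) AS top3
    FROM sensor_readings
    GROUP BY sensor_id
)
-- Step 2: apply an accessor to the intermediate aggregate
-- to get the final result.
SELECT
    sensor_id,
    toolkit_experimental.into_array(top3) AS top3_values
FROM agg;
```

Because the aggregate in step 1 is a value in its own right, the same top3 column could also be passed to into_values() or rollup() without rescanning the underlying rows.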

To learn more, see the blog post on two-step aggregates.

Functions in this group

Warning

This function group includes some experimental functions. Experimental functions might change or be removed in future releases. We do not recommend using them in production. Experimental functions are marked with an Experimental tag.

Aggregate

max_n (Experimental): Find the largest values in a set of data

Accessor

into_array (Experimental): Returns an array of the highest values from a MaxN aggregate

into_values (Experimental): Returns the highest values from a MaxN aggregate

Rollup

rollup (Experimental): Combine multiple MaxN aggregates

Function details

max_n()

Experimental: Experimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss.

Introduced in Toolkit v1.12.0

    max_n(
        value BIGINT | DOUBLE PRECISION | TIMESTAMPTZ,
        capacity BIGINT
    ) MaxN

Construct an aggregate that keeps track of the largest values passed through it.

Required arguments

| Name | Type | Description |
|---|---|---|
| value | BIGINT, DOUBLE PRECISION, TIMESTAMPTZ | The values passed into the aggregate |
| capacity | BIGINT | The number of values to retain |

Returns

| Column | Type | Description |
|---|---|---|
| max_n | MaxN | The compiled aggregate. The exact type is MaxInts, MaxFloats, or MaxTimes, depending on the input type |

into_array()

Experimental

Introduced in Toolkit v1.12.0

    into_array(
        agg MaxN
    ) BIGINT[] | DOUBLE PRECISION[] | TIMESTAMPTZ[]

Return the N largest values seen by the aggregate. The values are formatted as an array in decreasing order.

Required arguments

| Name | Type | Description |
|---|---|---|
| agg | MaxN | The aggregate to return the results from. The exact type varies with the type of data stored in the aggregate |

Returns

| Column | Type | Description |
|---|---|---|
| into_array | BIGINT[], DOUBLE PRECISION[], TIMESTAMPTZ[] | The largest values seen while creating this aggregate |

Examples

Find the top 5 values from i * 13 % 10007 for i = 1 to 10000:

    SELECT toolkit_experimental.into_array(
        toolkit_experimental.max_n(sub.val, 5))
    FROM (
        SELECT (i * 13) % 10007 AS val
        FROM generate_series(1,10000) as i
    ) sub;

    into_array
    ---------------------------------
    {10006,10005,10004,10003,10002}

into_values()

Experimental

Introduced in Toolkit v1.12.0

    into_values(
        agg MaxN
    ) SETOF BIGINT | SETOF DOUBLE PRECISION | SETOF TIMESTAMPTZ

Return the N largest values seen by the aggregate.

Required arguments

| Name | Type | Description |
|---|---|---|
| agg | MaxN | The aggregate to return the results from. The exact type varies with the type of data stored in the aggregate |

Returns

| Column | Type | Description |
|---|---|---|
| into_values | SETOF BIGINT, SETOF DOUBLE PRECISION, SETOF TIMESTAMPTZ | The largest values seen while creating this aggregate |

Examples

Find the top 5 values from i * 13 % 10007 for i = 1 to 10000:

    SELECT toolkit_experimental.into_values(
        toolkit_experimental.max_n(sub.val, 5))
    FROM (
        SELECT (i * 13) % 10007 AS val
        FROM generate_series(1,10000) as i
    ) sub;

    into_values
    -------------
    10006
    10005
    10004
    10003
    10002

rollup()

Experimental

Introduced in Toolkit v1.12.0

    rollup(
        agg MaxN
    ) MaxN

Combine multiple MaxN aggregates created by max_n() into a single aggregate. Used together with an accessor, it returns the largest values found across all the aggregated data.

Required arguments

| Name | Type | Description |
|---|---|---|
| agg | MaxN | The aggregates being combined |

Returns

| Column | Type | Description |
|---|---|---|
| rollup | MaxN | An aggregate over all of the contributing values |
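
Examples

As a minimal sketch of how rollup() might be used, the query below reuses the i * 13 % 10007 series from the accessor examples. It first builds one MaxN aggregate per block of 1000 consecutive values of i, then merges those aggregates. The bucket and bucket_max names are illustrative only, and the explicit BIGINT cast is an assumption made to pin the integer variant of max_n().

```sql
-- Step 1: one MaxN aggregate per block of 1000 consecutive values of i.
WITH buckets AS (
    SELECT
        (i - 1) / 1000 AS bucket,
        toolkit_experimental.max_n(((i * 13) % 10007)::BIGINT, 5) AS bucket_max
    FROM generate_series(1, 10000) AS i
    GROUP BY (i - 1) / 1000
)
-- Step 2: merge the per-block aggregates, then read the overall top 5.
SELECT toolkit_experimental.into_array(
    toolkit_experimental.rollup(bucket_max))
FROM buckets;
```

Each per-block aggregate keeps its own top 5, so the merged aggregate has enough information to yield the top 5 overall. The result should match what a single max_n() over the whole series returns.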

Extended examples

Get the 10 largest transactions from a table of stock trades

This example assumes that you have a table of stock trades in this format:

    CREATE TABLE stock_sales(
        ts TIMESTAMPTZ,
        symbol TEXT,
        price FLOAT,
        volume INT
    );

You can query for the 10 largest transactions each day:

    WITH t as (
        SELECT
            time_bucket('1 day'::interval, ts) as day,
            toolkit_experimental.max_n(price * volume, 10) AS daily_max
        FROM stock_sales
        GROUP BY time_bucket('1 day'::interval, ts)
    )
    SELECT
        day, toolkit_experimental.into_array(daily_max)
    FROM t;
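
Because the daily result is a MaxN aggregate rather than a final value, it can in principle be rolled up to a coarser interval. The sketch below assumes the stock_sales table above and combines the daily aggregates into weekly ones with rollup(). The daily, week, and weekly_max names are illustrative only.

```sql
-- Step 1: one MaxN aggregate per day, as in the query above.
WITH daily AS (
    SELECT
        time_bucket('1 day'::interval, ts) AS day,
        toolkit_experimental.max_n(price * volume, 10) AS daily_max
    FROM stock_sales
    GROUP BY time_bucket('1 day'::interval, ts)
)
-- Step 2: merge the daily aggregates into 7-day buckets and
-- read the 10 largest transactions of each bucket.
SELECT
    time_bucket('7 days'::interval, day) AS week,
    toolkit_experimental.into_array(
        toolkit_experimental.rollup(daily_max)) AS weekly_max
FROM daily
GROUP BY time_bucket('7 days'::interval, day);
```

The weekly query reads only the daily aggregates, not the raw trades, which is the kind of reuse the two-step pattern is meant to allow.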