max_n_by() functions

Introduction

Get the N largest values from a column, with an associated piece of data per value. For example, you can return an accompanying column, or the full row.

The max_n_by() functions give the same results as the regular SQL query SELECT ... ORDER BY ... DESC LIMIT n. Unlike the plain SQL query, however, they can be composed and combined like other aggregate hyperfunctions.

To get the N smallest values with accompanying data, use min_n_by(). To get the N largest values without accompanying data, use max_n().
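
As a rough illustration of that equivalence, the sketch below asks the same top-5 question both ways. The readings table and its columns are hypothetical, introduced only for this example. Which rows are kept among tied values may differ between the two forms.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE readings(
    ts TIMESTAMPTZ,
    device_id TEXT,
    reading DOUBLE PRECISION
);

-- Plain SQL: the 5 largest readings and the device that reported each one.
SELECT reading, device_id
FROM readings
ORDER BY reading DESC
LIMIT 5;

-- The same question asked with max_n_by() and into_values().
-- NULL::TEXT tells PostgreSQL the type of the associated data (device_id).
SELECT value AS reading, data AS device_id
FROM toolkit_experimental.into_values(
    (SELECT toolkit_experimental.max_n_by(reading, device_id, 5) FROM readings),
    NULL::TEXT);
```

The second form is longer for a one-off query, but the intermediate aggregate it builds can be stored and combined later, which the ORDER BY ... LIMIT form cannot do.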

Two-step aggregation

This group of functions uses the two-step aggregation pattern.

Rather than calculating the final result in one step, you first create an intermediate aggregate by using the aggregate function.

Then, use any of the accessors on the intermediate aggregate to calculate a final result. You can also roll up multiple intermediate aggregates with the rollup functions.

The two-step aggregation pattern has several advantages:

  1. More efficient because multiple accessors can reuse the same aggregate
  2. Easier to reason about performance, because aggregation is separate from final computation
  3. Easier to understand when calculations can be rolled up into larger intervals, especially in window functions and continuous aggregates
  4. Can perform retrospective analysis even when underlying data is dropped, because the intermediate aggregate stores extra information not available in the final result

To learn more, see the blog post on two-step aggregates.
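
The two steps can be sketched as follows, assuming a hypothetical table readings(ts TIMESTAMPTZ, device_id TEXT, reading DOUBLE PRECISION) that is not part of this documentation. The first step builds one intermediate aggregate per day. The second step applies the accessor to each of them.

```sql
-- Assumes a hypothetical table:
--   readings(ts TIMESTAMPTZ, device_id TEXT, reading DOUBLE PRECISION)

-- Step 1: build one intermediate MaxNBy aggregate per day.
WITH daily AS (
    SELECT
        date_trunc('day', ts) AS day,
        toolkit_experimental.max_n_by(reading, device_id, 3) AS top3
    FROM readings
    GROUP BY date_trunc('day', ts)
)
-- Step 2: apply the accessor to each intermediate aggregate.
SELECT d.day, v.value AS reading, v.data AS device_id
FROM daily AS d,
     toolkit_experimental.into_values(d.top3, NULL::TEXT) AS v
ORDER BY d.day, v.value DESC;
```

Because the aggregation happens only in the first step, the same daily aggregates could also be passed to rollup() without rescanning the raw rows.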

Functions in this group

Warning: This function group includes some experimental functions. Experimental functions might change or be removed in future releases. We do not recommend using them in production. Experimental functions are marked with an Experimental tag.

Aggregate

max_n_by

Experimental: Track the largest values and associated data in a set of values

Accessor

into_values

Experimental: Returns the highest values and associated data from a MaxNBy aggregate

Rollup

rollup

Experimental: Combine multiple MaxNBy aggregates

Function details

max_n_by()

Experimental: Experimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss.

Introduced in Toolkit v1.12.0

    max_n_by(
        value BIGINT | DOUBLE PRECISION | TIMESTAMPTZ,
        data ANYELEMENT,
        capacity BIGINT
    ) MaxNBy

Construct an aggregate that keeps track of the largest values passed through it, as well as some associated data which is passed alongside the value.

Required arguments

| Name | Type | Description |
|---|---|---|
| value | BIGINT, DOUBLE PRECISION, TIMESTAMPTZ | The values passed into the aggregate |
| data | ANYELEMENT | The data associated with a particular value |
| capacity | BIGINT | The number of values to retain |

Returns

| Column | Type | Description |
|---|---|---|
| max_n_by | MaxNBy | The compiled aggregate. Note that the exact type will be MaxByInts, MaxByFloats, or MaxByTimes depending on the input type |

into_values()

Experimental: Experimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss.

Introduced in Toolkit v1.12.0

    into_values(
        agg MaxNBy,
        dummy ANYELEMENT
    ) TABLE (
        value BIGINT | DOUBLE PRECISION | TIMESTAMPTZ,
        data ANYELEMENT
    )

Return the largest values seen by the aggregate, together with the data associated with each of them. Note that PostgreSQL requires an input argument whose type matches the associated data in order to determine the response type.

Required arguments

| Name | Type | Description |
|---|---|---|
| agg | MaxNBy | The aggregate to return the results from. Note that the exact type here varies based on the type of data stored. |
| dummy | ANYELEMENT | This is purely to inform PostgreSQL of the response type. A NULL cast to the appropriate type is typical. |

Returns

| Column | Type | Description |
|---|---|---|
| into_values | TABLE (value BIGINT, DOUBLE PRECISION, or TIMESTAMPTZ; data ANYELEMENT) | The largest values and associated data seen while creating this aggregate. |

Examples

Find the top 5 values of (i * 13) % 10007 for i from 1 to 10000, along with the integer quotient (i * 13) / 10007 that accompanies each remainder:

    SELECT toolkit_experimental.into_values(
        toolkit_experimental.max_n_by(sub.mod, sub.div, 5),
        NULL::INT)
    FROM (
        SELECT (i * 13) % 10007 AS mod, (i * 13) / 10007 AS div
        FROM generate_series(1,10000) as i
    ) sub;

     into_values
    -------------
     (10006,3)
     (10005,7)
     (10004,11)
     (10003,2)
     (10002,6)

rollup()

Experimental: Experimental TimescaleDB Toolkit functions are not suitable for production environments. They may have bugs and may cause data loss.

Introduced in Toolkit v1.12.0

    rollup(
        agg MaxNBy
    ) MaxNBy

This aggregate combines the aggregates generated by other max_n_by() calls and returns the maximum values, with their associated data, found across all the aggregated data.

Required arguments

| Name | Type | Description |
|---|---|---|
| agg | MaxNBy | The aggregates being combined |

Returns

| Column | Type | Description |
|---|---|---|
| rollup | MaxNBy | An aggregate over all of the contributing values. |

Extended examples

This example assumes that you have a table of stock trades in this format:

    CREATE TABLE stock_sales(
        ts TIMESTAMPTZ,
        symbol TEXT,
        price FLOAT,
        volume INT
    );

Find the 10 largest transactions in the table, what time they occurred, and what symbol was being traded:

    SELECT
        (data).ts,       -- the timestamp column of stock_sales is named ts
        (data).symbol,
        value AS transaction
    FROM
        toolkit_experimental.into_values((
            -- Keep the whole row as the associated data
            SELECT toolkit_experimental.max_n_by(price * volume, stock_sales, 10)
            FROM stock_sales
        ),
        NULL::stock_sales);
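
As a further sketch over the same stock_sales table, rollup() can combine per-day aggregates into a single top-10 list. The daily CTE, the top_trades column name, and the use of date_trunc for daily bucketing are assumptions made for illustration. The per-day aggregates could equally come from a continuous aggregate.

```sql
-- Build one MaxNBy aggregate per day, keeping the symbol as associated data.
WITH daily AS (
    SELECT
        date_trunc('day', ts) AS day,
        toolkit_experimental.max_n_by(price * volume, symbol, 10) AS top_trades
    FROM stock_sales
    GROUP BY date_trunc('day', ts)
)
-- Combine the daily aggregates, then read out the overall top 10.
SELECT value AS transaction, data AS symbol
FROM toolkit_experimental.into_values(
    (SELECT toolkit_experimental.rollup(top_trades) FROM daily),
    NULL::TEXT);
```

Each daily aggregate retains that day's 10 largest transactions, so the rolled-up result should match a top-10 query over the whole table, apart from possible differences among tied values.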