SQL - SAMPLE BY - 《QuestDB v6.0 Documentation》

Syntax
Sample calculation
Examples

title	sidebar_label	description
SAMPLE BY keyword	SAMPLE BY	SAMPLE BY SQL keyword reference documentation.

SAMPLE BY is used on time series data to summarize large datasets into aggregates of homogeneous time chunks as part of a SELECT statement. Users performing SAMPLE BY queries on datasets with missing data may make use of the FILL keyword to specify a fill behavior.

:::note

To use SAMPLE BY, one column needs to be designated as timestamp. Find out more in the designated timestamp section.

:::

Syntax

Where SAMPLE_SIZE is the unit of time by which you wish to aggregate your results, and n is the number of time chunks that will be summarized together.

Sample calculation

The time range of each group sampled by time is an absolute value, in other words, sampling by one day is a 24 hour range which is not bound to actual calendar dates.

Considering the following example which samples from a sensors table over the last 24 hours:

SELECT ts, count()
FROM sensors
WHERE ts > dateadd('d', -1, now())
SAMPLE BY 1d

The WHERE clause has narrowed down results to those which have a timestamp greater than 24 hours from now. If the table has rows for sensor readings ingested yesterday and today, this query will return two rows. The 24 hour range for the sampled group starts at the first-returned timestamp:

ts	count
2021-02-03T00:32:35.000000Z	1000
2021-02-04T00:32:35.000000Z	200

Examples

Assume the following table

ts	buysell	quantity	price
ts1	B	q1	p1
ts2	S	q2	p2
ts3	S	q3	p3
…	…	…	…
tsn	B	qn	pn

The following will return the number of trades per hour:

SELECT ts, count()
FROM TRADES
SAMPLE BY 1h;

The following will return the trade volume in 30 minute intervals

SELECT ts, sum(quantity*price)
FROM TRADES
SAMPLE BY 30m;

The following will return the average trade notional (where notional is = q * p) by day:

SELECT ts, avg(quantity*price)
FROM TRADES
SAMPLE BY 1d;