Create candlestick aggregates

Turning raw, real-time tick data into aggregated candlestick views is a common task for users who work with financial data. If your data is not tick data, for example if you receive it in an already aggregated form such as 1-min buckets, you can still use these functions to help you create additional aggregates of your data into larger buckets, such as 1-hour or 1-day buckets. If you want to work with pre-aggregated stock and crypto data, see the Analyzing Intraday Stock Data tutorial for more examples.

TimescaleDB includes hyperfunctions that you can use to store and query your financial data more easily. Hyperfunctions are SQL functions within TimescaleDB that make it easier to manipulate and analyze time-series data in PostgreSQL with fewer lines of code. There are three hyperfunctions that are essential for calculating candlestick values: time_bucket(), FIRST(), and LAST().

The time_bucket() hyperfunction helps you aggregate records into buckets of arbitrary time intervals based on the timestamp value. FIRST() and LAST() help you calculate the opening and closing prices. To calculate highest and lowest prices, you can use the standard PostgreSQL aggregate functions MIN and MAX.

In this first SQL example, use the hyperfunctions to query the tick data, and turn it into 1-min candlestick values in the candlestick format:

  1. -- Create the candlestick format
  2. SELECT
  3. time_bucket('1 min', time) AS bucket,
  4. symbol,
  5. FIRST(price, time) AS "open",
  6. MAX(price) AS high,
  7. MIN(price) AS low,
  8. LAST(price, time) AS "close",
  9. LAST(day_volume, time) AS day_volume
  10. FROM crypto_ticks
  11. GROUP BY bucket, symbol

Hyperfunctions in this query:

  • time_bucket('1 min', time): creates 1-minute buckets
  • FIRST(price, time): selects the first price value in the bucket, ordered by time, which is the opening price of the candlestick.
  • LAST(price, time) selects the last price value in the bucket, ordered by time, which is the closing price of the candlestick

Besides the hyperfunctions, you can see other common SQL aggregate functions like MIN and MAX, which calculate the lowest and highest prices in the candlestick.

note

This tutorial uses the LAST() hyperfunction to calculate the volume within a bucket, because the sample tick data already provides an incremental day_volume field which contains the total volume for the given day with each trade. Depending on the raw data you receive and whether you want to calculate volume in terms of trade count or the total value of the trades, you might need to use COUNT(*), SUM(price), or subtraction between the last and first values in the bucket to get the correct result.

Create continuous aggregates for candlestick data

In TimescaleDB, the most efficient way to create candlestick views is to use continuous aggregates. Continuous aggregates are very similar to PostgreSQL materialized views but with three major advantages.

First, materialized views recreate all of the data any time the view is refreshed, which causes history to be lost. Continuous aggregates only refresh the buckets of aggregated data where the source, raw data has been changed or added.

Second, continuous aggregates can be automatically refreshed using built-in, user-configured policies. No special triggers or stored procedures are needed to refresh the data over time.

Finally, continuous aggregates are real-time by default. Any new raw tick data that is inserted between refreshes is automatically appended to the materialized data. This keeps your candlestick data up-to-date without having to write special SQL to UNION data from multiple views and tables.

Continuous aggregates are often used to power dashboards and other user-facing applications, like price charts, where query performance and timeliness of your data matter.

Let’s see how to create different candlestick time buckets - 1 minute, 1 hour, and 1 day - using continuous aggregates with different refresh policies.

1-minute candlestick

To create a continuous aggregate of 1-minute candlestick data, use the same query that you previously used to get the 1-minute OHLCV values. But this time, put the query in a continuous aggregate definition:

  1. /* 1-min candlestick view*/
  2. CREATE MATERIALIZED VIEW one_min_candle
  3. WITH (timescaledb.continuous) AS
  4. SELECT
  5. time_bucket('1 min', time) AS bucket,
  6. symbol,
  7. FIRST(price, time) AS "open",
  8. MAX(price) AS high,
  9. MIN(price) AS low,
  10. LAST(price, time) AS "close",
  11. LAST(day_volume, time) AS day_volume
  12. FROM crypto_ticks
  13. GROUP BY bucket, symbol

When you run this query, TimescaleDB queries 1-minute aggregate values of all your tick data, creating the continuous aggregate and materializing the results. But your candlestick data has only been materialized up to the last data point. If you want the continuous aggregate to stay up to date as new data comes in over time, you also need to add a continuous aggregate refresh policy. For example, to refresh the continuous aggregate every two minutes:

  1. /* Refresh the continuous aggregate every two minutes */
  2. SELECT add_continuous_aggregate_policy('one_min_candle',
  3. start_offset => INTERVAL '2 hour',
  4. end_offset => INTERVAL '10 sec',
  5. schedule_interval => INTERVAL '2 min');

The continuous aggregate refreshes every hour, so every hour new candlesticks are materialized, if there’s new raw tick data in the hypertable.

When this job runs, it only refreshes the time period between start_offset and end_offset, and ignores modifications outside of this window.

In most cases, set end_offset to be the same or bigger as the time bucket in the continuous aggregate definition. This makes sure that only full buckets get materialized during the refresh process.

1-hour candlestick

To create a 1-hour candlestick view, follow the same process as in the previous step, except this time set the time bucket value to be one hour in the continuous aggregate definition:

  1. /* 1-hour candlestick view */
  2. CREATE MATERIALIZED VIEW one_hour_candle
  3. WITH (timescaledb.continuous) AS
  4. SELECT
  5. time_bucket('1 hour', time) AS bucket,
  6. symbol,
  7. FIRST(price, time) AS "open",
  8. MAX(price) AS high,
  9. MIN(price) AS low,
  10. LAST(price, time) AS "close",
  11. LAST(day_volume, time) AS day_volume
  12. FROM crypto_ticks
  13. GROUP BY bucket, symbol

Add a refresh policy to refresh the continuous aggregate every hour:

  1. /* Refresh the continuous aggregate every hour */
  2. SELECT add_continuous_aggregate_policy('one_hour_candle',
  3. start_offset => INTERVAL '1 day',
  4. end_offset => INTERVAL '1 min',
  5. schedule_interval => INTERVAL '1 hour');

Notice how this example uses a different refresh policy with different parameter values to accommodate the 1-hour time bucket in the continuous aggregate definition. The continuous aggregate will refresh every hour, so every hour there will be new candlestick data materialized, if there’s new raw tick data in the hypertable.

1-day candlestick

Create the final view in this tutorial for 1-day candlesticks using the same process as above, using a 1-day time bucket size:

  1. /* 1-day candlestick */
  2. CREATE MATERIALIZED VIEW one_day_candle
  3. WITH (timescaledb.continuous) AS
  4. SELECT
  5. time_bucket('1 day', time) AS bucket,
  6. symbol,
  7. FIRST(price, time) AS "open",
  8. MAX(price) AS high,
  9. MIN(price) AS low,
  10. LAST(price, time) AS "close",
  11. LAST(day_volume, time) AS day_volume
  12. FROM crypto_ticks
  13. GROUP BY bucket, symbol

Add a refresh policy to refresh the continuous aggregate once a day:

  1. /* Refresh the continuous aggregate every day */
  2. SELECT add_continuous_aggregate_policy('one_day_candle',
  3. start_offset => INTERVAL '3 day',
  4. end_offset => INTERVAL '1 day',
  5. schedule_interval => INTERVAL '1 day');

The refresh job runs every day, and materializes two days’ worth of candlesticks.

Optional: add price change (delta) column in the candlestick view

As an optional step, you can add an additional column in the continuous aggregate to calculate the price difference between the opening and closing price within the bucket.

In general, you can calculate the price difference with the formula:

  1. (CLOSE PRICE - OPEN PRICE) / OPEN PRICE = delta

Calculate delta in SQL:

  1. SELECT time_bucket('1 day', time) AS bucket, symbol, (LAST(price, time)-FIRST(price, time))/FIRST(price, time) AS change_pct
  2. FROM crypto_ticks
  3. WHERE price != 0
  4. GROUP BY bucket, symbol

The full continuous aggregate definition for a 1-day candlestick with a price-change column:

  1. /* 1-day candlestick with price change column*/
  2. CREATE MATERIALIZED VIEW one_day_candle_delta
  3. WITH (timescaledb.continuous) AS
  4. SELECT
  5. time_bucket('1 day', time) AS bucket,
  6. symbol,
  7. FIRST(price, time) AS "open",
  8. MAX(price) AS high,
  9. MIN(price) AS low,
  10. LAST(price, time) AS "close",
  11. LAST(day_volume, time) AS day_volume,
  12. (LAST(price, time)-FIRST(price, time))/FIRST(price, time) AS change_pct
  13. FROM crypto_ticks
  14. WHERE price != 0
  15. GROUP BY bucket, symbol

Using multiple continuous aggregates

You cannot currently create a continuous aggregate on top of another continuous aggregate. However, this is not necessary in most cases. You can get a similar result and performance by creating multiple continuous aggregates for the same hypertable. Due to the efficient materialization mechanism of continuous aggregates, both refresh and query performance should work well.