titlesidebar_labeldescription
JOIN keyword
JOIN
JOIN SQL keyword reference documentation.

QuestDB supports the following types of joins: INNER, OUTER, CROSS, ASOF and SPLICE. FULL joins are not yet implemented and are on our roadmap. All supported join types can be combined in a single SQL statement; QuestDB SQL’s optimiser determines the best execution order and algorithms.

There are no known limitations on size of tables or sub-queries participating in joins and there are no limitations on number of joins either.

Join syntax

Flow chart showing the syntax of the JOIN keyword

Following data join columns from joined tables are combined in single row. Same name columns originating from different tables will be automatically aliased to create unique column namespace of the result set.

Though it is a best practice to diligently specify join conditions, QuestDB will also analyse WHERE clause for implicit join condition and will derive transient join conditions where necessary.

:::tip

When tables are joined on column that has the same name in both tables you can use shorthand ON (column) clause

:::

Execution order

Join operations are performed in order of their appearance in a SQL query. The following query performs a join on a table with one million rows based on a column from a smaller table with one hundred rows:

  1. SELECT * FROM 1_million_rows
  2. INNER JOIN 1_hundred_rows
  3. ON 1_million_rows.customer_id = 1_hundred_rows.referral_id;

The performance of this query can be improved by rewriting the query as follows:

  1. SELECT * FROM 1_hundred_rows
  2. INNER JOIN 1_million_rows
  3. ON 1_million_rows.referral_id = 1_hundred_rows.customer_id;

Implicit joins

It is possible to join two tables using the following syntax:

  1. SELECT *
  2. FROM a, b
  3. WHERE a.id = b.id;

The type of join as well as the column will be inferred from the where clause, and may end up being either INNER or CROSS join. For the example above, the equivalent explicit statement would be:

  1. SELECT *
  2. FROM a
  3. JOIN b ON (id);

(INNER) JOIN

Overview

(INNER) JOIN is used to return rows from 2 tables where the records on the compared column have matching values in both tables

Examples

The following query will return the movieId and the average rating from table ratings. It will also add a column for the title from table movies. The corresponding title will be identified based on the movieId in the ratings table matching an id in the movies table.

  1. SELECT movieId a, title, avg(rating)
  2. FROM ratings
  3. INNER JOIN (SELECT movieId id, title FROM movies)
  4. ON ratings.movieId = id;

By default JOIN is interpreted as INNER JOIN. Therefore INNER does not need to be specified.

  1. SELECT movieId a, title, avg(rating)
  2. FROM ratings
  3. JOIN (SELECT movieId id, title FROM movies)
  4. ON ratings.movieId = id;

OUTER JOIN

Overview

OUTER JOIN will return all records from the LEFT table, and if matched, the records of the RIGHT table. When there is no match for the RIGHT table, it will return NULL values in right table fields

Examples

General syntax is as follows:

  1. SELECT tab1.colA, tab2.colB
  2. FROM table1 tab1
  3. OUTER JOIN table2 tab2
  4. ON tab1.colA = tab2.colB;

OUTER JOIN query can also be used to select all rows in left table that do not exist in right table.

  1. SELECT tab1.colA, tab2.colB
  2. FROM table1 tab1
  3. OUTER JOIN table2 tab2
  4. ON tab1.colA = tab2.colB
  5. WHERE tab2.colB = NULL;

CROSS JOIN

Overview

CROSS JOIN will return the cartesian product of the two tables being joined. It can be used to a table with all possible combinations.

:::note

CROSS JOIN does not have ON clause.

:::

Example

The following will return all possible combinations of starters and deserts

  1. SELECT *
  2. FROM starters
  3. CROSS JOIN deserts;

ASOF JOIN

Overview

ASOF joins are used on time series data to join two tables based on timestamp where timestamps do not exactly match. For a given record at a given timestamp, it will return the corresponding record in the other table at the closest timestamp prior to the timestamp in the first table.

:::note

To be able to leverage ASOF JOIN, both joined table must have a designated timestamp column. To designate a column as timestamp, please refer to the CREATE TABLE section.

:::

ASOF join is performed on tables or result sets that are ordered by time. When table is created as ordered by time order of records is enforced and timestamp column name is in table metadata. ASOF join will use timestamp column from metadata.

Example

Consider the following tables.

tsask
2019-10-17T00:00:00.000000Z100
2019-10-17T00:00:00.200000Z101
2019-10-17T00:00:00.400000Z102
tsbid
2019-10-17T00:00:00.100000Z101
2019-10-17T00:00:00.300000Z102
2019-10-17T00:00:00.500000Z103

Therefore the following query:

  1. SELECT bids.ts timebid, bid, ask
  2. FROM bids
  3. ASOF JOIN asks;

Will return the following:

timebidbidask
2019-10-17T00:00:00.100000Z101100
2019-10-17T00:00:00.300000Z102101
2019-10-17T00:00:00.500000Z103102

:::note

There is no ASKS at timestamp 2019-10-17T00:00:00.100000Z. The ASOF JOIN will look for the value in the BIDS table that has the closest timestamp inferior or equal to the target timestamp.

:::

In case tables do not have designated timestamp column, but data is in chronological order, timestamp columns can be specified at runtime:

  1. SELECT bids.ts timebid, bid, ask
  2. FROM (bids timestamp(ts))
  3. ASOF JOIN (asks timestamp (ts));

:::caution

ASOF join does not check timestamp order, if data is not in chronological order join result is non-deterministic

:::

Above query assumes that there is only one instrument in BIDS and ASKS tables and therefore does not use the optional ON clause.

If both tables store data for multiple instruments ON clause will allow you to find bids for asks with matching instrument value.

  1. SELECT *
  2. FROM asks
  3. ASOF JOIN bids ON (instrument);

SPLICE JOIN

Overview

SPLICE JOIN is a full ASOF JOIN. It will return all the records from both tables. For each record from left table splice join will find prevailing record from right table and for each record from right table - prevailing record from left table.

Examples

Considering the following tables.

tsask
2019-10-17T00:00:00.000000Z100
2019-10-17T00:00:00.200000Z101
2019-10-17T00:00:00.400000Z102
tsbid
2019-10-17T00:00:00.100000Z101
2019-10-17T00:00:00.300000Z102
2019-10-17T00:00:00.500000Z103

This query:

  1. SELECT bids.ts timebid, bid, ask
  2. FROM bids
  3. SPLICE JOIN asks;

Will return the following results

timebidbidask
nullnull100
2019-10-17T00:00:00.100000Z101100
2019-10-17T00:00:00.100000Z101101
2019-10-17T00:00:00.300000Z102101
2019-10-17T00:00:00.300000Z102102
2019-10-17T00:00:00.500000Z103102

Note that the above query does not use the optional ON clause. In case you need additional filtering on the two tables, you can use the ON clause as follows:

  1. SELECT ts timebid, instrument bidInstrument, bid, ask
  2. FROM bids
  3. SPLICE JOIN
  4. (
  5. SELECT ts timesask, instrument askInstrument, ask ask
  6. FROM asks
  7. )
  8. ON bidInstrument=askInstrument;