8.31. SELECT

Synopsis

    [ WITH with_query [, ...] ]
    SELECT [ ALL | DISTINCT ] select_expr [, ...]
    [ FROM from_item [, ...] ]
    [ WHERE condition ]
    [ GROUP BY [ ALL | DISTINCT ] grouping_element [, ...] ]
    [ HAVING condition ]
    [ { UNION | INTERSECT | EXCEPT } [ ALL | DISTINCT ] select ]
    [ ORDER BY expression [ ASC | DESC ] [, ...] ]
    [ LIMIT [ count | ALL ] ]

where from_item is one of

    table_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
    from_item join_type from_item [ ON join_condition | USING ( join_column [, ...] ) ]

and join_type is one of

    [ INNER ] JOIN
    LEFT [ OUTER ] JOIN
    RIGHT [ OUTER ] JOIN
    FULL [ OUTER ] JOIN
    CROSS JOIN

and grouping_element is one of

    ()
    expression
    GROUPING SETS ( ( column [, ...] ) [, ...] )
    CUBE ( column [, ...] )
    ROLLUP ( column [, ...] )

Description

Retrieve rows from zero or more tables.

WITH Clause

The WITH clause defines named relations for use within a query. It allows flattening nested queries or simplifying subqueries. For example, the following queries are equivalent:

    SELECT a, b
    FROM (
      SELECT a, MAX(b) AS b FROM t GROUP BY a
    ) AS x;

    WITH x AS (SELECT a, MAX(b) AS b FROM t GROUP BY a)
    SELECT a, b FROM x;

This also works with multiple subqueries:

    WITH
      t1 AS (SELECT a, MAX(b) AS b FROM x GROUP BY a),
      t2 AS (SELECT a, AVG(d) AS d FROM y GROUP BY a)
    SELECT t1.*, t2.*
    FROM t1
    JOIN t2 ON t1.a = t2.a;

Additionally, the relations within a WITH clause can chain:

    WITH
      x AS (SELECT a FROM t),
      y AS (SELECT a AS b FROM x),
      z AS (SELECT b AS c FROM y)
    SELECT c FROM z;

Warning

Currently, the SQL for the WITH clause will be inlined anywhere the named relation is used. This means that if the relation is used more than once and the query is non-deterministic, the results may be different each time.
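The effect of inlining can be illustrated outside the engine. The following Python sketch is only a model, not Presto code: the hypothetical function named_relation stands in for a non-deterministic WITH query, and calling it twice stands in for two inlined references.

```python
import random

def named_relation(rng):
    # Stands in for a non-deterministic WITH query, e.g. one that calls random().
    return [rng.random() for _ in range(3)]

rng = random.Random(0)

# Inlined: every reference re-runs the query, so two references can disagree.
first_use = named_relation(rng)
second_use = named_relation(rng)
inlined_match = first_use == second_use

# Evaluated once and shared (which inlining does not do): references agree.
materialized = named_relation(rng)
ref_a, ref_b = materialized, materialized
shared_match = ref_a == ref_b
```

In this model the two inlined uses see different rows, while the shared copy is identical for both references.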

GROUP BY Clause

The GROUP BY clause divides the output of a SELECT statement into groups of rows containing matching values. A simple GROUP BY clause may contain any expression composed of input columns or it may be an ordinal number selecting an output column by position (starting at one).

The following queries are equivalent. They both group the output by the nationkey input column with the first query using the ordinal position of the output column and the second query using the input column name:

    SELECT count(*), nationkey FROM customer GROUP BY 2;

    SELECT count(*), nationkey FROM customer GROUP BY nationkey;

GROUP BY clauses can group output by input column names not appearing in the output of a select statement. For example, the following query generates row counts for the customer table using the input column mktsegment:

    SELECT count(*) FROM customer GROUP BY mktsegment;

     _col0
    -------
     29968
     30142
     30189
     29949
     29752
    (5 rows)

When a GROUP BY clause is used in a SELECT statement, all output expressions must be either aggregate functions or columns present in the GROUP BY clause.

Complex Grouping Operations

Presto also supports complex aggregations using the GROUPING SETS, CUBE and ROLLUP syntax. This syntax allows users to perform analysis that requires aggregation on multiple sets of columns in a single query. Complex grouping operations do not support grouping on expressions composed of input columns. Only column names or ordinals are allowed.

Complex grouping operations are often equivalent to a UNION ALL of simple GROUP BY expressions, as shown in the following examples. This equivalence does not apply, however, when the source of data for the aggregation is non-deterministic.

GROUPING SETS

Grouping sets allow users to specify multiple lists of columns to group on. The columns not part of a given sublist of grouping columns are set to NULL.

    SELECT * FROM shipping;

     origin_state | origin_zip | destination_state | destination_zip | package_weight
    --------------+------------+-------------------+-----------------+----------------
     California   |      94131 | New Jersey        |            8648 |             13
     California   |      94131 | New Jersey        |            8540 |             42
     New Jersey   |       7081 | Connecticut       |            6708 |            225
     California   |      90210 | Connecticut       |            6927 |           1337
     California   |      94131 | Colorado          |           80302 |              5
     New York     |      10002 | New Jersey        |            8540 |              3
    (6 rows)

GROUPING SETS semantics are demonstrated by this example query:

    SELECT origin_state, origin_zip, destination_state, sum(package_weight)
    FROM shipping
    GROUP BY GROUPING SETS (
        (origin_state),
        (origin_state, origin_zip),
        (destination_state));

     origin_state | origin_zip | destination_state | _col3
    --------------+------------+-------------------+-------
     New Jersey   | NULL       | NULL              |   225
     California   | NULL       | NULL              |  1397
     New York     | NULL       | NULL              |     3
     California   |      90210 | NULL              |  1337
     California   |      94131 | NULL              |    60
     New Jersey   |       7081 | NULL              |   225
     New York     |      10002 | NULL              |     3
     NULL         | NULL       | Colorado          |     5
     NULL         | NULL       | New Jersey        |    58
     NULL         | NULL       | Connecticut       |  1562
    (10 rows)

The preceding query may be considered logically equivalent to a UNION ALL of multiple GROUP BY queries:

    SELECT origin_state, NULL, NULL, sum(package_weight)
    FROM shipping GROUP BY origin_state

    UNION ALL

    SELECT origin_state, origin_zip, NULL, sum(package_weight)
    FROM shipping GROUP BY origin_state, origin_zip

    UNION ALL

    SELECT NULL, NULL, destination_state, sum(package_weight)
    FROM shipping GROUP BY destination_state;

However, the query with the complex grouping syntax (GROUPING SETS, CUBE or ROLLUP) will only read from the underlying data source once, while the query with the UNION ALL reads the underlying data three times. This is why queries with a UNION ALL may produce inconsistent results when the data source is not deterministic.
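The single-pass behaviour described above can be modelled in a few lines of Python. This is a sketch of the semantics only, not of how Presto executes the query; the function name grouping_sets and the in-memory SHIPPING list are illustrative assumptions that mirror the shipping table shown earlier.

```python
from collections import defaultdict

# (origin_state, origin_zip, destination_state, package_weight)
SHIPPING = [
    ("California", 94131, "New Jersey", 13),
    ("California", 94131, "New Jersey", 42),
    ("New Jersey", 7081, "Connecticut", 225),
    ("California", 90210, "Connecticut", 1337),
    ("California", 94131, "Colorado", 5),
    ("New York", 10002, "New Jersey", 3),
]
COLUMNS = ("origin_state", "origin_zip", "destination_state")

def grouping_sets(rows, sets):
    """Compute sum(package_weight) for every grouping set in one pass over rows."""
    totals = defaultdict(int)
    for row in rows:  # the source is read exactly once
        for gs in sets:
            # Columns that are not part of the grouping set become None (NULL).
            key = tuple(row[i] if name in gs else None
                        for i, name in enumerate(COLUMNS))
            totals[(gs, key)] += row[3]
    return [key + (total,) for (_, key), total in totals.items()]

result = grouping_sets(
    SHIPPING,
    [("origin_state",), ("origin_state", "origin_zip"), ("destination_state",)],
)
```

The ten tuples in result correspond to the ten rows of the GROUPING SETS example, in no particular order.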

CUBE

The CUBE operator generates all possible grouping sets (i.e. a power set) for a given set of columns. For example, the query:

    SELECT origin_state, destination_state, sum(package_weight)
    FROM shipping
    GROUP BY CUBE (origin_state, destination_state);

is equivalent to:

    SELECT origin_state, destination_state, sum(package_weight)
    FROM shipping
    GROUP BY GROUPING SETS (
        (origin_state, destination_state),
        (origin_state),
        (destination_state),
        ());

     origin_state | destination_state | _col2
    --------------+-------------------+-------
     California   | New Jersey        |    55
     California   | Colorado          |     5
     New York     | New Jersey        |     3
     New Jersey   | Connecticut       |   225
     California   | Connecticut       |  1337
     California   | NULL              |  1397
     New York     | NULL              |     3
     New Jersey   | NULL              |   225
     NULL         | New Jersey        |    58
     NULL         | Connecticut       |  1562
     NULL         | Colorado          |     5
     NULL         | NULL              |  1625
    (12 rows)
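The power set that CUBE expands to can be enumerated mechanically. The Python sketch below is illustrative only; the function name cube is an assumption, and the order in which subsets are listed is chosen for readability rather than taken from the engine.

```python
from itertools import combinations

def cube(columns):
    """Return every subset of columns, largest first, ending with the empty set."""
    return [
        subset
        for size in range(len(columns), -1, -1)
        for subset in combinations(columns, size)
    ]

sets = cube(("origin_state", "destination_state"))
```

For n columns the list has 2^n entries, which is why CUBE over many columns quickly becomes expensive.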

ROLLUP

The ROLLUP operator generates all possible subtotals for a given set of columns. For example, the query:

    SELECT origin_state, origin_zip, sum(package_weight)
    FROM shipping
    GROUP BY ROLLUP (origin_state, origin_zip);

     origin_state | origin_zip | _col2
    --------------+------------+-------
     California   |      94131 |    60
     California   |      90210 |  1337
     New Jersey   |       7081 |   225
     New York     |      10002 |     3
     California   | NULL       |  1397
     New York     | NULL       |     3
     New Jersey   | NULL       |   225
     NULL         | NULL       |  1625
    (8 rows)

is equivalent to:

    SELECT origin_state, origin_zip, sum(package_weight)
    FROM shipping
    GROUP BY GROUPING SETS ((origin_state, origin_zip), (origin_state), ());
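Unlike CUBE, ROLLUP only produces the prefixes of its column list. A minimal Python sketch of that expansion follows; the function name rollup is an illustrative assumption.

```python
def rollup(columns):
    """Return the prefixes of columns, longest first, ending with the empty set."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]

sets = rollup(("origin_state", "origin_zip"))
```

For n columns this yields n + 1 grouping sets, matching the three sets in the equivalent GROUPING SETS query above.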

Combining multiple grouping expressions

Multiple grouping expressions in the same query are interpreted as having cross-product semantics. For example, the following query:

    SELECT origin_state, destination_state, origin_zip, sum(package_weight)
    FROM shipping
    GROUP BY
        GROUPING SETS ((origin_state, destination_state)),
        ROLLUP (origin_zip);

which can be rewritten as:

    SELECT origin_state, destination_state, origin_zip, sum(package_weight)
    FROM shipping
    GROUP BY
        GROUPING SETS ((origin_state, destination_state)),
        GROUPING SETS ((origin_zip), ());

is logically equivalent to:

    SELECT origin_state, destination_state, origin_zip, sum(package_weight)
    FROM shipping
    GROUP BY GROUPING SETS (
        (origin_state, destination_state, origin_zip),
        (origin_state, destination_state));

     origin_state | destination_state | origin_zip | _col3
    --------------+-------------------+------------+-------
     New York     | New Jersey        |      10002 |     3
     California   | New Jersey        |      94131 |    55
     New Jersey   | Connecticut       |       7081 |   225
     California   | Connecticut       |      90210 |  1337
     California   | Colorado          |      94131 |     5
     New York     | New Jersey        | NULL       |     3
     New Jersey   | Connecticut       | NULL       |   225
     California   | Colorado          | NULL       |     5
     California   | Connecticut       | NULL       |  1337
     California   | New Jersey        | NULL       |    55
    (10 rows)
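The cross-product rule can be written down directly. In this Python sketch (a model of the semantics, with the hypothetical helper name combine), each grouping expression is a list of grouping sets, and every combination across the lists is concatenated into one grouping set.

```python
from itertools import product

def combine(*grouping_expressions):
    """Cross product of several lists of grouping sets, concatenating each pick."""
    return [
        tuple(column for grouping_set in combo for column in grouping_set)
        for combo in product(*grouping_expressions)
    ]

sets = combine(
    [("origin_state", "destination_state")],  # GROUPING SETS ((origin_state, destination_state))
    [("origin_zip",), ()],                    # ROLLUP (origin_zip)
)
```

One set on the left times two on the right gives the two grouping sets of the final rewritten query.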

The ALL and DISTINCT quantifiers determine whether duplicate grouping sets each produce distinct output rows. This is particularly useful when multiple complex grouping sets are combined in the same query. For example, the following query:

    SELECT origin_state, destination_state, origin_zip, sum(package_weight)
    FROM shipping
    GROUP BY ALL
        CUBE (origin_state, destination_state),
        ROLLUP (origin_state, origin_zip);

is equivalent to:

    SELECT origin_state, destination_state, origin_zip, sum(package_weight)
    FROM shipping
    GROUP BY GROUPING SETS (
        (origin_state, destination_state, origin_zip),
        (origin_state, origin_zip),
        (origin_state, destination_state, origin_zip),
        (origin_state, origin_zip),
        (origin_state, destination_state),
        (origin_state),
        (origin_state, destination_state),
        (origin_state),
        (origin_state, destination_state),
        (origin_state),
        (destination_state),
        ());

However, if the query uses the DISTINCT quantifier for the GROUP BY:

    SELECT origin_state, destination_state, origin_zip, sum(package_weight)
    FROM shipping
    GROUP BY DISTINCT
        CUBE (origin_state, destination_state),
        ROLLUP (origin_state, origin_zip);

only unique grouping sets are generated:

    SELECT origin_state, destination_state, origin_zip, sum(package_weight)
    FROM shipping
    GROUP BY GROUPING SETS (
        (origin_state, destination_state, origin_zip),
        (origin_state, origin_zip),
        (origin_state, destination_state),
        (origin_state),
        (destination_state),
        ());

The default set quantifier is ALL.
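The counts in the two expansions above (twelve grouping sets with ALL, six with DISTINCT) can be checked with a short Python model. It is a sketch under the assumption that two grouping sets are duplicates when they contain the same columns regardless of order, which is why frozenset is used.

```python
from itertools import product

CUBE = [("origin_state", "destination_state"), ("origin_state",),
        ("destination_state",), ()]
ROLLUP = [("origin_state", "origin_zip"), ("origin_state",), ()]

# GROUP BY ALL: every combination is kept, duplicates included.
all_sets = [frozenset(a) | frozenset(b) for a, b in product(CUBE, ROLLUP)]

# GROUP BY DISTINCT: duplicate grouping sets collapse into one.
distinct_sets = set(all_sets)
```

Here all_sets has 4 x 3 = 12 entries, three of which are the single-column set (origin_state), while distinct_sets keeps 6.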

GROUPING Operation

grouping(col1, …, colN) -> bigint

The grouping operation returns a bit set converted to decimal, indicating which columns are present in a grouping. It must be used in conjunction with GROUPING SETS, ROLLUP, CUBE or GROUP BY and its arguments must match exactly the columns referenced in the corresponding GROUPING SETS, ROLLUP, CUBE or GROUP BY clause.

To compute the resulting bit set for a particular row, bits are assigned to the argument columns with the rightmost column being the least significant bit. For a given grouping, a bit is set to 0 if the corresponding column is included in the grouping and to 1 otherwise. For example, consider the query below:

    SELECT origin_state, origin_zip, destination_state, sum(package_weight),
           grouping(origin_state, origin_zip, destination_state)
    FROM shipping
    GROUP BY GROUPING SETS (
        (origin_state),
        (origin_state, origin_zip),
        (destination_state));

     origin_state | origin_zip | destination_state | _col3 | _col4
    --------------+------------+-------------------+-------+-------
     California   | NULL       | NULL              |  1397 |     3
     New Jersey   | NULL       | NULL              |   225 |     3
     New York     | NULL       | NULL              |     3 |     3
     California   |      94131 | NULL              |    60 |     1
     New Jersey   |       7081 | NULL              |   225 |     1
     California   |      90210 | NULL              |  1337 |     1
     New York     |      10002 | NULL              |     3 |     1
     NULL         | NULL       | New Jersey        |    58 |     6
     NULL         | NULL       | Connecticut       |  1562 |     6
     NULL         | NULL       | Colorado          |     5 |     6
    (10 rows)

The first grouping in the above result only includes the origin_state column and excludes the origin_zip and destination_state columns. The bit set constructed for that grouping is 011 where the most significant bit represents origin_state.
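The bit arithmetic can be reproduced with a small Python function. This is a sketch of the rule as stated above, not the engine's implementation; the Python function grouping and the ARGS tuple are illustrative names.

```python
def grouping(arguments, grouping_set):
    """Return the bit set as an integer.

    A bit is 0 when the column is part of the grouping set and 1 otherwise;
    the rightmost argument is the least significant bit.
    """
    value = 0
    for column in arguments:
        value = (value << 1) | (0 if column in grouping_set else 1)
    return value

ARGS = ("origin_state", "origin_zip", "destination_state")

by_state = grouping(ARGS, {"origin_state"})                    # binary 011
by_state_zip = grouping(ARGS, {"origin_state", "origin_zip"})  # binary 001
by_destination = grouping(ARGS, {"destination_state"})         # binary 110
```

The three values are 3, 1 and 6, matching the last column of the result table.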

HAVING Clause

The HAVING clause is used in conjunction with aggregate functions and the GROUP BY clause to control which groups are selected. A HAVING clause eliminates groups that do not satisfy the given conditions. HAVING filters groups after groups and aggregates are computed.

The following example queries the customer table and selects groups with an account balance greater than the specified value:

    SELECT count(*), mktsegment, nationkey,
           CAST(sum(acctbal) AS bigint) AS totalbal
    FROM customer
    GROUP BY mktsegment, nationkey
    HAVING sum(acctbal) > 5700000
    ORDER BY totalbal DESC;

     _col0 | mktsegment | nationkey | totalbal
    -------+------------+-----------+----------
      1272 | AUTOMOBILE |        19 |  5856939
      1253 | FURNITURE  |        14 |  5794887
      1248 | FURNITURE  |         9 |  5784628
      1243 | FURNITURE  |        12 |  5757371
      1231 | HOUSEHOLD  |         3 |  5753216
      1251 | MACHINERY  |         2 |  5719140
      1247 | FURNITURE  |         8 |  5701952
    (7 rows)

UNION | INTERSECT | EXCEPT Clause

UNION, INTERSECT and EXCEPT are all set operations. These clauses are used to combine the results of more than one select statement into a single result set:

    query UNION [ALL | DISTINCT] query
    query INTERSECT [DISTINCT] query
    query EXCEPT [DISTINCT] query

The argument ALL or DISTINCT controls which rows are included in the final result set. If the argument ALL is specified, all rows are included even if the rows are identical. If the argument DISTINCT is specified, only unique rows are included in the combined result set. If neither is specified, the behavior defaults to DISTINCT. The ALL argument is not supported for INTERSECT or EXCEPT.

Multiple set operations are processed left to right, unless the order is explicitly specified via parentheses. Additionally, INTERSECT binds more tightly than EXCEPT and UNION. That means A UNION B INTERSECT C EXCEPT D is the same as A UNION (B INTERSECT C) EXCEPT D.
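The precedence rule matters because the two readings can give different results. The Python sketch below uses built-in sets, which model the DISTINCT variants only; the values in A, B, C and D are made up for illustration. Python's own operator precedence differs from SQL's, so every grouping is written out with explicit parentheses.

```python
A, B, C, D = {1, 2}, {2, 3}, {3, 4}, {1}

# Documented parse: INTERSECT first, then UNION and EXCEPT left to right.
documented = (A | (B & C)) - D

# A strict left-to-right reading, which is NOT how the expression is parsed.
strict_left_to_right = ((A | B) & C) - D
```

With these inputs the documented parse yields {2, 3}, while the strict left-to-right reading yields {3}.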

UNION

UNION combines all the rows that are in the result set from the first query with those that are in the result set for the second query. The following is an example of one of the simplest possible UNION clauses. It selects the value 13 and combines this result set with a second query that selects the value 42:

    SELECT 13
    UNION
    SELECT 42;

     _col0
    -------
        13
        42
    (2 rows)

The following queries demonstrate the difference between UNION and UNION ALL. Each selects the value 13 and combines this result set with a second query that selects the values 42 and 13:

    SELECT 13
    UNION
    SELECT * FROM (VALUES 42, 13);

     _col0
    -------
        13
        42
    (2 rows)

    SELECT 13
    UNION ALL
    SELECT * FROM (VALUES 42, 13);

     _col0
    -------
        13
        42
        13
    (3 rows)

INTERSECT

INTERSECT returns only the rows that are in the result sets of both the first and the second queries. The following is an example of one of the simplest possible INTERSECT clauses. It selects the values 13 and 42 and combines this result set with a second query that selects the value 13. Since 42 is only in the result set of the first query, it is not included in the final results:

    SELECT * FROM (VALUES 13, 42)
    INTERSECT
    SELECT 13;

     _col0
    -------
        13
    (1 row)

EXCEPT

EXCEPT returns the rows that are in the result set of the first query, but not the second. The following is an example of one of the simplest possible EXCEPT clauses. It selects the values 13 and 42 and combines this result set with a second query that selects the value 13. Since 13 is also in the result set of the second query, it is not included in the final result:

    SELECT * FROM (VALUES 13, 42)
    EXCEPT
    SELECT 13;

     _col0
    -------
        42
    (1 row)

ORDER BY Clause

The ORDER BY clause is used to sort a result set by one or more output expressions:

    ORDER BY expression [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...]

Each expression may be composed of output columns or it may be an ordinal number selecting an output column by position (starting at one). The ORDER BY clause is evaluated as the last step of a query after any GROUP BY or HAVING clause. The default null ordering is NULLS LAST, regardless of the ordering direction.
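The default null ordering can be pictured with a small Python model, where None plays the role of NULL. This is only a sketch of the stated default (NULLS LAST in both directions); the function name order_by is illustrative.

```python
def order_by(values, descending=False):
    """Sort values, placing None (NULL) last in both directions."""
    non_null = sorted((v for v in values if v is not None), reverse=descending)
    nulls = [v for v in values if v is None]
    return non_null + nulls

ascending_result = order_by([3, None, 1, 2])
descending_result = order_by([3, None, 1, 2], descending=True)
```

Reversing the direction reverses only the non-null values; the NULL stays at the end unless NULLS FIRST is requested explicitly.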

LIMIT Clause

The LIMIT clause restricts the number of rows in the result set. LIMIT ALL is the same as omitting the LIMIT clause. The following example queries a large table, but the limit clause restricts the output to only have five rows (because the query lacks an ORDER BY, exactly which rows are returned is arbitrary):

    SELECT orderdate FROM orders LIMIT 5;

     orderdate
    ------------
     1996-04-14
     1992-01-15
     1995-02-01
     1995-11-12
     1992-04-26
    (5 rows)

TABLESAMPLE

There are multiple sample methods:

  • BERNOULLI
    Each row is selected to be in the table sample with a probability of the sample percentage. When a table is sampled using the Bernoulli method, all physical blocks of the table are scanned and certain rows are skipped (based on a comparison between the sample percentage and a random value calculated at runtime).

    The probability of a row being included in the result is independent from any other row. This does not reduce the time required to read the sampled table from disk. It may have an impact on the total query time if the sampled output is processed further.

  • SYSTEM
    This sampling method divides the table into logical segments of data and samples the table at this granularity. This sampling method either selects all the rows from a particular segment of data or skips it (based on a comparison between the sample percentage and a random value calculated at runtime).

    The rows selected in a system sampling will be dependent on which connector is used. For example, when used with Hive, it is dependent on how the data is laid out on HDFS. This method does not guarantee independent sampling probabilities.

Note

Neither of the two methods allows deterministic bounds on the number of rows returned.
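The difference in granularity between the two methods can be sketched in Python. This is a toy model, not the engine's sampler: the table is assumed to be a list of equally sized segments, and the function names bernoulli_sample and system_sample are hypothetical.

```python
import random

def bernoulli_sample(segments, percentage, rng):
    """Visit every row and keep each one independently with the given probability."""
    return [row for segment in segments for row in segment
            if rng.random() * 100 < percentage]

def system_sample(segments, percentage, rng):
    """Keep or skip whole segments, so rows in a segment share the same fate."""
    return [row for segment in segments
            if rng.random() * 100 < percentage
            for row in segment]

# Hypothetical table: 10 segments of 100 rows each.
segments = [list(range(i * 100, (i + 1) * 100)) for i in range(10)]
rng = random.Random(42)
bernoulli_rows = bernoulli_sample(segments, 50, rng)
system_rows = system_sample(segments, 50, rng)
```

In this model the system sample always contains a multiple of 100 rows, while the Bernoulli sample can have any size; in neither case is the exact row count known in advance.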

Examples:

    SELECT *
    FROM users TABLESAMPLE BERNOULLI (50);

    SELECT *
    FROM users TABLESAMPLE SYSTEM (75);

Using sampling with joins:

    SELECT o.*, i.*
    FROM orders o TABLESAMPLE SYSTEM (10)
    JOIN lineitem i TABLESAMPLE BERNOULLI (40)
      ON o.orderkey = i.orderkey;

UNNEST

UNNEST can be used to expand an ARRAY or MAP into a relation. Arrays are expanded into a single column, and maps are expanded into two columns (key, value). UNNEST can also be used with multiple arguments, in which case they are expanded into multiple columns, with as many rows as the highest cardinality argument (the other columns are padded with nulls). UNNEST can optionally have a WITH ORDINALITY clause, in which case an additional ordinality column is added to the end. UNNEST is normally used with a JOIN and can reference columns from relations on the left side of the join.

Using a single array column:

    SELECT student, score
    FROM tests
    CROSS JOIN UNNEST(scores) AS t (score);

Using multiple array columns:

    SELECT numbers, animals, n, a
    FROM (
      VALUES
        (ARRAY[2, 5], ARRAY['dog', 'cat', 'bird']),
        (ARRAY[7, 8, 9], ARRAY['cow', 'pig'])
    ) AS x (numbers, animals)
    CROSS JOIN UNNEST(numbers, animals) AS t (n, a);

      numbers  |     animals      |  n   |  a
    -----------+------------------+------+------
     [2, 5]    | [dog, cat, bird] |    2 | dog
     [2, 5]    | [dog, cat, bird] |    5 | cat
     [2, 5]    | [dog, cat, bird] | NULL | bird
     [7, 8, 9] | [cow, pig]       |    7 | cow
     [7, 8, 9] | [cow, pig]       |    8 | pig
     [7, 8, 9] | [cow, pig]       |    9 | NULL
    (6 rows)

WITH ORDINALITY clause:

    SELECT numbers, n, a
    FROM (
      VALUES
        (ARRAY[2, 5]),
        (ARRAY[7, 8, 9])
    ) AS x (numbers)
    CROSS JOIN UNNEST(numbers) WITH ORDINALITY AS t (n, a);

      numbers  | n | a
    -----------+---+---
     [2, 5]    | 2 | 1
     [2, 5]    | 5 | 2
     [7, 8, 9] | 7 | 1
     [7, 8, 9] | 8 | 2
     [7, 8, 9] | 9 | 3
    (5 rows)

Using a single map column:

    SELECT
        animals, a, n
    FROM (
      VALUES
        (MAP(ARRAY['dog', 'cat', 'bird'], ARRAY[1, 2, 0])),
        (MAP(ARRAY['dog', 'cat'], ARRAY[4, 5]))
    ) AS x (animals)
    CROSS JOIN UNNEST(animals) AS t (a, n);

              animals           |  a   | n
    ----------------------------+------+---
     {"cat":2,"bird":0,"dog":1} | dog  | 1
     {"cat":2,"bird":0,"dog":1} | cat  | 2
     {"cat":2,"bird":0,"dog":1} | bird | 0
     {"cat":5,"dog":4}          | dog  | 4
     {"cat":5,"dog":4}          | cat  | 5
    (5 rows)
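The expansion rules for UNNEST (padding with nulls, the trailing ordinality column, and the two-column map form) can be modelled in Python. The sketch below is illustrative only and handles a single input row; unnest and unnest_map are hypothetical helper names, and None stands for NULL.

```python
from itertools import zip_longest

def unnest(*arrays, with_ordinality=False):
    """Expand arrays side by side; shorter arrays are padded with None (NULL)."""
    rows = list(zip_longest(*arrays))
    if with_ordinality:
        # The ordinality column is appended last and counts from 1.
        rows = [row + (i,) for i, row in enumerate(rows, start=1)]
    return rows

def unnest_map(mapping):
    """Expand a map into (key, value) rows."""
    return list(mapping.items())

padded = unnest([2, 5], ["dog", "cat", "bird"])
numbered = unnest([7, 8, 9], with_ordinality=True)
pairs = unnest_map({"dog": 4, "cat": 5})
```

The value of padded reproduces the first three rows of the multiple-array example, and numbered reproduces the n and a columns for the array [7, 8, 9] in the WITH ORDINALITY example.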

Joins

Joins allow you to combine data from multiple relations.

CROSS JOIN

A cross join returns the Cartesian product (all combinations) of two relations. Cross joins can either be specified using the explicit CROSS JOIN syntax or by specifying multiple relations in the FROM clause.

Both of the following queries are equivalent:

    SELECT *
    FROM nation
    CROSS JOIN region;

    SELECT *
    FROM nation, region;

The nation table contains 25 rows and the region table contains 5 rows, so a cross join between the two tables produces 125 rows:

    SELECT n.name AS nation, r.name AS region
    FROM nation AS n
    CROSS JOIN region AS r
    ORDER BY 1, 2;

         nation     |   region
    ----------------+-------------
     ALGERIA        | AFRICA
     ALGERIA        | AMERICA
     ALGERIA        | ASIA
     ALGERIA        | EUROPE
     ALGERIA        | MIDDLE EAST
     ARGENTINA      | AFRICA
     ARGENTINA      | AMERICA
    ...
    (125 rows)

Qualifying Column Names

When two relations in a join have columns with the same name, the column references must be qualified using the relation alias (if the relation has an alias), or with the relation name:

    SELECT nation.name, region.name
    FROM nation
    CROSS JOIN region;

    SELECT n.name, r.name
    FROM nation AS n
    CROSS JOIN region AS r;

    SELECT n.name, r.name
    FROM nation n
    CROSS JOIN region r;

The following query will fail with the error Column 'name' is ambiguous:

    SELECT name
    FROM nation
    CROSS JOIN region;

USING

The USING clause allows you to write shorter queries when both tables you are joining have the same name for the join key.

For example:

    SELECT *
    FROM table_1
    JOIN table_2
    ON table_1.key_A = table_2.key_A AND table_1.key_B = table_2.key_B

can be rewritten to:

    SELECT *
    FROM table_1
    JOIN table_2
    USING (key_A, key_B)

The output of doing JOIN with USING will be one copy of the join key columns (key_A and key_B in the example above) followed by the remaining columns in table_1 and then the remaining columns in table_2. Note that the join keys are not included in the list of columns from the origin tables for the purpose of referencing them in the query. You cannot access them with a table prefix, and if you run SELECT table_1.*, table_2.*, the join columns are not included in the output.

The following two queries are equivalent:

    SELECT *
    FROM (
      VALUES
        (1, 3, 10),
        (2, 4, 20)
    ) AS table_1 (key_A, key_B, y1)
    LEFT JOIN (
      VALUES
        (1, 3, 100),
        (2, 4, 200)
    ) AS table_2 (key_A, key_B, y2)
    USING (key_A, key_B);

    -----------------------------

    SELECT key_A, key_B, table_1.*, table_2.*
    FROM (
      VALUES
        (1, 3, 10),
        (2, 4, 20)
    ) AS table_1 (key_A, key_B, y1)
    LEFT JOIN (
      VALUES
        (1, 3, 100),
        (2, 4, 200)
    ) AS table_2 (key_A, key_B, y2)
    USING (key_A, key_B);

And produce the output:

     key_A | key_B | y1 | y2
    -------+-------+----+-----
         1 |     3 | 10 | 100
         2 |     4 | 20 | 200
    (2 rows)
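The column order described for JOIN ... USING can be expressed as a short Python function. It is a sketch of the stated rule only (join keys once, then the remaining columns of each side); the function name using_output_columns is hypothetical.

```python
def using_output_columns(left_columns, right_columns, keys):
    """Column order of SELECT * for a join written with USING (keys)."""
    left_rest = [c for c in left_columns if c not in keys]
    right_rest = [c for c in right_columns if c not in keys]
    return list(keys) + left_rest + right_rest

columns = using_output_columns(
    ["key_A", "key_B", "y1"],
    ["key_A", "key_B", "y2"],
    ["key_A", "key_B"],
)
```

The result lists key_A, key_B, y1 and y2, which is the header of the output shown above.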

Subqueries

A subquery is an expression which is composed of a query. The subquery is correlated when it refers to columns outside of the subquery. Logically, the subquery will be evaluated for each row in the surrounding query. The referenced columns will thus be constant during any single evaluation of the subquery.

Note

Support for correlated subqueries is limited. Not every standard form is supported.

EXISTS

The EXISTS predicate determines if a subquery returns any rows:

    SELECT name
    FROM nation
    WHERE EXISTS (SELECT * FROM region WHERE region.regionkey = nation.regionkey)

IN

The IN predicate determines if any values produced by the subquery are equal to the provided expression. The result of IN follows the standard rules for nulls. The subquery must produce exactly one column:

    SELECT name
    FROM nation
    WHERE regionkey IN (SELECT regionkey FROM region)
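The standard rules for nulls make IN a three-valued predicate. The Python sketch below models the standard SQL behaviour, with None standing for NULL (unknown); it assumes Presto follows the standard in every case shown, and the function name sql_in is illustrative.

```python
def sql_in(value, candidates):
    """Return True, False, or None (NULL) following three-valued logic."""
    if not candidates:
        return False   # an empty subquery result never matches
    if value is None:
        return None    # NULL compared with anything is unknown
    if value in [c for c in candidates if c is not None]:
        return True    # a definite match wins, even if NULLs are present
    # No match: the answer is unknown if any candidate was NULL.
    return None if None in candidates else False
```

A WHERE clause keeps a row only when the predicate is true, so both False and None filter the row out; the distinction becomes visible under NOT IN, where an unknown result stays unknown.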

Scalar Subquery

A scalar subquery is a non-correlated subquery that returns zero or one row. It is an error for the subquery to produce more than one row. The returned value is NULL if the subquery produces no rows:

    SELECT name
    FROM nation
    WHERE regionkey = (SELECT max(regionkey) FROM region)

Note

Currently, only a single column can be returned from a scalar subquery.
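The three possible outcomes of a scalar subquery (one row, no rows, too many rows) can be summarised in a few lines of Python. This is a model of the rule stated above, not engine code; the function name scalar_subquery and the error message are illustrative.

```python
def scalar_subquery(rows):
    """Collapse a one-column subquery result to a single value.

    rows is the list of values the subquery produced.
    """
    if len(rows) > 1:
        raise ValueError("scalar subquery produced more than one row")
    return rows[0] if rows else None  # no rows yields None (NULL)

single_value = scalar_subquery([4])
empty_value = scalar_subquery([])
```

Passing a list with two or more values raises the error, mirroring the failure of a query whose scalar subquery returns multiple rows.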