4. Syntax

Github 来源:PRQL 浏览 110 扫码分享 2023-04-11 11:29:37

Syntax
- Summary
- Pipes
  - PRQL
  - SQL
  - PRQL
  - SQL
Lists
- PRQL
- SQL
- PRQL
- SQL
- PRQL
- SQL
Expressions
- PRQL
- SQL
Precedence and Parentheses
- PRQL
- SQL
- PRQL
- Error
Inner Transforms
- PRQL
- SQL
Comments
- PRQL
- SQL
Quoted identifiers
- PRQL
- SQL
- PRQL
- SQL
- PRQL
- SQL
- PRQL
- SQL
Quoting schemas
- PRQL
- SQL
Parameters
- PRQL
- SQL
Numbers
- PRQL
- SQL
Keywords
- PRQL
- SQL

Syntax

Summary

A summary of PRQL syntax

Syntax	Usage	Example
`\|`	Pipelines	`from employees \| select first_name`
`=`	Assigns & Aliases	`from e = employees` `derive total = (sum salary)`
`:`	Named args & Parameters	`interp low:0 1600 sat_score`
`[]`	Lists	`select [id, amount]`
`()`	Precedence & Parentheses	`derive celsius = (fahrenheit - 32) / 1.8`
`‘’` & `“”`	Strings	`derive name = ‘Mary’`
`</code></td><td><a href="$8a5bacd7e62abd36.md#quoted-identifiers">Quoted identifiers</a></td><td><code>selectfirst name`
`#`	Comments	`# A comment`
`@`	Dates & Times	`@2021-01-01`
`==`	Expressions	`filter a == b and c != d and e > f`
`==`	Self-equality in join	`join s=salaries [==id]`
`->`	Function definitions	`func add a b -> a + b`
`+`/`-`	Sort order	`sort [-amount, +date]`
`??`	Coalesce	`amount ?? 0`

Pipes

Pipes — the connection between transforms that make up a pipeline — can be either line breaks or a pipe character (|).

In almost all situations, line-breaks pipe the result of a line’s transform into the transform on the following line. For example, the filter transform operates on the result of from employees (which is just the employees table), and the select transform operates on the result of the filter transform.

PRQL

from employees
filter department == "Product"
select [first_name, last_name]

SQL

SELECT
  first_name,
  last_name
FROM
  employees
WHERE
  department = 'Product'

In the place of a line-break, it’s also possible to use the | character to pipe results, such that this is equivalent:

PRQL

from employees | filter department == "Product" | select [first_name, last_name]

SQL

SELECT
  first_name,
  last_name
FROM
  employees
WHERE
  department = 'Product'

A line-break doesn’t create a pipeline in a couple of cases:

within a list (e.g. the derive examples below),
when the following line is a new statement, which starts with a keyword of func, table or from.

Lists

Lists are represented with [], and can span multiple lines. A final trailing comma is optional.

PRQL

from numbers
derive [x = 1, y = 2]
derive [
  a = x,
  b = y
]
derive [
  c = a,
  d = b,
]

SQL

SELECT
  *,
  1 AS x,
  2 AS y,
  1 AS a,
  2 AS b,
  1 AS c,
  2 AS d
FROM
  numbers

Most transforms can take either a list or a single item, so these are equivalent:

PRQL

from employees
select [first_name]

SQL

SELECT
  first_name
FROM
  employees

PRQL

from employees
select first_name

SQL

SELECT
  first_name
FROM
  employees

Expressions

PRQL is made up of expressions, like 2 + 3 or ((1 + x) * y). In the example below, note the use of expressions to calculate the alias circumference and in the filter transform.

PRQL

from foo
select [
  circumference = diameter * 3.14159,
  color,
]
filter circumference > 10 and color != "red"

SQL

WITH table_1 AS (
  SELECT
    diameter * 3.14159 AS circumference,
    color
  FROM
    foo
)
SELECT
  circumference,
  color
FROM
  table_1 AS table_0
WHERE
  circumference > 10
  AND color <> 'red'

Precedence and Parentheses

Parentheses — () — are used to give precedence to inner expressions.

Note

We realize some of the finer points here are not intuitive. We are considering approaches to make this more intuitive — even at the cost of requiring more syntax in some circumstances. And we’re planning to make the error messages much better, so the compiler is there to help out.

Parentheses are required around:

Any nested function call containing a pipe, either the | symbol or a new line. “Nested” means within a transform; i.e. not just the main pipeline.
Any function call that isn’t a single item in a list or a pipeline, like sum distance in round 0 (sum distance)1.
A minus sign in a function argument, like in add (-1) (-3)
Inner transforms for group, window, and other transforms.

Parentheses are not required around expressions which use operators but no function call, like foo + bar.

Here’s a full rundown of times this applier:

PRQL

from employees
# Requires parentheses, because it's contains a pipe
derive is_proximate = (distance | in 0..20)
# Requires parentheses, because it's a function call
derive total_distance = (sum distance)
# `??` doesn't require parentheses, as it's not a function call
derive min_capped_distance = (min distance ?? 5)
# No parentheses needed, because no function call
derive travel_time = distance / 40
# No inner parentheses needed around `1+1` because no function call
derive distance_rounded_2_dp = (round 1+1 distance)
derive [
  # Requires parentheses, because it contains a pipe
  is_far = (distance | in 100..),
  # The left value of the range requires parentheses,
  # because of the minus sign
  is_negative = (distance | in (-100..0)),
  # ...this is equivalent
  is_negative = (distance | in (-100)..0),
  # Doesn't require parentheses, because it's in a list (confusing, see footnote)!
  average_distance = average distance,
]
# Requires parentheses because of the minus sign
sort (-distance)
# A list is fine too
sort [-distance]

SQL

SELECT
  *,
  distance BETWEEN 0 AND 20 AS is_proximate,
  SUM(distance) OVER () AS total_distance,
  MIN(COALESCE(distance, 5)) OVER () AS min_capped_distance,
  distance / 40 AS travel_time,
  ROUND(distance, 2) AS distance_rounded_2_dp,
  distance >= 100 AS is_far,
  distance BETWEEN -100 AND 0,
  distance BETWEEN -100 AND 0 AS is_negative,
  AVG(distance) OVER () AS average_distance
FROM
  employees
ORDER BY
  distance DESC

This doesn’t work, for example (though it should provide a much better error message):

PRQL

from employees
derive total_distance = sum distance

Error

Error:
   ╭─[:2:29]
   │
 2 │ derive total_distance = sum distance
   │                             ────┬───
   │                                 ╰───── Unknown name distance
───╯

For a more formal definition, refer to this precedence table. Because function calls have the lowest precedence, nested function calls or arguments that start or end with an operator require parenthesis.

Group	Operators	Precedence	Associativity
identifier dot	`.`	1
unary	`- + ! ==`	2
range	`..`	3
mul	`* / %`	4	left-to-right
add	`+ -`	5	left-to-right
compare	`== != <= >= < >`	6	left-to-right
coalesce	`??`	7	left-to-right
and	`and`	8	left-to-right
or	`or`	9	left-to-right
function call		10

Inner Transforms

Parentheses are also used for transforms (such as group and window) that pass their result to an “inner transform”. The example below applies the aggregate pipeline to each group of unique title and country values:

PRQL

from employees
group [title, country] (
  aggregate [
    average salary,
    ct = count
  ]
)

SQL

SELECT
  title,
  country,
  AVG(salary),
  COUNT(*) AS ct
FROM
  employees
GROUP BY
  title,
  country

Comments

Comments are represented by #.

PRQL

from employees  # Comment 1
# Comment 2
aggregate [average salary]

SQL

SELECT
  AVG(salary)
FROM
  employees

There’s no distinct multiline comment syntax.

Quoted identifiers

To use identifiers that are otherwise invalid, surround them with backticks. Depending on the dialect, these will remain as backticks or be converted to double-quotes.

PRQL

prql target:sql.mysql
from employees
select `first name`

SQL

SELECT
  `first name`
FROM
  employees

PRQL

prql target:sql.postgres
from employees
select `first name`

SQL

SELECT
  "first name"
FROM
  employees

PRQL

from `dir/*.parquet`

SQL

SELECT
  *
FROM
  "dir/*.parquet"

BigQuery also uses backticks to surround project & dataset names (even if valid identifiers) in the SELECT statement:

PRQL

prql target:sql.bigquery
from `project-foo.dataset.table`
join `project-bar.dataset.table` [==col_bax]

SQL

SELECT
  `project-foo.dataset.table`.*,
  `project-bar.dataset.table`.*
FROM
  `project-foo.dataset.table`
  JOIN `project-bar.dataset.table` ON `project-foo.dataset.table`.col_bax = `project-bar.dataset.table`.col_bax

Quoting schemas

Note

This is currently not great and we are working on improving it; see https://github.com/PRQL/prql/issues/1535 for progress.

If supplying a schema without a column — for example in a from or join transform, that also needs to be a quoted identifier:

PRQL

from `music.albums`

SQL

SELECT
  *
FROM
  music.albums

Parameters

PRQL will retain parameters like $1 in SQL output, which can then be supplied to the SQL query as a prepared query:

PRQL

from employees
filter id == $1

SQL

SELECT
  *
FROM
  employees
WHERE
  id = $1

Numbers

Numbers can contain underscores between numbers; which can make reading large numbers easier:

PRQL

from numbers
select [
    small = 1.000_000_1,
    big = 5_000_000,
]

SQL

SELECT
  1.0000001 AS small,
  5000000 AS big
FROM
  numbers

Keywords

At the moment, PRQL uses only four keywords:

prql
let
func
case

To use these names as columns or relations, use backticks: `case` .

It may seem that transforms are also keywords, but they are normal function within std namespace:

PRQL

std.from my_table
std.select [from = my_table.a, take = my_table.b]
std.take 3

SQL

SELECT
  a AS "from",
  b AS take
FROM
  my_table
LIMIT
  3

1: or, technically, it’s on the right side of an assignment in a list…

当前内容版权归 PRQL 或其关联方所有，如需对内容或内容相关联开源项目进行关注与资助，请访问 PRQL .

本文档使用 BookStack 构建

展开/收起文章目录