Pipelines

A PRQL query is a sequence of lines (or transforms) that together form a pipeline. Each line transforms the data and passes its result to the next.

The simplest pipeline is just:

PRQL

  from employees

SQL

  SELECT
    *
  FROM
    employees

Adding transforms

As we add additional lines, each one transforms the result:

PRQL

  from employees
  derive gross_salary = (salary + payroll_tax)

SQL

  SELECT
    *,
    salary + payroll_tax AS gross_salary
  FROM
    employees

…and so on:
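
As a sketch of the next step, one more transform can be appended to the same pipeline. The sort line is an illustrative choice rather than something this page prescribes; it reuses the gross_salary column derived on the line above it:

```prql
# Each line receives the result of the line above it.
from employees
derive gross_salary = (salary + payroll_tax)
sort gross_salary   # illustrative extra step: order by the derived column
```

Assuming the compiler behaves as in the previous examples, this would be expected to produce SQL along these lines, with the sort folded into the same SELECT as an ORDER BY clause:

```sql
SELECT
  *,
  salary + payroll_tax AS gross_salary
FROM
  employees
ORDER BY
  gross_salary -- added by the sort transform
```

No CTE should be needed here, since a derive followed by a sort still fits in a single SELECT statement.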

Compiling to SQL

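PRQL compiles the query to SQL. The PRQL compiler tries to represent as many transforms as possible with a single SELECT statement. When a single SELECT is not enough, the compiler “overflows” and creates CTEs (common table expressions). In the query below, the join comes after sort and take, so the first ten rows have to be computed in a CTE before they can be joined: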

PRQL

  from e = employees
  derive gross_salary = (salary + payroll_tax)
  sort gross_salary
  take 10
  join d = department [==dept_no]
  select [e.name, gross_salary, d.name]

SQL

  WITH table_1 AS (
    SELECT
      name,
      salary + payroll_tax AS gross_salary,
      dept_no
    FROM
      employees AS e
    ORDER BY
      gross_salary
    LIMIT
      10
  )
  SELECT
    table_0.name,
    table_0.gross_salary,
    d.name
  FROM
    table_1 AS table_0
    JOIN department AS d ON table_0.dept_no = d.dept_no

See also