Introduction

PRQL is a modern language for transforming data: a simple, powerful, pipelined SQL replacement. Like SQL, it’s readable, explicit and declarative. Unlike SQL, it forms a logical pipeline of transformations, and supports abstractions such as variables and functions. Because it transpiles to SQL, it can be used with any database that uses SQL.

Let’s get started with an example:

PRQL

    from employees
    filter start_date > @2021-01-01  # Clear date syntax
    derive [  # `derive` adds columns / variables
      gross_salary = salary + (tax ?? 0),  # Terse coalesce
      gross_cost = gross_salary + benefits_cost,  # Variables can use other variables
    ]
    filter gross_cost > 0
    group [title, country] (  # `group` runs a pipeline over each group
      aggregate [  # `aggregate` reduces each group to a value
        average gross_salary,
        sum_gross_cost = sum gross_cost,  # `=` sets a column name
      ]
    )
    filter sum_gross_cost > 100_000  # `filter` replaces both of SQL's `WHERE` & `HAVING`
    derive id = f"{title}_{country}"  # F-strings like Python
    derive country_code = s"LEFT(country, 2)"  # S-strings allow using SQL as an escape hatch
    sort [sum_gross_cost, -country]  # `-country` means descending order
    take 1..20  # Range expressions (also valid here as `take 20`)

SQL

    WITH table_1 AS (
      SELECT
        title,
        country,
        salary + COALESCE(tax, 0) + benefits_cost AS _expr_0,
        salary + COALESCE(tax, 0) AS _expr_1
      FROM
        employees
      WHERE
        start_date > DATE '2021-01-01'
    )
    SELECT
      title,
      country,
      AVG(_expr_1),
      SUM(_expr_0) AS sum_gross_cost,
      CONCAT(title, '_', country) AS id,
      LEFT(country, 2) AS country_code
    FROM
      table_1 AS table_0
    WHERE
      _expr_0 > 0
    GROUP BY
      title,
      country
    HAVING
      SUM(_expr_0) > 100000
    ORDER BY
      sum_gross_cost,
      country DESC
    LIMIT
      20

As you can see, a PRQL query is a linear pipeline of transformations: each transformation operates on the result of the one before it, so the query reads top to bottom in the order the steps are applied.
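To make the pipeline idea concrete, here is a toy model in Python, not PRQL itself and not how the PRQL compiler works. Each step is a plain function from rows to rows, and the steps are applied in order. The sample rows, the step names, and the `run_pipeline` helper are all hypothetical, chosen to loosely mirror part of the query above.

```python
from functools import reduce

# Hypothetical sample rows; column names mirror the query above.
employees = [
    {"title": "Engineer", "country": "US", "salary": 100_000, "tax": None, "benefits_cost": 20_000},
    {"title": "Engineer", "country": "US", "salary": 90_000, "tax": 5_000, "benefits_cost": 10_000},
    {"title": "Analyst", "country": "DE", "salary": 50_000, "tax": 1_000, "benefits_cost": 0},
]


def derive_costs(rows):
    """Add gross_salary and gross_cost columns (roughly what `derive` does)."""
    out = []
    for row in rows:
        gross_salary = row["salary"] + (row["tax"] or 0)  # treat a missing tax as 0
        gross_cost = gross_salary + row["benefits_cost"]
        out.append({**row, "gross_salary": gross_salary, "gross_cost": gross_cost})
    return out


def keep_positive_cost(rows):
    """Keep rows whose gross_cost is positive (a filter before grouping)."""
    return [row for row in rows if row["gross_cost"] > 0]


def total_cost_by_group(rows):
    """Sum gross_cost per (title, country) group (roughly `group` + `aggregate`)."""
    totals = {}
    for row in rows:
        key = (row["title"], row["country"])
        totals[key] = totals.get(key, 0) + row["gross_cost"]
    return [
        {"title": title, "country": country, "sum_gross_cost": total}
        for (title, country), total in totals.items()
    ]


def keep_large_groups(rows):
    """Keep groups whose total exceeds 100,000 (a filter after grouping)."""
    return [row for row in rows if row["sum_gross_cost"] > 100_000]


def run_pipeline(rows, steps):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda current, step: step(current), steps, rows)


result = run_pipeline(
    employees,
    [derive_costs, keep_positive_cost, total_cost_by_group, keep_large_groups],
)
print(result)
# [{'title': 'Engineer', 'country': 'US', 'sum_gross_cost': 225000}]
```

Note that both filters are ordinary steps in the list. Whether a filter runs before or after the aggregation is decided only by its position, which is the property the PRQL query relies on.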

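In the generated SQL, operations do not appear in the order they are applied. The three filters from the PRQL query end up in three different places (the `WHERE` of the CTE, the `WHERE` of the outer query, and `HAVING`), and the derived columns are split between the CTE and the final `SELECT`. This makes it hard to compose larger queries.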
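The composition problem can be seen on a small scale with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table contents are made up, and the schema is simplified to a precomputed `gross_cost` column. Filtering on an aggregate requires either the special `HAVING` clause or wrapping the aggregation in a CTE so that a plain `WHERE` can follow it. Both forms return the same rows.

```python
import sqlite3

# In-memory database with hypothetical data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (title TEXT, country TEXT, gross_cost INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        ("Engineer", "US", 120_000),
        ("Engineer", "US", 105_000),
        ("Analyst", "DE", 51_000),
    ],
)

# Form 1: filter after aggregation with HAVING.
having_query = """
SELECT title, country, SUM(gross_cost) AS sum_gross_cost
FROM employees
GROUP BY title, country
HAVING SUM(gross_cost) > 100000
"""

# Form 2: wrap the aggregation in a CTE, then filter with a plain WHERE.
cte_query = """
WITH totals AS (
  SELECT title, country, SUM(gross_cost) AS sum_gross_cost
  FROM employees
  GROUP BY title, country
)
SELECT title, country, sum_gross_cost
FROM totals
WHERE sum_gross_cost > 100000
"""

having_rows = conn.execute(having_query).fetchall()
cte_rows = conn.execute(cte_query).fetchall()

print(having_rows)  # [('Engineer', 'US', 225000)]
print(cte_rows)     # [('Engineer', 'US', 225000)]
conn.close()
```

In either form, adding one more step after the aggregation means choosing a different clause or adding another level of nesting, rather than appending a line to the end of the query.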