Common Table Expressions

Peewee supports the inclusion of common table expressions (CTEs) in all types of queries. CTEs may be useful for:

  • Factoring out a common subquery.
  • Grouping or filtering by a column derived in the CTE’s result set.
  • Writing recursive queries.

To declare a Select query for use as a CTE, use cte() method, which wraps the query in a CTE object. To indicate that a CTE should be included as part of a query, use the Query.with_cte() method, passing a list of CTE objects.

Simple Example

For an example, let’s say we have some data points that consist of a key and a floating-point value. Let’s define our model and populate some test data:

  1. class Sample(Model):
  2. key = TextField()
  3. value = FloatField()
  4. data = (
  5. ('a', (1.25, 1.5, 1.75)),
  6. ('b', (2.1, 2.3, 2.5, 2.7, 2.9)),
  7. ('c', (3.5, 3.5)))
  8. # Populate data.
  9. for key, values in data:
  10. Sample.insert_many([(key, value) for value in values],
  11. fields=[Sample.key, Sample.value]).execute()

Let’s use a CTE to calculate, for each distinct key, which values were above-average for that key.

  1. # First we'll declare the query that will be used as a CTE. This query
  2. # simply determines the average value for each key.
  3. cte = (Sample
  4. .select(Sample.key, fn.AVG(Sample.value).alias('avg_value'))
  5. .group_by(Sample.key)
  6. .cte('key_avgs', columns=('key', 'avg_value')))
  7. # Now we'll query the sample table, using our CTE to find rows whose value
  8. # exceeds the average for the given key. We'll calculate how far above the
  9. # average the given sample's value is, as well.
  10. query = (Sample
  11. .select(Sample.key, Sample.value)
  12. .join(cte, on=(Sample.key == cte.c.key))
  13. .where(Sample.value > cte.c.avg_value)
  14. .order_by(Sample.value)
  15. .with_cte(cte))

We can iterate over the samples returned by the query to see which samples had above-average values for their given group:

  1. >>> for sample in query:
  2. ... print(sample.key, sample.value)
  3. # 'a', 1.75
  4. # 'b', 2.7
  5. # 'b', 2.9

Complex Example

For a more complete example, let’s consider the following query which uses multiple CTEs to find per-product sales totals in only the top sales regions. Our model looks like this:

  1. class Order(Model):
  2. region = TextField()
  3. amount = FloatField()
  4. product = TextField()
  5. quantity = IntegerField()

Here is how the query might be written in SQL. This example can be found in the postgresql documentation.

  1. WITH regional_sales AS (
  2. SELECT region, SUM(amount) AS total_sales
  3. FROM orders
  4. GROUP BY region
  5. ), top_regions AS (
  6. SELECT region
  7. FROM regional_sales
  8. WHERE total_sales > (SELECT SUM(total_sales) / 10 FROM regional_sales)
  9. )
  10. SELECT region,
  11. product,
  12. SUM(quantity) AS product_units,
  13. SUM(amount) AS product_sales
  14. FROM orders
  15. WHERE region IN (SELECT region FROM top_regions)
  16. GROUP BY region, product;

With Peewee, we would write:

  1. reg_sales = (Order
  2. .select(Order.region,
  3. fn.SUM(Order.amount).alias('total_sales'))
  4. .group_by(Order.region)
  5. .cte('regional_sales'))
  6. top_regions = (reg_sales
  7. .select(reg_sales.c.region)
  8. .where(reg_sales.c.total_sales > (
  9. reg_sales.select(fn.SUM(reg_sales.c.total_sales) / 10)))
  10. .cte('top_regions'))
  11. query = (Order
  12. .select(Order.region,
  13. Order.product,
  14. fn.SUM(Order.quantity).alias('product_units'),
  15. fn.SUM(Order.amount).alias('product_sales'))
  16. .where(Order.region.in_(top_regions.select(top_regions.c.region)))
  17. .group_by(Order.region, Order.product)
  18. .with_cte(regional_sales, top_regions))

Recursive CTEs

Peewee supports recursive CTEs. Recursive CTEs can be useful when, for example, you have a tree data-structure represented by a parent-link foreign key. Suppose, for example, that we have a hierarchy of categories for an online bookstore. We wish to generate a table showing all categories and their absolute depths, along with the path from the root to the category.

We’ll assume the following model definition, in which each category has a foreign-key to its immediate parent category:

  1. class Category(Model):
  2. name = TextField()
  3. parent = ForeignKeyField('self', backref='children', null=True)

To list all categories along with their depth and parents, we can use a recursive CTE:

  1. # Define the base case of our recursive CTE. This will be categories that
  2. # have a null parent foreign-key.
  3. Base = Category.alias()
  4. level = Value(1).alias('level')
  5. path = Base.name.alias('path')
  6. base_case = (Base
  7. .select(Base.id, Base.name, Base.parent, level, path)
  8. .where(Base.parent.is_null())
  9. .cte('base', recursive=True))
  10. # Define the recursive terms.
  11. RTerm = Category.alias()
  12. rlevel = (base_case.c.level + 1).alias('level')
  13. rpath = base_case.c.path.concat('->').concat(RTerm.name).alias('path')
  14. recursive = (RTerm
  15. .select(RTerm.id, RTerm.name, RTerm.parent, rlevel, rpath)
  16. .join(base_case, on=(RTerm.parent == base_case.c.id)))
  17. # The recursive CTE is created by taking the base case and UNION ALL with
  18. # the recursive term.
  19. cte = base_case.union_all(recursive)
  20. # We will now query from the CTE to get the categories, their levels, and
  21. # their paths.
  22. query = (cte
  23. .select_from(cte.c.name, cte.c.level, cte.c.path)
  24. .order_by(cte.c.path))
  25. # We can now iterate over a list of all categories and print their names,
  26. # absolute levels, and path from root -> category.
  27. for category in query:
  28. print(category.name, category.level, category.path)
  29. # Example output:
  30. # root, 1, root
  31. # p1, 2, root->p1
  32. # c1-1, 3, root->p1->c1-1
  33. # c1-2, 3, root->p1->c1-2
  34. # p2, 2, root->p2
  35. # c2-1, 3, root->p2->c2-1