Subqueries

Peewee allows you to join on any table-like object, including subqueries or common table expressions (CTEs). To demonstrate joining on a subquery, let’s query for all users and their latest tweet.

Here is the SQL:

  1. SELECT tweet.*, user.*
  2. FROM tweet
  3. INNER JOIN (
  4. SELECT latest.user_id, MAX(latest.timestamp) AS max_ts
  5. FROM tweet AS latest
  6. GROUP BY latest.user_id) AS latest_query
  7. ON ((tweet.user_id = latest_query.user_id) AND (tweet.timestamp = latest_query.max_ts))
  8. INNER JOIN user ON (tweet.user_id = user.id)

We’ll do this by creating a subquery which selects each user and the timestamp of their latest tweet. Then we can query the tweets table in the outer query and join on the user and timestamp combination from the subquery.

  1. # Define our subquery first. We'll use an alias of the Tweet model, since
  2. # we will be querying from the Tweet model directly in the outer query.
  3. Latest = Tweet.alias()
  4. latest_query = (Latest
  5. .select(Latest.user, fn.MAX(Latest.timestamp).alias('max_ts'))
  6. .group_by(Latest.user)
  7. .alias('latest_query'))
  8. # Our join predicate will ensure that we match tweets based on their
  9. # timestamp *and* user_id.
  10. predicate = ((Tweet.user == latest_query.c.user_id) &
  11. (Tweet.timestamp == latest_query.c.max_ts))
  12. # We put it all together, querying from tweet and joining on the subquery
  13. # using the above predicate.
  14. query = (Tweet
  15. .select(Tweet, User) # Select all columns from tweet and user.
  16. .join(latest_query, on=predicate) # Join tweet -> subquery.
  17. .join_from(Tweet, User)) # Join from tweet -> user.

Iterating over the query, we can see each user and their latest tweet.

  1. >>> for tweet in query:
  2. ... print(tweet.user.username, '->', tweet.content)
  3. ...
  4. huey -> purr
  5. mickey -> whine

There are a couple things you may not have seen before in the code we used to create the query in this section:

  • We used join_from() to explicitly specify the join context. We wrote .join_from(Tweet, User), which is equivalent to .switch(Tweet).join(User).
  • We referenced columns in the subquery using the magic .c attribute, for example latest_query.c.max_ts. The .c attribute is used to dynamically create column references.
  • Instead of passing individual fields to Tweet.select(), we passed the Tweet and User models. This is shorthand for selecting all fields on the given model.

Common-table Expressions

In the previous section we joined on a subquery, but we could just as easily have used a common-table expression (CTE). We will repeat the same query as before, listing users and their latest tweets, but this time we will do it using a CTE.

Here is the SQL:

  1. WITH latest AS (
  2. SELECT user_id, MAX(timestamp) AS max_ts
  3. FROM tweet
  4. GROUP BY user_id)
  5. SELECT tweet.*, user.*
  6. FROM tweet
  7. INNER JOIN latest
  8. ON ((latest.user_id = tweet.user_id) AND (latest.max_ts = tweet.timestamp))
  9. INNER JOIN user
  10. ON (tweet.user_id = user.id)

This example looks very similar to the previous example with the subquery:

  1. # Define our CTE first. We'll use an alias of the Tweet model, since
  2. # we will be querying from the Tweet model directly in the main query.
  3. Latest = Tweet.alias()
  4. cte = (Latest
  5. .select(Latest.user, fn.MAX(Latest.timestamp).alias('max_ts'))
  6. .group_by(Latest.user)
  7. .cte('latest'))
  8. # Our join predicate will ensure that we match tweets based on their
  9. # timestamp *and* user_id.
  10. predicate = ((Tweet.user == cte.c.user_id) &
  11. (Tweet.timestamp == cte.c.max_ts))
  12. # We put it all together, querying from tweet and joining on the CTE
  13. # using the above predicate.
  14. query = (Tweet
  15. .select(Tweet, User) # Select all columns from tweet and user.
  16. .join(cte, on=predicate) # Join tweet -> CTE.
  17. .join_from(Tweet, User) # Join from tweet -> user.
  18. .with_cte(cte))

We can iterate over the result-set, which consists of the latest tweets for each user:

  1. >>> for tweet in query:
  2. ... print(tweet.user.username, '->', tweet.content)
  3. ...
  4. huey -> purr
  5. mickey -> whine

Note

For more information about using CTEs, including information on writing recursive CTEs, see the Common Table Expressions section of the “Querying” document.