Joining multiple tables

Let’s take another look at joins by querying the list of users and getting thecount of how many tweet’s they’ve authored that were favorited. This willrequire us to join twice: from user to tweet, and from tweet to favorite. We’lladd the additional requirement that users should be included who have notcreated any tweets, as well as users whose tweets have not been favorited. Thequery, expressed in SQL, would be:

  1. SELECT user.username, COUNT(favorite.id)
  2. FROM user
  3. LEFT OUTER JOIN tweet ON tweet.user_id = user.id
  4. LEFT OUTER JOIN favorite ON favorite.tweet_id = tweet.id
  5. GROUP BY user.username

Note

In the above query both joins are LEFT OUTER, since a user may not have anytweets or, if they have tweets, none of them may have been favorited.

Peewee has a concept of a join context, meaning that whenever we call thejoin() method, we are implicitly joining on thepreviously-joined model (or if this is the first call, the model we areselecting from). Since we are joining straight through, from user to tweet,then from tweet to favorite, we can simply write:

  1. query = (User
  2. .select(User.username, fn.COUNT(Favorite.id).alias('count'))
  3. .join(Tweet, JOIN.LEFT_OUTER) # Joins user -> tweet.
  4. .join(Favorite, JOIN.LEFT_OUTER) # Joins tweet -> favorite.
  5. .group_by(User.username))

Iterating over the results:

  1. >>> for user in query:
  2. ... print(user.username, user.count)
  3. ...
  4. huey 3
  5. mickey 1
  6. zaizee 0

For a more complicated example involving multiple joins and switching joincontexts, let’s find all the tweets by Huey and the number of times they’vebeen favorited. To do this we’ll need to perform two joins and we’ll also usean aggregate function to calculate the favorite count.

Here is how we would write this query in SQL:

  1. SELECT tweet.content, COUNT(favorite.id)
  2. FROM tweet
  3. INNER JOIN user ON tweet.user_id = user.id
  4. LEFT OUTER JOIN favorite ON favorite.tweet_id = tweet.id
  5. WHERE user.username = 'huey'
  6. GROUP BY tweet.content;

Note

We use a LEFT OUTER join from tweet to favorite since a tweet may not haveany favorites, yet we still wish to display it’s content (along with acount of zero) in the result set.

With Peewee, the resulting Python code looks very similar to what we wouldwrite in SQL:

  1. query = (Tweet
  2. .select(Tweet.content, fn.COUNT(Favorite.id).alias('count'))
  3. .join(User) # Join from tweet -> user.
  4. .switch(Tweet) # Move "join context" back to tweet.
  5. .join(Favorite, JOIN.LEFT_OUTER) # Join from tweet -> favorite.
  6. .where(User.username == 'huey')
  7. .group_by(Tweet.content))

Note the call to switch() - that instructs Peewee to setthe join context back to Tweet. If we had omitted the explicit call toswitch, Peewee would have used User (the last model we joined) as the joincontext and constructed the join from User to Favorite using theFavorite.user foreign-key, which would have given us incorrect results.

If we wanted to omit the join-context switching we could instead use thejoin_from() method. The following query is equivalent tothe previous one:

  1. query = (Tweet
  2. .select(Tweet.content, fn.COUNT(Favorite.id).alias('count'))
  3. .join_from(Tweet, User) # Join tweet -> user.
  4. .join_from(Tweet, Favorite, JOIN.LEFT_OUTER) # Join tweet -> favorite.
  5. .where(User.username == 'huey')
  6. .group_by(Tweet.content))

We can iterate over the results of the above query to print the tweet’s contentand the favorite count:

  1. >>> for tweet in query:
  2. ... print('%s favorited %d times' % (tweet.content, tweet.count))
  3. ...
  4. meow favorited 1 times
  5. hiss favorited 0 times
  6. purr favorited 2 times