Entity inheritance

Entity inheritance in Pony is similar to inheritance for regular Python classes. Let’s consider an example of a data diagram where entities Student and Professor inherit from the entity Person:

  class Person(db.Entity):
      name = Required(str)

  class Student(Person):
      gpa = Optional(Decimal)
      mentor = Optional("Professor")

  class Professor(Person):
      degree = Required(str)
      students = Set("Student")

All attributes and relationships of the base entity Person are inherited by all descendants.

In some mappers (e.g. Django) a query on a base entity doesn’t return the right class: for derived entities the query returns just a base part of each instance. With Pony you always get the correct entity instances:

  for p in Person.select():
      if isinstance(p, Professor):
          print(p.name, p.degree)
      elif isinstance(p, Student):
          print(p.name, p.gpa)
      else:  # somebody else
          print(p.name)

Note

Since version 0.7.7 you can use isinstance() inside a query:

  staff = select(p for p in Person if not isinstance(p, Student))

In order to create the correct entity instance Pony uses a discriminator column. By default this is a string column and Pony uses it to store the entity class name:

  classtype = Discriminator(str)

By default Pony implicitly creates the classtype attribute for each entity class which takes part in inheritance. You can use your own discriminator column name and type. If you change the type of the discriminator column, then you have to specify the _discriminator_ value for each entity.

Let’s consider the example above and use cls_id as the name for our discriminator column of int type:

  class Person(db.Entity):
      cls_id = Discriminator(int)
      _discriminator_ = 1
      ...

  class Student(Person):
      _discriminator_ = 2
      ...

  class Professor(Person):
      _discriminator_ = 3
      ...
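
To make the role of the discriminator concrete, here is a minimal sketch in plain Python of how a mapper could turn a stored discriminator value back into the right class. It is an illustration only and not Pony's internal implementation: the REGISTRY dictionary, the load_row helper, and the sample rows are all hypothetical names invented for this example.

```python
# Hypothetical illustration only: plain classes, not Pony entities.
class Person:
    _discriminator_ = 1

    def __init__(self, name):
        self.name = name


class Student(Person):
    _discriminator_ = 2


class Professor(Person):
    _discriminator_ = 3


# Map each stored discriminator value to the class it identifies.
REGISTRY = {cls._discriminator_: cls for cls in (Person, Student, Professor)}


def load_row(row):
    """Build an instance of the class selected by the row's cls_id value."""
    cls = REGISTRY[row["cls_id"]]
    return cls(row["name"])


# Rows as they might come back from a table with a cls_id column.
rows = [
    {"cls_id": 1, "name": "Ann"},
    {"cls_id": 2, "name": "Bob"},
    {"cls_id": 3, "name": "Eve"},
]
people = [load_row(row) for row in rows]
loaded_types = [type(p).__name__ for p in people]
print(loaded_types)
```

The sketch also suggests why each entity needs its own _discriminator_ value: if two classes shared a value, a lookup like this could not tell their rows apart.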

Multiple inheritance

Pony also supports multiple inheritance. If you use multiple inheritance then all the parent classes of the newly defined class should inherit from the same base class (a “diamond-like” hierarchy).

Let’s consider an example where a student can have a role of a teaching assistant. For this purpose we’ll introduce the entity Teacher and derive Professor and TeachingAssistant from it. The entity TeachingAssistant inherits from both the Student class and the Teacher class:

  class Person(db.Entity):
      name = Required(str)

  class Student(Person):
      ...

  class Teacher(Person):
      ...

  class Professor(Teacher):
      ...

  class TeachingAssistant(Student, Teacher):
      ...

The TeachingAssistant objects are instances of both Teacher and Student entities and inherit all their attributes. Multiple inheritance is possible here because both Teacher and Student have the same base class Person.
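
The shape of this hierarchy can be checked with ordinary Python classes. The sketch below deliberately leaves out db.Entity and any database, so it only demonstrates the class relationships that regular Python inheritance gives you, under the assumption that entity classes follow the same rules.

```python
# Plain Python classes stand in for the entities; no database is involved.
class Person:
    pass


class Student(Person):
    pass


class Teacher(Person):
    pass


class Professor(Teacher):
    pass


class TeachingAssistant(Student, Teacher):
    pass


ta = TeachingAssistant()

# The method resolution order visits each parent once and ends at the
# shared base class Person (followed by object).
mro_names = [cls.__name__ for cls in TeachingAssistant.__mro__]
print(mro_names)

# A teaching assistant is both a Student and a Teacher, but not a Professor.
print(isinstance(ta, Student), isinstance(ta, Teacher), isinstance(ta, Professor))
```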

Inheritance is a very powerful tool, but it should be used wisely. The data diagram is often much simpler when inheritance is used sparingly.

Representing inheritance in the database

There are three ways to implement inheritance in the database:

  1. Single Table Inheritance: all entities in the hierarchy are mapped to a single database table.

  2. Class Table Inheritance: each entity in the hierarchy is mapped to a separate table, but each table stores only the attributes which the entity doesn’t inherit from its parents.

  3. Concrete Table Inheritance: each entity in the hierarchy is mapped to a separate table and each table stores the attributes of the entity and all its ancestors.

The main problem with the third approach is that no single table holds the primary key, which is why this implementation is rarely used.

The second implementation is used often; this is how inheritance is implemented in Django. The disadvantage of this approach is that the mapper has to join several tables together in order to retrieve data, which can degrade performance.
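
The join cost of Class Table Inheritance can be sketched with the standard sqlite3 module. The table and column names below (person, student, person_id) are hypothetical and chosen only for illustration; they are not the schema that Django or any other mapper actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table per entity; the child table stores only its own attributes
# and points back at the parent row.
conn.executescript(
    """
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE student (
        person_id INTEGER PRIMARY KEY REFERENCES person(id),
        gpa TEXT
    );
    INSERT INTO person (id, name) VALUES (1, 'Bob');
    INSERT INTO student (person_id, gpa) VALUES (1, '3.7');
    """
)

# Reading a complete Student requires joining the parent and child tables.
student_rows = conn.execute(
    "SELECT p.name, s.gpa "
    "FROM person AS p JOIN student AS s ON s.person_id = p.id"
).fetchall()
print(student_rows)
```

Each extra level in the hierarchy would add another table to this join.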

Pony uses the first approach where all entities in the hierarchy are mapped to a single database table. This is the most efficient implementation because there is no need to join tables. This approach has its disadvantages too:

  • Each table row has columns which are not used because they belong to other entities in the hierarchy. This is not a big problem, because the unused columns hold NULL values, which don’t take up much space.

  • The table can have a large number of columns if there are a lot of entities in the hierarchy. Different databases have different limits on the maximum number of columns per table, but that limit is usually pretty high.
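
The single-table layout and its trade-offs can be sketched with the standard sqlite3 module. This is an approximation for illustration, not the exact schema Pony generates: the table name, the column types, and the omission of the mentor relationship are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One table for the whole hierarchy, with a discriminator column.
conn.execute(
    """
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        classtype TEXT NOT NULL,  -- discriminator: the entity class name
        name TEXT NOT NULL,       -- defined on Person
        gpa TEXT,                 -- used by Student rows only
        degree TEXT               -- used by Professor rows only
    )
    """
)
conn.executemany(
    "INSERT INTO person (classtype, name, gpa, degree) VALUES (?, ?, ?, ?)",
    [
        ("Person", "Ann", None, None),
        ("Student", "Bob", "3.7", None),
        ("Professor", "Eve", None, "PhD"),
    ],
)

# Every entity is read from the same table, so no join is needed.
all_rows = conn.execute(
    "SELECT classtype, name, gpa, degree FROM person ORDER BY id"
).fetchall()
for row in all_rows:
    print(row)  # columns belonging to other entities come back as None

# Restricting a query to one entity is a filter on the discriminator.
student_names = [
    name
    for (name,) in conn.execute(
        "SELECT name FROM person WHERE classtype = 'Student'"
    )
]
print(student_names)
```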