Entity definition

An entity is a Python class which stores an object’s state in the database. Each instance of an entity corresponds to a row in the database table. Often entities represent objects from the real world (e.g. Customer, Product).

Entity attributes

Entity attributes are specified as class attributes inside the entity class using the syntax:

  1. class EntityName(inherits_from)
  2. attr_name = attr_kind(attr_type, attr_options)

For example:

  1. class Person(db.Entity):
  2. id = PrimaryKey(int, auto=True)
  3. name = Required(str)
  4. age = Optional(int)

Attribute kinds

Each entity attribute can be one of the following kinds:

  • Required - must have a value at all times

  • Optional - the value is optional

  • PrimaryKey - defines a primary key attribute

  • Set - represents a collection, used for ‘to-many’ relationships

  • Discriminator - used for entity inheritance

Optional string attributes

For most data types None is used when no value is assigned to the attribute. But when a string attribute is not assigned a value, Pony uses an empty string instead of None. This is more practical than storing empty string as NULL in the database. Most frameworks behave this way. Also, empty strings can be indexed for faster search, unlike NULLs. If you will try to assign None to such an optional string attribute, you’ll get the ConstraintError exception.

You can change this behavior using the nullable=True option. In this case it will be possible to store both empty strings and NULL values in the same column, but this is rarely needed.

Oracle database treats empty strings as NULL values. Because of this all Optional attributes in Oracle have nullable set to True automatically.

If an optional string attribute is used as a unique key or as a part of a unique composite key, it will always have nullable set to True automatically.

Composite primary and secondary keys

Pony fully supports composite keys. In order to declare a composite primary key you need to specify all the parts of the key as Required and then combine them into a composite primary key:

  1. class Example(db.Entity):
  2. a = Required(int)
  3. b = Required(str)
  4. PrimaryKey(a, b)

In order to declare a secondary composite key you need to declare attributes as usual and then combine them using the composite_key directive:

  1. class Example(db.Entity):
  2. a = Required(str)
  3. b = Optional(int)
  4. composite_key(a, b)

In the database composite_key(a, b) will be represented as the UNIQUE ("a", "b") constraint.

Composite indexes

Using the composite_index() directive you can create a composite index for speeding up data retrieval. It can combine two or more attributes:

  1. class Example(db.Entity):
  2. a = Required(str)
  3. b = Optional(int)
  4. composite_index(a, b)

The composite index can include a discriminator attribute used for inheritance.

Using the composite_index() you can create a non-unique index. In order to define an unique index, use the composite_key() function described above.

Attribute data types

Pony supports the following attribute types:

  • str

  • unicode

  • int

  • float

  • Decimal

  • datetime

  • date

  • time

  • timedelta

  • bool

  • buffer - used for binary data in Python 2 and 3

  • bytes - used for binary data in Python 3

  • LongStr - used for large strings

  • LongUnicode - used for large strings

  • UUID

  • Json - used for mapping to native database JSON type

  • IntArray

  • StrArray

  • FloatArray

Note

IntArray, StrArray and FloatArray types are supported only in PostgreSQL and SQLite.

Also you can specify another entity as the attribute type for defining a relationship between two entities.

Strings in Python 2 and 3

As you know, Python 3 has some differences from Python 2 when it comes to strings. Python 2 provides two string types – str (byte string) and unicode (unicode string), whereas in Python 3 the str type represents unicode strings and the unicode was just removed.

Before the release 0.6, Pony stored str and unicode attributes as unicode in the database, but for str attributes it had to convert unicode to byte string on reading from the database. Starting with the Pony Release 0.6 the attributes of str type in Python 2 behave as if they were declared as unicode attributes. There is no difference now if you specify str or unicode as the attribute type – you will have unicode string in Python and in the database.

Starting with the Pony Release 0.6, where the support for Python 3 was added, instead of unicode and LongUnicode we recommend to use str and LongStr types respectively. LongStr and LongUnicode are stored as CLOB in the database.

The same thing is with the LongUnicode and LongStr. LongStr now is an alias to LongUnicode. This type uses unicode in Python and in the database.

  1. attr1 = Required(str)
  2. # is the same as
  3. attr2 = Required(unicode)
  4. attr3 = Required(LongStr)
  5. # is the same as
  6. attr4 = Required(LongUnicode)

Buffer and bytes types in Python 2 and 3

If you need to represent byte sequence in Python 2, you can use the buffer type. In Python 3 you should use the bytes type for this purpose. buffer and bytes types are stored as binary (BLOB) types in the database.

In Python 3 the buffer type has gone, and Pony uses the bytes type which was added in Python 3 to represent binary data. But for the sake of backward compatibility we still keep buffer as an alias to the bytes type in Python 3. If you’re importing * from pony.orm you will get this alias too.

If you want to write code which can run both on Python 2 and Python 3, you should use the buffer type for binary attributes. If your code is for Python 3 only, you can use bytes instead:

  1. attr1 = Required(buffer) # Python 2 and 3
  2. attr2 = Required(bytes) # Python 3 only

It would be cool if we could use the bytes type as an alias to buffer in Python 2, but unfortunately it is impossible, because Python 2.6 adds bytes as a synonym for the str type.

Attribute options

Attribute options can be specified as positional and as keyword arguments during an attribute definition.

Max string length

String types can accept a positional argument which specifies the max length of this column in the database:

  1. class Person(db.Entity):
  2. name = Required(str, 40) # VARCHAR(40)

Also you can use the max_len option:

  1. class Person(db.Entity):
  2. name = Required(str, max_len=40) # VARCHAR(40)

Decimal scale and precision

For the Decimal type you can specify precision and scale:

  1. class Product(db.Entity):
  2. price = Required(Decimal, 10, 2) # DECIMAL(10, 2)

Also you can use precision and scale options:

  1. class Product(db.Entity):
  2. price = Required(Decimal, precision=10, scale=2) # DECIMAL(10, 2)

If you don’t specify the precision and scale parameters, Pony uses Decimal(precision=12, scale=2) values by default.

Datetime, time and timedelta precision

The datetime and time types accept a positional argument which specifies the column’s precision. By default it is equal to 6 for most databases.

For MySQL database the default value is 0. Before the MySQL version 5.6.4, the DATETIME and TIME columns were unable to store fractional seconds at all. Starting with the version 5.6.4, you can store fractional seconds if you set the precision equal to 6 during the attribute definition:

  1. class Action(db.Entity):
  2. dt = Required(datetime, 6)

The same, using the precision option:

  1. class Action(db.Entity):
  2. dt = Required(datetime, precision=6)

Keyword argument options

Additional attribute options can be set as keyword arguments. For example:

  1. class Customer(db.Entity):
  2. email = Required(str, unique=True)

Below you can find the list of available options:

auto

(bool) Can be used for a PrimaryKey attribute only. If auto=True then the value for this attribute will be assigned automatically using the database’s incremental counter or sequence.

autostrip

(bool) Automatically removes leading and trailing whitespace characters in a string attribute. Similar to Python string.strip() function. By default is True.

cascade_delete

(bool) Controls the cascade deletion of related objects. True means that Pony always does cascade delete even if the other side is defined as Optional. False means that Pony never does cascade delete for this relationship. If the relationship is defined as Required at the other end and cascade_delete=False then Pony raises the ConstraintError exception on deletion attempt. See also.

column

(str) Specifies the name of the column in the database table which is used for mapping. By default Pony uses the attribute name as the column name in the database.

columns

(list) Specifies the column names in the database table which are used for mapping a composite attribute.

default

(numeric|str|function) Allows specifying a default value for the attribute. Pony processes default values in Python, it doesn’t add SQL DEFAULT clause to the column definition. This is because the default expression can be not only a constant, but any arbitrary Python function. For example:

  1. import uuid
  2. from pony.orm import *
  3. db = Database()
  4. class MyEntity(db.Entity):
  5. code = Required(uuid.UUID, default=uuid.uuid4)

If you need to set a default value in the database, you should use the sql_default option.

fk_name

(str) Applies for Required and Optional relationship attributes, allows to specify the name of the foreign key in the database.

index

(bool|str) Allows to control index creation for this column. index=True - the index will be created with the default name. index='index_name' - create index with the specified name. index=False – skip index creation. If no ‘index’ option is specified then Pony still creates index for foreign keys using the default name.

lazy

(bool) When True, then Pony defers loading the attribute value when loading the object. The value will not be loaded until you try to access this attribute directly. By default lazy is set to True for LongStr and LongUnicode and to False for all other types.

max

(numeric) Allows specifying the maximum allowed value for numeric attributes (int, float, Decimal). If you will try to assign the value that is greater than the specified max value, you’ll get the ValueError exception.

max_len

(int) Sets the maximum length for string attributes.

min

(numeric) Allows specifying the minimum allowed value for numeric attributes (int, float, Decimal). If you will try to assign the value that is less than the specified min value, you’ll get the ValueError exception.

nplus1_threshold

(int) This parameter is used for fine tuning the threshold used for the N+1 problem solution.

nullable

(bool) True allows the column to be NULL in the database. Most likely you don’t need to specify this option because Pony sets it to the most appropriate value by default.

optimistic

(bool) True means this attribute will be used for automatic optimistic checks, see Optimistic concurrency control section. By default, this option is set to True for all attributes except attributes of float type - for float type attributes it is set to False by default.

See also volatile option.

precision

(int) Sets the precision for Decimal, time, timedelta, datetime attribute types.

py_check

(function) Allows to specify a function which will be used for checking the value before it is assigned to the attribute. The function should return True or False. Also it can raise the ValueError exception if the check failed.

  1. class Student(db.Entity):
  2. name = Required(str)
  3. gpa = Required(float, py_check=lambda val: val >= 0 and val <= 5)

reverse

(str) Specifies the attribute name at the other end which should be used for the relationship. It might be needed if there are more than one relationship between two entities.

reverse_column

(str) Used for a symmetric relationship in order to specify the name of the database column for the intermediate table.

reverse_columns

(list) Used for a symmetric relationship if the entity has a composite primary key. Allows you to specify the name of the database columns for the intermediate table.

scale

(int) Sets the scale for Decimal attribute types.

size

(int) For the int type you can specify the size of integer type that should be used in the database using the size keyword. This parameter receives the number of bits that should be used for representing an integer in the database. Allowed values are 8, 16, 24, 32 and 64:

  1. attr1 = Required(int, size=8) # 8 bit - TINYINT in MySQL
  2. attr2 = Required(int, size=16) # 16 bit - SMALLINT in MySQL
  3. attr3 = Required(int, size=24) # 24 bit - MEDIUMINT in MySQL
  4. attr4 = Required(int, size=32) # 32 bit - INTEGER in MySQL
  5. attr5 = Required(int, size=64) # 64 bit - BIGINT in MySQL

You can use the unsigned parameter to specify that the attribute is unsigned:

  1. attr1 = Required(int, size=8, unsigned=True) # TINYINT UNSIGNED in MySQL

The default value of the unsigned parameter is False. If unsigned is set to True, but size is not provided, size assumed to be 32 bits.

If current database does not support specified attribute size, the next bigger size is used. For example, PostgreSQL does not have MEDIUMINT numeric type, so INTEGER type will be used for an attribute with size 24.

Only MySQL actually supports unsigned types. For other databases the column will use signed numeric type which can hold all valid values for the specified unsigned type. For example, in PostgreSQL an unsigned attribute with size 16 will use INTEGER type. An unsigned attribute with size 64 can be represented only in MySQL and Oracle.

When the size is specified, Pony automatically assigns min and max values for this attribute. For example, a signed attribute with size 8 will receive min value -128 and max value 127, while unsigned attribute with the same size will receive min value 0 and max value 255. You can override min and max with your own values if necessary, but these values should not exceed the range implied by the size.

Starting with the Pony release 0.6 the long type is deprecated and if you want to store 64 bit integers in the database, you need to use int instead with size=64. If you don’t specify the size parameter, Pony will use the default integer type for the specific database.

sequence_name

(str) Allows to specify the sequence name used for PrimaryKey attributes. Oracle database only.

sql_default

(str) This option allows specifying the default SQL text which will be included to the CREATE TABLE SQL command. For example:

  1. class MyEntity(db.Entity):
  2. created_at = Required(datetime, sql_default='CURRENT_TIMESTAMP')
  3. closed = Required(bool, default=True, sql_default='1')

Specifying sql_default=True can be convenient when you have a Required attribute and the value for it is going to be calculated in the database during the INSERT command (e.g. by a trigger). None by default.

sql_type

(str) Sets a specific SQL type for the column.

unique

(bool) If True, then the database will check that the value of this attribute is unique.

unsigned

(bool) Allows creating unsigned types in the database. Also checks that the assigned value is positive.

table

(str) Used for many-to-many relationship only in order to specify the name of the intermediate table.

volatile

(bool) Usually you specify the value of the attribute in Python and Pony stores this value in the database. But sometimes you might want to have some logic in the database which changes the value for a column. For example, you can have a trigger in the database which updates the timestamp of the last object’s modification. In this case you want to have Pony to forget the value of the attribute on object’s update sent to the database and read it from the database at the next access attempt. Set volatile=True in order to let Pony know that this attribute can be changed in the database.

The volatile=True option can be combined with the sql_default option if the value for this attribute is going to be both created and updated by the database.

You can get the exception UnrepeatableReadError: Value ... was updated outside of current transaction if another transaction changes the value of the attribute which is used in the current transaction. Pony notifies about it because this situation can break the business logic of the application. If you don’t want Pony to protect you from such concurrent modifications you can set volatile=True for the attribute. This will turn the optimistic concurrency control off.

See also optimistic option.

Collection attribute methods

To-many attributes have methods that provide a convenient way of querying data. You can treat a to-many relationship attribute as a regular Python collection and use standard operations like in, not in, len. Also Pony provides the following methods:

class Set

  • __len__()

    Return the number of objects in the collection. If the collection is not loaded into cache, this methods loads all the collection instances into the cache first, and then returns the number of objects. Use this method if you are going to iterate over the objects and you need them loaded into the cache. If you don’t need the collection to be loaded into the memory, you can use the count() method.

    1. >>> p1 = Person[1]
    2. >>> Car[1] in p1.cars
    3. True
    4. >>> len(p1.cars)
    5. 2
  • add(item|iter)

    Add instances to a collection and establish a two-way relationship between entity instances:

    1. photo = Photo[123]
    2. photo.tags.add(Tag['Outdoors'])

    Now the instance of the Photo entity with the primary key 123 has a relationship with the Tag['Outdoors'] instance. The attribute photos of the Tag['Outdoors'] instance contains the reference to the Photo[123] as well.

    You can also establish several relationships at once passing the list of tags to the add() method:

    1. photo.tags.add([Tag['Party'], Tag['New Year']])
  • clear()

    Remove all items from the collection which means breaking relationships between entity instances.

  • copy()

    Return a Python set object which contains the same items as the given collection.

  • count(distinct=False)

    Return the number of objects in the collection. This method doesn’t load the collection instances into the cache, but generates an SQL query which returns the number of objects from the database. If you are going to work with the collection objects (iterate over the collection or change the object attributes), you might want to use the __len__() method.

  • create(\*kwargs*)

    Create an return an instance of the related entity and establishes a relationship with it:

    1. new_tag = Photo[123].tags.create(name='New tag')

    is an equivalent of the following:

    1. new_tag = Tag(name='New tag')
    2. Photo[123].tags.add(new_tag)
  • drop_table(with_all_data=False)

    Drop the intermediate table which is created for establishing many-to-many relationship. If the table is not empty and with_all_data=False, the method raises the TableIsNotEmpty exception and doesn’t delete anything. Setting the with_all_data=True allows you to delete the table even if it is not empty.

    1. class Product(db.Entity):
    2. tags = Set('Tag')
    3. class Tag(db.Entity):
    4. products = Set(Product)
    5. Product.tags.drop_table(with_all_data=True) # removes the intermediate table
  • is_empty()

    Check if the collection is empty. Returns False if there is at least one relationship and True if this attribute has no relationships.

    1. select(g for g in Group if not g.students.is_empty())
  • filter()

    Select objects from a collection. The method names select() and filter() are synonyms. Example:

    1. g = Group[101]
    2. g.students.filter(lambda student: student.gpa > 3)
  • load()

    Load all related objects from the database.

  • order_by(attr|lambda)

    Return an ordered collection.

    1. g.students.order_by(Student.name).page(2, pagesize=3)
    2. g.students.order_by(lambda s: s.name).limit(3, offset=3)
  • sort_by(attr|lambda)

    Return an ordered collection. For a collection, the sort_by method works the same way as order_by().

    1. g.students.sort_by(Student.name).page(2, pagesize=3)
    2. g.students.sort_by(lambda s: s.name).limit(3, offset=3)
  • page(pagenum, pagesize=10)

    This query can be used for displaying the second page of group 101 student’s list ordered by the name attribute:

    1. g.students.order_by(Student.name).page(2, pagesize=3)
    2. g.students.order_by(lambda s: s.name).limit(3, offset=3)
  • random(limit)

    Return a number of random objects from a collection.

    1. g = Group[101]
    2. g.students.random(2)
  • remove(item|iter)

    Remove an item or items from the collection and thus break the relationship between entity instances.

  • select()

    Select objects from a collection. The method names select() and filter() are synonyms. Example:

    1. g = Group[101]
    2. g.students.select(lambda student: student.gpa > 3)

Entity options

composite_index(attrs)

Combine an index from multiple attributes. Link.

composite_key(attrs)

Combine a secondary key from multiple attributes. Link.

_discriminator_

Specify the discriminator value for an entity. See more information in the Entity inheritance section.

PrimaryKey(attrs)

Combine a primary key from multiple attributes. Link.

_table_

Specify the name of mapped table in the database. See more information in the Mapping customization section.

_table_options_

All parameters specified here will be added as plain text at the end of the CREATE TABLE command. Example:

  1. class MyEntity(db.Entity):
  2. id = PrimaryKey(int)
  3. foo = Required(str)
  4. bar = Optional(int)
  5. _table_options_ = {
  6. 'ENGINE': 'InnoDB',
  7. 'TABLESPACE': 'my_tablespace',
  8. 'ENCRYPTION': "'N'",
  9. 'AUTO_INCREMENT': 10
  10. }

Entity hooks

Sometimes you might need to perform an action before or after your entity instance is going to be created, updated or deleted in the database. For this purpose you can use entity hooks.

Here is the list of available hooks:

after_delete()

Called after the entity instance is deleted in the database.

after_insert()

Called after the row is inserted into the database.

after_update()

Called after the instance updated in the database.

before_delete()

Called before deletion the entity instance in the database.

before_insert()

Called only for newly created objects before it is inserted into the database.

before_update()

Called for entity instances before updating the instance in the database.

In order to use a hook, you need to define an entity method with the hook name:

  1. class Message(db.Entity):
  2. title = Required(str)
  3. content = Required(str)
  4. def before_insert(self):
  5. print("Before insert! title=%s" % self.title)

Each hook method receives the instance of the object to be modified. You can check how it works in the interactive mode:

  1. >>> m = Message(title='First message', content='Hello, world!')
  2. >>> commit()
  3. Before insert! title=First message
  4. INSERT INTO "Message" ("title", "content") VALUES (?, ?)
  5. [u'First message', u'Hello, world!']