Writing custom model fields

Introduction

The model reference documentation explains how to use Django’s standard field classes – CharField, DateField, etc. For many purposes, those classes are all you’ll need. Sometimes, though, the Django version won’t meet your precise requirements, or you’ll want to use a field that is entirely different from those shipped with Django.

Django’s built-in field types don’t cover every possible database column type – only the common types, such as VARCHAR and INTEGER. For more obscure column types, such as geographic polygons or even user-created types such as PostgreSQL custom types, you can define your own Django Field subclasses.

Alternatively, you may have a complex Python object that can somehow be serialized to fit into a standard database column type. This is another case where a Field subclass will help you use your object with your models.

Our example object

Creating custom fields requires a bit of attention to detail. To make things easier to follow, we’ll use a consistent example throughout this document: wrapping a Python object representing the deal of cards in a hand of Bridge. Don’t worry, you don’t have to know how to play Bridge to follow this example. You only need to know that 52 cards are dealt out equally to four players, who are traditionally called north, east, south and west. Our class looks something like this:

    class Hand:
        """A hand of cards (bridge style)"""

        def __init__(self, north, east, south, west):
            # Input parameters are lists of cards ('Ah', '9s', etc.)
            self.north = north
            self.east = east
            self.south = south
            self.west = west

        # ... (other possibly useful methods omitted) ...

This is an ordinary Python class, with nothing Django-specific about it. We’d like to be able to do things like this in our models (we assume the hand attribute on the model is an instance of Hand):

    example = MyModel.objects.get(pk=1)
    print(example.hand.north)

    new_hand = Hand(north, east, south, west)
    example.hand = new_hand
    example.save()

We assign to and retrieve from the hand attribute in our model just like any other Python class. The trick is to tell Django how to handle saving and loading such an object.

In order to use the Hand class in our models, we do not have to change this class at all. This is ideal, because it means you can easily write model support for existing classes where you cannot change the source code.

Note

You might only want to take advantage of custom database column types and deal with the data as standard Python types in your models: strings or floats, for example. This case is similar to our Hand example, and we’ll note any differences as we go along.

Background theory

Database storage

Let’s start with model fields. If you break it down, a model field provides a way to take a normal Python object – string, boolean, datetime, or something more complex like Hand – and convert it to and from a format that is useful when dealing with the database. (Such a format is also useful for serialization, but as we’ll see later, that is easier once you have the database side under control.)

Fields in a model must somehow be converted to fit into an existing database column type. Different databases provide different sets of valid column types, but the rule is still the same: those are the only types you have to work with. Anything you want to store in the database must fit into one of those types.

Normally, you’re either writing a Django field to match a particular database column type, or you will need a way to convert your data to, say, a string.

For our Hand example, we could convert the card data to a string of 104 characters by concatenating all the cards together in a pre-determined order – say, all the north cards first, then the east, south and west cards. So Hand objects can be saved to text or character columns in the database.
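The flattening described above can be sketched in plain Python, without any Django involvement. This is only an illustration: the helper names hand_to_string and string_to_hand are hypothetical, and it assumes every card is written as exactly two characters (so a ten would be spelled with a one-character rank such as 'T').

```python
class Hand:
    """A hand of cards (bridge style), mirroring the class shown earlier."""

    def __init__(self, north, east, south, west):
        self.north = north
        self.east = east
        self.south = south
        self.west = west


def hand_to_string(hand):
    """Concatenate the cards in north, east, south, west order."""
    return "".join(
        "".join(cards)
        for cards in (hand.north, hand.east, hand.south, hand.west)
    )


def string_to_hand(text):
    """Split a 104-character string back into four lists of 13 cards."""
    if len(text) != 104:
        raise ValueError("expected 104 characters, got %d" % len(text))
    # Assumption: each card occupies exactly two characters.
    cards = [text[i:i + 2] for i in range(0, 104, 2)]
    return Hand(cards[0:13], cards[13:26], cards[26:39], cards[39:52])


# Build a 52-card deck and deal 13 cards to each player.
deck = [rank + suit for suit in "shdc" for rank in "23456789TJQKA"]
hand = Hand(deck[0:13], deck[13:26], deck[26:39], deck[39:52])
flat = hand_to_string(hand)
restored = string_to_hand(flat)
```

Because the order is fixed, the string alone is enough to rebuild the four lists, which is what makes a single character column sufficient.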

What does a field class do?

All of Django’s fields (and when we say fields in this document, we always mean model fields and not form fields) are subclasses of django.db.models.Field. Most of the information that Django records about a field is common to all fields – name, help text, uniqueness and so forth. Storing all that information is handled by Field. We’ll get into the precise details of what Field can do later on; for now, suffice it to say that everything descends from Field and then customizes key pieces of the class behavior.

It’s important to realize that a Django field class is not what is stored in your model attributes. The model attributes contain normal Python objects. The field classes you define in a model are actually stored in the Meta class when the model class is created (the precise details of how this is done are unimportant here). This is because the field classes aren’t necessary when you’re just creating and modifying attributes. Instead, they provide the machinery for converting between the attribute value and what is stored in the database or sent to the serializer.

Keep this in mind when creating your own custom fields. The Django Field subclass you write provides the machinery for converting between your Python instances and the database/serializer values in various ways (there are differences between storing a value and using a value for lookups, for example). If this sounds a bit tricky, don’t worry – it will become clearer in the examples below. Just remember that you will often end up creating two classes when you want a custom field:

  • The first class is the Python object that your users will manipulate. They will assign it to the model attribute, they will read from it for displaying purposes, things like that. This is the Hand class in our example.
  • The second class is the Field subclass. This is the class that knows how to convert your first class back and forth between its permanent storage form and the Python form.

Writing a field subclass

When planning your Field subclass, first give some thought to which existing Field class your new field is most similar to. Can you subclass an existing Django field and save yourself some work? If not, you should subclass the Field class, from which everything is descended.

Initializing your new field is a matter of separating out any arguments that are specific to your case from the common arguments and passing the latter to the __init__() method of Field (or your parent class).

In our example, we’ll call our field HandField. (It’s a good idea to call your Field subclass <Something>Field, so it’s easily identifiable as a Field subclass.) It doesn’t behave like any existing field, so we’ll subclass directly from Field:

    from django.db import models


    class HandField(models.Field):

        description = "A hand of cards (bridge style)"

        def __init__(self, *args, **kwargs):
            kwargs['max_length'] = 104
            super().__init__(*args, **kwargs)

Our HandField accepts most of the standard field options (see the list below), but we ensure it has a fixed length, since it only needs to hold 52 card values plus their suits; 104 characters in total.

Note

Many of Django’s model fields accept options that they don’t do anything with. For example, you can pass both editable and auto_now to a django.db.models.DateField and it will ignore the editable parameter (auto_now being set implies editable=False). No error is raised in this case.

This behavior simplifies the field classes, because they don’t need to check for options that aren’t necessary. They pass all the options to the parent class and then don’t use them later on. It’s up to you whether you want your fields to be more strict about the options they select, or to use the more permissive behavior of the current fields.

The Field.__init__() method takes the following parameters:

Field deconstruction

The counterpoint to writing your __init__() method is writing the deconstruct() method. It’s used during model migrations to tell Django how to take an instance of your new field and reduce it to a serialized form - in particular, what arguments to pass to __init__() to re-create it.

If you haven’t added any extra options on top of the field you inherited from, then there’s no need to write a new deconstruct() method. If, however, you’re changing the arguments passed in __init__() (like we are in HandField), you’ll need to supplement the values being passed.

deconstruct() returns a tuple of four items: the field’s attribute name, the full import path of the field class, the positional arguments (as a list), and the keyword arguments (as a dict). Note this is different from the deconstruct() method for custom classes, which returns a tuple of three things.

As a custom field author, you don’t need to care about the first two values; the base Field class has all the code to work out the field’s attribute name and import path. You do, however, have to care about the positional and keyword arguments, as these are likely the things you are changing.

For example, in our HandField class we’re always forcibly setting max_length in __init__(). The deconstruct() method on the base Field class will see this and try to return it in the keyword arguments; thus, we can drop it from the keyword arguments for readability:

    from django.db import models


    class HandField(models.Field):

        def __init__(self, *args, **kwargs):
            kwargs['max_length'] = 104
            super().__init__(*args, **kwargs)

        def deconstruct(self):
            name, path, args, kwargs = super().deconstruct()
            # max_length is always set in __init__(), so it need not be serialized.
            del kwargs["max_length"]
            return name, path, args, kwargs

If you add a new keyword argument, you need to write code in deconstruct() that puts its value into kwargs yourself. You should also omit the value from kwargs when it isn’t necessary to reconstruct the state of the field, such as when the default value is being used:

    from django.db import models


    class CommaSepField(models.Field):
        "Implements comma-separated storage of lists"

        def __init__(self, separator=",", *args, **kwargs):
            self.separator = separator
            super().__init__(*args, **kwargs)

        def deconstruct(self):
            name, path, args, kwargs = super().deconstruct()
            # Only include kwarg if it's not the default
            if self.separator != ",":
                kwargs['separator'] = self.separator
            return name, path, args, kwargs

More complex examples are beyond the scope of this document, but remember - for any configuration of your Field instance, deconstruct() must return arguments that you can pass to __init__() to reconstruct that state.

Pay extra attention if you set new default values for arguments in the Field superclass; you want to make sure they’re always included, rather than disappearing if they take on the old default value.

In addition, try to avoid returning values as positional arguments; where possible, return values as keyword arguments for maximum future compatibility. Of course, if you change the names of things more often than their position in the constructor’s argument list, you might prefer positional, but bear in mind that people will be reconstructing your field from the serialized version for quite a while (possibly years), depending on how long your migrations live.

You can see the results of deconstruction by looking in migrations that include the field, and you can test deconstruction in unit tests by deconstructing and reconstructing the field:

    name, path, args, kwargs = my_field_instance.deconstruct()
    new_instance = MyField(*args, **kwargs)
    self.assertEqual(my_field_instance.some_attribute, new_instance.some_attribute)

Changing a custom field’s base class

You can’t change the base class of a custom field because Django won’t detect the change and make a migration for it. For example, if you start with:

    class CustomCharField(models.CharField):
        ...

and then decide that you want to use TextField instead, you can’t change the subclass like this:

    class CustomCharField(models.TextField):
        ...

Instead, you must create a new custom field class and update your models to reference it:

    class CustomCharField(models.CharField):
        ...


    class CustomTextField(models.TextField):
        ...

As discussed in removing fields, you must retain the original CustomCharField class as long as you have migrations that reference it.

Documenting your custom field

As always, you should document your field type, so users will know what it is. In addition to providing a docstring for it, which is useful for developers, you can also allow users of the admin app to see a short description of the field type via the django.contrib.admindocs application. To do this, provide descriptive text in a description class attribute of your custom field. In the above example, the description displayed by the admindocs application for a HandField will be ‘A hand of cards (bridge style)’.

In the django.contrib.admindocs display, the field description is interpolated with field.__dict__, which allows the description to incorporate arguments of the field. For example, the description for CharField is:

    description = _("String (up to %(max_length)s)")
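The interpolation is ordinary Python %-formatting against a dictionary. The snippet below is a rough illustration that uses a stand-in object instead of a real Django field; FakeField is a hypothetical name, and the translation wrapper is left out to keep the example free of Django imports.

```python
class FakeField:
    """Hypothetical stand-in for a model field; only its attributes matter."""

    description = "String (up to %(max_length)s)"

    def __init__(self, max_length):
        self.max_length = max_length


field = FakeField(max_length=100)
# Named placeholders are looked up in the instance's attribute dictionary.
rendered = field.description % field.__dict__
```

Any instance attribute set in __init__() can therefore be referenced by name in the description string.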

Useful methods

Once you’ve created your Field subclass, you might consider overriding a few standard methods, depending on your field’s behavior. The list of methods below is in approximately decreasing order of importance, so start from the top.

Custom database types

Say you’ve created a PostgreSQL custom type called mytype. You can subclass Field and implement the db_type() method, like so:

    from django.db import models


    class MytypeField(models.Field):
        def db_type(self, connection):
            return 'mytype'

Once you have MytypeField, you can use it in any model, just like any other Field type:

    class Person(models.Model):
        name = models.CharField(max_length=80)
        something_else = MytypeField()

If you aim to build a database-agnostic application, you should account for differences in database column types. For example, the date/time column type in PostgreSQL is called timestamp, while the same column in MySQL is called datetime. You can handle this in a db_type() method by checking the connection.settings_dict['ENGINE'] attribute.

For example:

    from django.db import models


    class MyDateField(models.Field):
        def db_type(self, connection):
            if connection.settings_dict['ENGINE'] == 'django.db.backends.mysql':
                return 'datetime'
            else:
                return 'timestamp'

The db_type() and rel_db_type() methods are called by Django when the framework constructs the CREATE TABLE statements for your application – that is, when you first create your tables. The methods are also called when constructing a WHERE clause that includes the model field – that is, when you retrieve data using QuerySet methods like get(), filter(), and exclude() and have the model field as an argument. They are not called at any other time, so they can afford to execute slightly complex code, such as the connection.settings_dict check in the above example.

Some database column types accept parameters, such as CHAR(25), where the parameter 25 represents the maximum column length. In cases like these, it’s more flexible if the parameter is specified in the model rather than being hard-coded in the db_type() method. For example, it wouldn’t make much sense to have a CharMaxlength25Field, shown here:

    from django.db import models


    # This is a silly example of hard-coded parameters.
    class CharMaxlength25Field(models.Field):
        def db_type(self, connection):
            return 'char(25)'


    # In the model:
    class MyModel(models.Model):
        # ...
        my_field = CharMaxlength25Field()

The better way of doing this would be to make the parameter specifiable at run time – i.e., when the class is instantiated. To do that, implement Field.__init__(), like so:

    from django.db import models


    # This is a much more flexible example.
    class BetterCharField(models.Field):
        def __init__(self, max_length, *args, **kwargs):
            self.max_length = max_length
            super().__init__(*args, **kwargs)

        def db_type(self, connection):
            return 'char(%s)' % self.max_length


    # In the model:
    class MyModel(models.Model):
        # ...
        my_field = BetterCharField(25)

Finally, if your column requires truly complex SQL setup, return None from db_type(). This will cause Django’s SQL creation code to skip over this field. You are then responsible for creating the column in the right table in some other way, of course, but this gives you a way to tell Django to get out of the way.

The rel_db_type() method is called by fields such as ForeignKey and OneToOneField that point to another field to determine their database column data types. For example, if you have an UnsignedAutoField, you also need the foreign keys that point to that field to use the same data type:

    # MySQL unsigned integer (range 0 to 4294967295).
    class UnsignedAutoField(models.AutoField):
        def db_type(self, connection):
            return 'integer UNSIGNED AUTO_INCREMENT'

        def rel_db_type(self, connection):
            return 'integer UNSIGNED'

Converting values to Python objects

If your custom Field class deals with data structures that are more complex than strings, dates, integers, or floats, then you may need to override from_db_value() and to_python().

If present for the field subclass, from_db_value() will be called in all circumstances when the data is loaded from the database, including in aggregates and values() calls.

to_python() is called by deserialization and during the clean() method used from forms.

As a general rule, to_python() should deal gracefully with any of the following arguments:

  • An instance of the correct type (e.g., Hand in our ongoing example).
  • A string
  • None (if the field allows null=True)

In our HandField class, we’re storing the data as a VARCHAR field in the database, so we need to be able to process strings and None in from_db_value(). In to_python(), we also need to handle Hand instances:

    import re

    from django.core.exceptions import ValidationError
    from django.db import models
    from django.utils.translation import gettext_lazy as _


    def parse_hand(hand_string):
        """Takes a string of cards and splits into a full hand."""
        p1 = re.compile('.{26}')
        p2 = re.compile('..')
        args = [p2.findall(x) for x in p1.findall(hand_string)]
        if len(args) != 4:
            raise ValidationError(_("Invalid input for a Hand instance"))
        return Hand(*args)


    class HandField(models.Field):
        # ...

        def from_db_value(self, value, expression, connection):
            if value is None:
                return value
            return parse_hand(value)

        def to_python(self, value):
            if isinstance(value, Hand):
                return value

            if value is None:
                return value

            return parse_hand(value)

Notice that, apart from passing None through unchanged, we always return a Hand instance from these methods. That’s the Python object type we want to store in the model’s attribute.

For to_python(), if anything goes wrong during value conversion, you should raise a ValidationError exception.

Converting Python objects to query values

Since using a database requires conversion in both ways, if you override to_python() you also have to override get_prep_value() to convert Python objects back to query values.

For example:

    class HandField(models.Field):
        # ...

        def get_prep_value(self, value):
            # Nullable fields pass None through unchanged.
            if value is None:
                return None
            return ''.join(
                ''.join(cards)
                for cards in (value.north, value.east, value.south, value.west)
            )

Warning

If your custom field uses the CHAR, VARCHAR or TEXT types for MySQL, you must make sure that get_prep_value() always returns a string type. MySQL performs flexible and unexpected matching when a query is performed on these types and the provided value is an integer, which can cause queries to include unexpected objects in their results. This problem cannot occur if you always return a string type from get_prep_value().

Converting query values to database values

Some data types (for example, dates) need to be in a specific format before they can be used by a database backend. get_db_prep_value() is the method where those conversions should be made. The specific connection that will be used for the query is passed as the connection parameter. This allows you to use backend-specific conversion logic if it is required.

For example, Django uses the following method for its BinaryField:

    def get_db_prep_value(self, value, connection, prepared=False):
        value = super().get_db_prep_value(value, connection, prepared)
        if value is not None:
            return connection.Database.Binary(value)
        return value

In case your custom field needs a special conversion when being saved that is not the same as the conversion used for normal query parameters, you can override get_db_prep_save().

Preprocessing values before saving

If you want to preprocess the value just before saving, you can use pre_save(). For example, Django’s DateTimeField uses this method to set the attribute correctly in the case of auto_now or auto_now_add.

If you do override this method, you must return the value of the attribute at the end. You should also update the model’s attribute if you make any changes to the value so that code holding references to the model will always see the correct value.

Specifying the form field for a model field

To customize the form field used by ModelForm, you can override formfield().

The form field class can be specified via the form_class and choices_form_class arguments; the latter is used if the field has choices specified, the former otherwise. If these arguments are not provided, CharField or TypedChoiceField will be used.

All of the kwargs dictionary is passed directly to the form field’s __init__() method. Normally, all you need to do is set up a good default for the form_class (and maybe choices_form_class) argument and then delegate further handling to the parent class. This might require you to write a custom form field (and even a form widget). See the forms documentation for information about this.

Continuing our ongoing example, we can write the formfield() method as:

    class HandField(models.Field):
        # ...

        def formfield(self, **kwargs):
            # This is a fairly standard way to set up some defaults
            # while letting the caller override them.
            defaults = {'form_class': MyFormField}
            defaults.update(kwargs)
            return super().formfield(**defaults)

This assumes we’ve imported a MyFormField field class (which has its own default widget). This document doesn’t cover the details of writing custom form fields.
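The defaults-then-update pattern used in formfield() is plain dictionary merging: any value supplied by the caller replaces the field’s default. The Django-free sketch below only demonstrates that precedence; build_formfield_kwargs is a hypothetical helper, and the strings stand in for real form field classes.

```python
def build_formfield_kwargs(**kwargs):
    """Merge caller-supplied options over a field's defaults."""
    # Placeholder values; a real field would reference form field classes.
    defaults = {"form_class": "MyFormField", "required": True}
    defaults.update(kwargs)  # the caller's values win over the defaults
    return defaults


untouched = build_formfield_kwargs()
overridden = build_formfield_kwargs(form_class="OtherFormField", label="Hand")
```

Keys the caller does not mention keep their default values, so a field can choose sensible defaults without preventing per-form overrides.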

Emulating built-in field types

If you have created a db_type() method, you don’t need to worry about get_internal_type() – it won’t be used much. Sometimes, though, your database storage is similar in type to some other field, so you can use that other field’s logic to create the right column.

For example:

    class HandField(models.Field):
        # ...

        def get_internal_type(self):
            return 'CharField'

No matter which database backend we are using, this will mean that migrate and other SQL commands create the right column type for storing a string.

If get_internal_type() returns a string that is not known to Django for the database backend you are using – that is, it doesn’t appear in django.db.backends.<db_name>.base.DatabaseWrapper.data_types – the string will still be used by the serializer, but the default db_type() method will return None. See the documentation of db_type() for reasons why this might be useful. Putting a descriptive string in as the type of the field for the serializer is a useful idea if you’re ever going to be using the serializer output in some other place, outside of Django.

Converting field data for serialization

To customize how the values are serialized by a serializer, you can override value_to_string(). Using value_from_object() is the best way to get the field’s value prior to serialization. For example, since HandField uses strings for its data storage anyway, we can reuse some existing conversion code:

    class HandField(models.Field):
        # ...

        def value_to_string(self, obj):
            value = self.value_from_object(obj)
            return self.get_prep_value(value)

Some general advice

Writing a custom field can be a tricky process, particularly if you’re doing complex conversions between your Python types and your database and serialization formats. Here are a couple of tips to make things go more smoothly:

  • Look at the existing Django fields (in django/db/models/fields/__init__.py) for inspiration. Try to find a field that’s similar to what you want and extend it a little bit, instead of creating an entirely new field from scratch.
  • Put a __str__() method on the class you’re wrapping up as a field. There are a lot of places where the default behavior of the field code is to call str() on the value. (In our examples in this document, value would be a Hand instance, not a HandField). So if your __str__() method automatically converts to the string form of your Python object, you can save yourself a lot of work.
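For the Hand class, such a method could return the same concatenated form that is used for storage. This is a sketch rather than a required implementation; it assumes two-character card codes and reuses the north, east, south, west order described earlier.

```python
class Hand:
    """A hand of cards (bridge style)."""

    def __init__(self, north, east, south, west):
        self.north = north
        self.east = east
        self.south = south
        self.west = west

    def __str__(self):
        # Same order as the storage format: north, east, south, west.
        return "".join(
            "".join(cards)
            for cards in (self.north, self.east, self.south, self.west)
        )


# A deliberately tiny, incomplete deal, just to show the conversion.
hand = Hand(["Ah", "9s"], ["2c"], ["Kd"], ["Th"])
text = str(hand)
```

With this in place, any code path that falls back to str() on the value produces the string form directly.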

Writing a FileField subclass

In addition to the above methods, fields that deal with files have a few other special requirements which must be taken into account. The majority of the mechanics provided by FileField, such as controlling database storage and retrieval, can remain unchanged, leaving subclasses to deal with the challenge of supporting a particular type of file.

Django provides a File class, which is used as a proxy to the file’s contents and operations. This can be subclassed to customize how the file is accessed, and what methods are available. It lives at django.db.models.fields.files, and its default behavior is explained in the file documentation.

Once a subclass of File is created, the new FileField subclass must be told to use it. To do so, assign the new File subclass to the special attr_class attribute of the FileField subclass.

A few suggestions

In addition to the above details, there are a few guidelines which can greatly improve the efficiency and readability of the field’s code.

  • The source for Django’s own ImageField (in django/db/models/fields/files.py) is a great example of how to subclass FileField to support a particular type of file, as it incorporates all of the techniques described above.
  • Cache file attributes wherever possible. Since files may be stored in remote storage systems, retrieving them may cost extra time, or even money, that isn’t always necessary. Once a file is retrieved to obtain some data about its content, cache as much of that data as possible to reduce the number of times the file must be retrieved on subsequent calls for that information.
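One way to follow the caching advice is to compute a derived attribute once and store it on the wrapper object. The sketch below uses functools.cached_property from the standard library and an in-memory file; CachedFileInfo is a hypothetical class and is not how Django’s own file classes are implemented.

```python
import io
from functools import cached_property


class CachedFileInfo:
    """Hypothetical wrapper that caches data derived from a file's contents."""

    def __init__(self, fileobj):
        self._fileobj = fileobj
        self.reads = 0  # counts how often the contents are actually fetched

    def _read_all(self):
        self.reads += 1
        self._fileobj.seek(0)
        return self._fileobj.read()

    @cached_property
    def size(self):
        # Computed on first access, then stored on the instance.
        return len(self._read_all())


info = CachedFileInfo(io.BytesIO(b"hello world"))
first = info.size
second = info.size  # served from the cache; the file is not read again
```

If the underlying file can change, the cached value has to be invalidated explicitly, for example by deleting the attribute from the instance.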