Collations

See also

The API docs for collation.

Collations were introduced in MongoDB 3.4. They provide a set of rules to use when comparing strings that comply with the conventions of a particular language, such as Spanish or German. If no collation is specified, the server sorts strings based on a binary comparison. Many languages have specific ordering rules, and collations allow users to build applications that adhere to language-specific comparison rules.

In French, for example, the last accent in a given word determines the sorting order. The correct sorting order for the following four words in French is:

    cote < côte < coté < côté

Specifying a French collation allows users to sort string fields using the French sort order.
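
The effect of this rule can be illustrated without a server. The following is a minimal sketch in pure Python, not the ICU algorithm that the server uses: it compares base letters first and then compares accents starting from the end of the word. The name french_sort_key is hypothetical and models only this one rule.

```python
import unicodedata


def french_sort_key(word):
    """Simplified model of French accent ordering (not the ICU algorithm)."""
    # NFD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    base_letters = []
    accents = []  # one tuple of combining marks per base letter
    for ch in decomposed:
        if unicodedata.combining(ch):
            if accents:
                accents[-1] += (ord(ch),)
        else:
            base_letters.append(ch)
            accents.append(())
    # Base letters decide first; ties are broken by accents read from the end.
    return ("".join(base_letters), tuple(reversed(accents)))


words = ["côté", "coté", "côte", "cote"]
binary_order = sorted(words)                       # code point comparison
french_order = sorted(words, key=french_sort_key)  # last accent decides

print(binary_order)
print(french_order)
```

Under this model, french_order matches the sequence shown above, while binary_order places coté before côte because it compares code points from left to right.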

Usage

Users can specify a collation for a collection, an index, or a CRUD command.

Collation Parameters:

Collations can be specified with the Collation model or with plain Python dictionaries. The structure is the same:

    Collation(locale=<string>,
              caseLevel=<bool>,
              caseFirst=<string>,
              strength=<int>,
              numericOrdering=<bool>,
              alternate=<string>,
              maxVariable=<string>,
              backwards=<bool>)

The only required parameter is locale, which the server parses as an ICU format locale ID. For example, set locale to en_US to represent US English or fr_CA to represent Canadian French.

For a complete description of the available parameters, see the MongoDB manual.
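
The dictionary form mentioned above can be sketched as follows. This is an illustration only: collation_document is a hypothetical helper, not part of PyMongo, and the option names are taken from the parameter list in this section. It enforces the required locale and omits options that were left unset.

```python
# Option names taken from the parameter list in this section.
COLLATION_OPTIONS = {
    "caseLevel", "caseFirst", "strength", "numericOrdering",
    "alternate", "maxVariable", "backwards",
}


def collation_document(locale, **options):
    """Build a plain-dict collation (hypothetical helper, not a PyMongo API)."""
    if not isinstance(locale, str) or not locale:
        raise ValueError("locale is required and must be a non-empty string")
    unknown = set(options) - COLLATION_OPTIONS
    if unknown:
        raise ValueError(f"unknown collation options: {sorted(unknown)}")
    document = {"locale": locale}
    # Leave out unset options so that server defaults apply to them.
    document.update({k: v for k, v in options.items() if v is not None})
    return document


french_canadian = collation_document("fr_CA", strength=2, caseLevel=None)
print(french_canadian)
```

The resulting dictionary has the same structure as the Collation model and can be supplied wherever a collation is accepted.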

Assign a Default Collation to a Collection

The following example demonstrates how to create a new collection called contacts and assign a default collation with the fr_CA locale. This operation ensures that all queries that are run against the contacts collection use the fr_CA collation unless another collation is explicitly specified:

    from pymongo import MongoClient
    from pymongo.collation import Collation

    db = MongoClient().test
    # Create 'contacts' with fr_CA as its default collation.
    collection = db.create_collection('contacts',
                                      collation=Collation(locale='fr_CA'))

Assign a Default Collation to an Index

When creating a new index, you can specify a default collation.

The following example shows how to create an index on the name field of the contacts collection, with the unique parameter enabled and a default collation with locale set to fr_CA:

    from pymongo import MongoClient
    from pymongo.collation import Collation

    contacts = MongoClient().test.contacts
    # Unique index on 'name' that compares values with the fr_CA collation.
    contacts.create_index('name',
                          unique=True,
                          collation=Collation(locale='fr_CA'))

Specify a Collation for a Query

Individual queries can specify a collation to use when sorting results. The following example demonstrates a query that runs on the contacts collection in database test. It matches on documents that contain New York in the city field, and sorts on the name field with the fr_CA collation:

    from pymongo import MongoClient
    from pymongo.collation import Collation

    collection = MongoClient().test.contacts
    # Match on city, then sort by name using the fr_CA collation.
    docs = collection.find({'city': 'New York'}).sort('name').collation(
        Collation(locale='fr_CA'))

Other Query Types

You can use collations to control document matching rules for several different types of queries. All the various update and delete methods (update_one(), update_many(), delete_one(), etc.) support collation, and you can create query filters which employ collations to comply with any of the languages and variants available to the locale parameter.

The following example uses a collation with strength set to SECONDARY, which compares base characters and accents but ignores other differences such as case. All documents in the contacts collection with jürgen (case-insensitive) in the first_name field are updated:

    from pymongo import MongoClient
    from pymongo.collation import Collation, CollationStrength

    contacts = MongoClient().test.contacts
    # SECONDARY strength: accents are significant, case is not.
    result = contacts.update_many(
        {'first_name': 'jürgen'},
        {'$set': {'verified': 1}},
        collation=Collation(locale='de',
                            strength=CollationStrength.SECONDARY))
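
To make the matching behavior concrete, the sketch below approximates SECONDARY strength in pure Python. It is a rough model under stated assumptions, not what the server runs: it treats case as the only ignored difference, and the function names are hypothetical.

```python
import unicodedata


def secondary_key(text):
    """Approximate a SECONDARY-strength key: keep accents, drop case."""
    # NFC keeps accented letters as single code points; casefold() removes case.
    return unicodedata.normalize("NFC", text).casefold()


def matches_at_secondary_strength(stored, query):
    # Two strings match when their keys are equal.
    return secondary_key(stored) == secondary_key(query)


names = ["jürgen", "Jürgen", "JÜRGEN", "jurgen", "Jurgen"]
matched = [n for n in names if matches_at_secondary_strength(n, "jürgen")]
print(matched)
```

In this model the three spellings with ü match regardless of case, while jurgen and Jurgen do not, because the accent still counts as a difference.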