collection – Collection level operations

Collection level utilities for Mongo.

  • pymongo.ASCENDING = 1
  • Ascending sort order.
  • pymongo.DESCENDING = -1
  • Descending sort order.
  • pymongo.GEOHAYSTACK = 'geoHaystack'
  • Index specifier for a 2-dimensional haystack index.

New in version 2.1.

New in version 2.5.

  • pymongo.HASHED = 'hashed'
  • Index specifier for a hashed index.

New in version 2.5.

  • pymongo.TEXT = 'text'
  • Index specifier for a text index.

New in version 2.7.1.

  • class pymongo.collection.ReturnDocument
  • An enum used with find_one_and_replace() and find_one_and_update().

    • BEFORE
    • Return the original document before it was updated/replaced, or None if no document matches the query.

    • AFTER

    • Return the updated/replaced or inserted document.
  • class pymongo.collection.Collection(database, name, create=False, **kwargs)
  • Get / create a Mongo collection.

Raises TypeError if name is not an instance of basestring (str in Python 3). Raises InvalidName if name is not a valid collection name. Any additional keyword arguments will be used as options passed to the create command. See create_collection() for valid options.

If create is True, collation is specified, or any additional keyword arguments are present, a create command will be sent, using session if specified. Otherwise, a create command will not be sent and the collection will be created implicitly on first use. The optional session argument is only used for the create command; it is not associated with the collection afterward.

Parameters:

  • database: the database to get a collection from
  • name: the name of the collection to get
  • create (optional): if True, force collection creation even without options being set
  • codec_options (optional): An instance of CodecOptions. If None (the default) database.codec_options is used.
  • read_preference (optional): The read preference to use. If None (the default) database.read_preference is used.
  • write_concern (optional): An instance of WriteConcern. If None (the default) database.write_concern is used.
  • read_concern (optional): An instance of ReadConcern. If None (the default) database.read_concern is used.
  • collation (optional): An instance of Collation. If a collation is provided, it will be passed to the create collection command. This option is only supported on MongoDB 3.4 and above.
  • session (optional): a ClientSession that is used with the create collection command
  • **kwargs (optional): additional keyword arguments will be passed as options for the create collection command
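The rule for when a create command is sent can be restated in a few lines. This is only a sketch of the documented condition, not PyMongo's implementation; `sends_create_command` is a hypothetical helper, and the named settings (codec_options, read_preference, and so on) are left out because they are not create options.

```python
def sends_create_command(create=False, collation=None, **kwargs):
    """Return True when constructing a Collection with these arguments
    would send an explicit create command, per the rule described above.

    Hypothetical helper for illustration; it only mirrors the documented
    condition and never talks to a server.
    """
    return bool(create) or collation is not None or bool(kwargs)


# Plain lookup: the collection is created implicitly on first use.
print(sends_create_command())
# Extra keyword arguments are forwarded as options to the create command.
print(sends_create_command(capped=True, size=4096))
```

Under this reading, passing any create option (or a collation) has the same effect as passing create=True.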

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Support the collation option.

Changed in version 3.2: Added the read_concern option.

Changed in version 3.0: Added the codec_options, read_preference, and write_concern options. Removed the uuid_subtype attribute. Collection no longer returns an instance of Collection for attribute names with leading underscores. You must use dict-style lookups instead:

    collection['__my_collection__']

Not:

    collection.__my_collection__

Changed in version 2.2: Removed deprecated argument: options

New in version 2.1: uuid_subtype attribute

See also

The MongoDB documentation on collections

  • c[name] || c.name
  • Get the name sub-collection of Collection c.

Raises InvalidName if an invalid collection name is used.

The full name is of the form database_name.collection_name.

  • name
  • The name of this Collection.

  • database
  • The Database that this Collection is a part of.

  • codec_options
  • Read only access to the CodecOptions of this instance.

  • read_preference
  • Read only access to the read preference of this instance.

Changed in version 3.0: The read_preference attribute is now read only.

  • write_concern
  • Read only access to the WriteConcern of this instance.

Changed in version 3.0: The write_concern attribute is now read only.

  • read_concern
  • Read only access to the ReadConcern of this instance.

New in version 3.2.

  • with_options(codec_options=None, read_preference=None, write_concern=None, read_concern=None)
  • Get a clone of this collection changing the specified settings.

    >>> coll1.read_preference
    Primary()
    >>> from pymongo import ReadPreference
    >>> coll2 = coll1.with_options(read_preference=ReadPreference.SECONDARY)
    >>> coll1.read_preference
    Primary()
    >>> coll2.read_preference
    Secondary(tag_sets=None)

Parameters:

  • codec_options (optional): An instance of CodecOptions. If None (the default) the codec_options of this Collection is used.
  • read_preference (optional): The read preference to use. If None (the default) the read_preference of this Collection is used. See read_preferences for options.
  • write_concern (optional): An instance of WriteConcern. If None (the default) the write_concern of this Collection is used.
  • read_concern (optional): An instance of ReadConcern. If None (the default) the read_concern of this Collection is used.

  • bulk_write(requests, ordered=True, bypass_document_validation=False, session=None)
  • Send a batch of write operations to the server.

Requests are passed as a list of write operation instances (InsertOne, UpdateOne, UpdateMany, ReplaceOne, DeleteOne, or DeleteMany).

    >>> for doc in db.test.find({}):
    ...     print(doc)
    ...
    {u'x': 1, u'_id': ObjectId('54f62e60fba5226811f634ef')}
    {u'x': 1, u'_id': ObjectId('54f62e60fba5226811f634f0')}
    >>> # DeleteMany, UpdateOne, and UpdateMany are also available.
    ...
    >>> from pymongo import InsertOne, DeleteOne, ReplaceOne
    >>> requests = [InsertOne({'y': 1}), DeleteOne({'x': 1}),
    ...             ReplaceOne({'w': 1}, {'z': 1}, upsert=True)]
    >>> result = db.test.bulk_write(requests)
    >>> result.inserted_count
    1
    >>> result.deleted_count
    1
    >>> result.modified_count
    0
    >>> result.upserted_ids
    {2: ObjectId('54f62ee28891e756a6e1abd5')}
    >>> for doc in db.test.find({}):
    ...     print(doc)
    ...
    {u'x': 1, u'_id': ObjectId('54f62e60fba5226811f634f0')}
    {u'y': 1, u'_id': ObjectId('54f62ee2fba5226811f634f1')}
    {u'z': 1, u'_id': ObjectId('54f62ee28891e756a6e1abd5')}

Parameters:

  • requests: A list of write operations (see examples above).
  • ordered (optional): If True (the default) requests will be performed on the server serially, in the order provided. If an error occurs all remaining operations are aborted. If False requests will be performed on the server in arbitrary order, possibly in parallel, and all operations will be attempted.
  • bypass_document_validation (optional): If True, allows the write to opt out of document level validation. Default is False.
  • session (optional): a ClientSession.

Returns:

An instance of BulkWriteResult.
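The difference between ordered and unordered execution is easiest to see in a small simulation. The sketch below is a client-side illustration only: `run_batch`, `ok`, and `fails` are hypothetical names, the callables stand in for write requests, and the loop runs sequentially even though the server may run unordered requests in arbitrary order or in parallel.

```python
def run_batch(operations, ordered=True):
    """Simulate how a batch of write operations is attempted.

    `operations` is a list of zero-argument callables standing in for
    write requests. Returns (indexes_that_succeeded, {index: error}).
    """
    succeeded, errors = [], {}
    for index, operation in enumerate(operations):
        try:
            operation()
            succeeded.append(index)
        except Exception as exc:
            errors[index] = exc
            if ordered:
                break  # ordered: all remaining operations are aborted
    return succeeded, errors


def ok():
    return None


def fails():
    raise ValueError("simulated write error")


print(run_batch([ok, fails, ok], ordered=True)[0])
print(run_batch([ok, fails, ok], ordered=False)[0])
```

With ordered=True only the first operation succeeds, because the failure at index 1 stops the batch; with ordered=False the operation at index 2 is still attempted.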

See also

Why does PyMongo add an _id field to all of my documents?

Note

bypass_document_validation requires server version >= 3.2

Changed in version 3.6: Added session parameter.

Changed in version 3.2: Added bypass_document_validation support

New in version 3.0.

  • insert_one(document, bypass_document_validation=False, session=None)
  • Insert a single document.

    >>> db.test.count_documents({'x': 1})
    0
    >>> result = db.test.insert_one({'x': 1})
    >>> result.inserted_id
    ObjectId('54f112defba522406c9cc208')
    >>> db.test.find_one({'x': 1})
    {u'x': 1, u'_id': ObjectId('54f112defba522406c9cc208')}

Parameters:

  • document: The document to insert. Must be a mutable mapping type. If the document does not have an _id field one will be added automatically.
  • bypass_document_validation (optional): If True, allows the write to opt out of document level validation. Default is False.
  • session (optional): a ClientSession.

Returns:

An instance of InsertOneResult.
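The requirement that the document be a mutable mapping follows from the _id being added to the caller's own object. A minimal sketch of that behaviour, assuming a stand-in identifier: `ensure_id` is a hypothetical helper, and a random UUID hex string is used here only for illustration, whereas PyMongo generates an ObjectId.

```python
import uuid
from collections.abc import MutableMapping


def ensure_id(document):
    """Add an '_id' to `document` in place if it lacks one and return it.

    Illustration only: the UUID value and the TypeError check are this
    sketch's choices, not a copy of the driver's code.
    """
    if not isinstance(document, MutableMapping):
        raise TypeError("document must be a mutable mapping")
    if "_id" not in document:
        document["_id"] = uuid.uuid4().hex
    return document["_id"]


doc = {"x": 1}
new_id = ensure_id(doc)
print("_id" in doc)              # the caller's dict was modified in place
print(ensure_id(doc) == new_id)  # an existing _id is left untouched
```

This is why a dict passed to insert_one() carries an _id afterwards, and why reusing the same dict for a second insert would reuse that _id.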

See also

Why does PyMongo add an _id field to all of my documents?

Note

bypass_document_validation requires server version >= 3.2

Changed in version 3.6: Added session parameter.

Changed in version 3.2: Added bypass_document_validation support

New in version 3.0.

  • insert_many(documents, ordered=True, bypass_document_validation=False, session=None)
  • Insert an iterable of documents.

    >>> db.test.count_documents({})
    0
    >>> result = db.test.insert_many([{'x': i} for i in range(2)])
    >>> result.inserted_ids
    [ObjectId('54f113fffba522406c9cc20e'), ObjectId('54f113fffba522406c9cc20f')]
    >>> db.test.count_documents({})
    2

Parameters:

  • documents: An iterable of documents to insert.
  • ordered (optional): If True (the default) documents will be inserted on the server serially, in the order provided. If an error occurs all remaining inserts are aborted. If False, documents will be inserted on the server in arbitrary order, possibly in parallel, and all document inserts will be attempted.
  • bypass_document_validation (optional): If True, allows the write to opt out of document level validation. Default is False.
  • session (optional): a ClientSession.

Returns:

An instance of InsertManyResult.

See also

Why does PyMongo add an _id field to all of my documents?

Note

bypass_document_validation requires server version >= 3.2

Changed in version 3.6: Added session parameter.

Changed in version 3.2: Added bypass_document_validation support

New in version 3.0.

  • replace_one(filter, replacement, upsert=False, bypass_document_validation=False, collation=None, session=None)
  • Replace a single document matching the filter.

    >>> for doc in db.test.find({}):
    ...     print(doc)
    ...
    {u'x': 1, u'_id': ObjectId('54f4c5befba5220aa4d6dee7')}
    >>> result = db.test.replace_one({'x': 1}, {'y': 1})
    >>> result.matched_count
    1
    >>> result.modified_count
    1
    >>> for doc in db.test.find({}):
    ...     print(doc)
    ...
    {u'y': 1, u'_id': ObjectId('54f4c5befba5220aa4d6dee7')}

The upsert option can be used to insert a new document if a matching document does not exist.

    >>> result = db.test.replace_one({'x': 1}, {'x': 1}, True)
    >>> result.matched_count
    0
    >>> result.modified_count
    0
    >>> result.upserted_id
    ObjectId('54f11e5c8891e756a6e1abd4')
    >>> db.test.find_one({'x': 1})
    {u'x': 1, u'_id': ObjectId('54f11e5c8891e756a6e1abd4')}

Parameters:

  • filter: A query that matches the document to replace.
  • replacement: The new document.
  • upsert (optional): If True, perform an insert if no documents match the filter.
  • bypass_document_validation (optional): If True, allows the write to opt out of document level validation. Default is False.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.
  • session (optional): a ClientSession.

Returns:

An instance of UpdateResult.

Note

bypass_document_validation requires server version >= 3.2

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Added the collation option.

Changed in version 3.2: Added bypass_document_validation support

New in version 3.0.

  • update_one(filter, update, upsert=False, bypass_document_validation=False, collation=None, array_filters=None, session=None)
  • Update a single document matching the filter.

    >>> for doc in db.test.find():
    ...     print(doc)
    ...
    {u'x': 1, u'_id': 0}
    {u'x': 1, u'_id': 1}
    {u'x': 1, u'_id': 2}
    >>> result = db.test.update_one({'x': 1}, {'$inc': {'x': 3}})
    >>> result.matched_count
    1
    >>> result.modified_count
    1
    >>> for doc in db.test.find():
    ...     print(doc)
    ...
    {u'x': 4, u'_id': 0}
    {u'x': 1, u'_id': 1}
    {u'x': 1, u'_id': 2}

Parameters:

  • filter: A query that matches the document to update.
  • update: The modifications to apply.
  • upsert (optional): If True, perform an insert if no documents match the filter.
  • bypass_document_validation (optional): If True, allows the write to opt out of document level validation. Default is False.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.
  • array_filters (optional): A list of filters specifying which array elements an update should apply to. Requires MongoDB 3.6+.
  • session (optional): a ClientSession.

Returns:

An instance of UpdateResult.
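The array_filters parameter works together with the `$[<identifier>]` positional operator in the update document. The sketch below only builds the three arguments; `bonus_for_low_scores` is a hypothetical helper, and the `grades` array with a `score` field is an assumed document shape, not something defined by this API.

```python
def bonus_for_low_scores(student_id, below, bonus):
    """Build (filter, update, array_filters) for update_one().

    Adds `bonus` to every element of the assumed `grades` array whose
    `score` is less than `below`.
    """
    filter_doc = {"_id": student_id}
    # "$[g]" refers to the array elements selected by the filter named "g".
    update_doc = {"$inc": {"grades.$[g].score": bonus}}
    array_filters = [{"g.score": {"$lt": below}}]
    return filter_doc, update_doc, array_filters


filter_doc, update_doc, array_filters = bonus_for_low_scores(42, below=60, bonus=5)
print(update_doc)
print(array_filters)
# With a real collection (requires MongoDB 3.6+):
# result = db.students.update_one(filter_doc, update_doc,
#                                 array_filters=array_filters)
```

Keeping the identifier (`g` here) identical in the update path and in the filter is what ties the two together.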

Note

bypass_document_validation requires server version >= 3.2

Changed in version 3.9: Added the ability to accept a pipeline as the update.

Changed in version 3.6: Added the array_filters and session parameters.

Changed in version 3.4: Added the collation option.

Changed in version 3.2: Added bypass_document_validation support

New in version 3.0.

  • update_many(filter, update, upsert=False, array_filters=None, bypass_document_validation=False, collation=None, session=None)
  • Update one or more documents that match the filter.

    >>> for doc in db.test.find():
    ...     print(doc)
    ...
    {u'x': 1, u'_id': 0}
    {u'x': 1, u'_id': 1}
    {u'x': 1, u'_id': 2}
    >>> result = db.test.update_many({'x': 1}, {'$inc': {'x': 3}})
    >>> result.matched_count
    3
    >>> result.modified_count
    3
    >>> for doc in db.test.find():
    ...     print(doc)
    ...
    {u'x': 4, u'_id': 0}
    {u'x': 4, u'_id': 1}
    {u'x': 4, u'_id': 2}

Parameters:

  • filter: A query that matches the documents to update.
  • update: The modifications to apply.
  • upsert (optional): If True, perform an insert if no documents match the filter.
  • bypass_document_validation (optional): If True, allows the write to opt out of document level validation. Default is False.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.
  • array_filters (optional): A list of filters specifying which array elements an update should apply to. Requires MongoDB 3.6+.
  • session (optional): a ClientSession.

Returns:

An instance of UpdateResult.

Note

bypass_document_validation requires server version >= 3.2

Changed in version 3.9: Added the ability to accept a pipeline as the update.

Changed in version 3.6: Added array_filters and session parameters.

Changed in version 3.4: Added the collation option.

Changed in version 3.2: Added bypass_document_validation support

New in version 3.0.

  • delete_one(filter, collation=None, session=None)
  • Delete a single document matching the filter.

    >>> db.test.count_documents({'x': 1})
    3
    >>> result = db.test.delete_one({'x': 1})
    >>> result.deleted_count
    1
    >>> db.test.count_documents({'x': 1})
    2

Parameters:

  • filter: A query that matches the document to delete.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.
  • session (optional): a ClientSession.

Returns:

An instance of DeleteResult.

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Added the collation option.

New in version 3.0.

  • delete_many(filter, collation=None, session=None)
  • Delete one or more documents matching the filter.

    >>> db.test.count_documents({'x': 1})
    3
    >>> result = db.test.delete_many({'x': 1})
    >>> result.deleted_count
    3
    >>> db.test.count_documents({'x': 1})
    0

Parameters:

  • filter: A query that matches the documents to delete.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.
  • session (optional): a ClientSession.

Returns:

An instance of DeleteResult.

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Added the collation option.

New in version 3.0.

  • aggregate(pipeline, session=None, **kwargs)
  • Perform an aggregation using the aggregation framework on this collection.

All optional aggregate command parameters should be passed as keyword arguments to this method. Valid options include, but are not limited to:

  • allowDiskUse (bool): Enables writing to temporary files. When set to True, aggregation stages can write data to the _tmp subdirectory of the --dbpath directory. The default is False.
  • maxTimeMS (int): The maximum amount of time to allow the operation to run in milliseconds.
  • batchSize (int): The maximum number of documents to return per batch. Ignored if the connected mongod or mongos does not support returning aggregate results using a cursor, or useCursor is False.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.
  • useCursor (bool): Deprecated. Will be removed in PyMongo 4.0.

The aggregate() method obeys the read_preference of this Collection, except when $out or $merge are used, in which case PRIMARY is used.

Note

This method does not support the 'explain' option. Please use command() instead. An example is included in the Aggregation Framework documentation.

Note

The write_concern of this collection is automatically applied to this operation when using MongoDB >= 3.4.

Parameters:

  • pipeline: a list of aggregation pipeline stages
  • session (optional): a ClientSession.
  • **kwargs (optional): See list of options above.

Returns:

A CommandCursor over the result set.
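A pipeline is an ordinary list of stage documents, and the aggregate command options listed above are passed as keyword arguments. The sketch below assumes a `collection` object with an aggregate() method as documented here; `top_values_pipeline` and `run_top_values` are hypothetical helpers, and the field name passed in is whatever your documents use.

```python
def top_values_pipeline(field, limit):
    """Build a pipeline that counts documents per value of `field`
    and keeps the `limit` most frequent values."""
    return [
        {"$group": {"_id": "$" + field, "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
        {"$limit": limit},
    ]


def run_top_values(collection, field, limit=5):
    """Run the pipeline; command options go in as keyword arguments."""
    pipeline = top_values_pipeline(field, limit)
    cursor = collection.aggregate(pipeline, allowDiskUse=True, maxTimeMS=5000)
    return list(cursor)


print(top_values_pipeline("tags", 3))
# With a real collection: results = run_top_values(db.test, "tags", 3)
```

Building the pipeline in a separate function keeps it easy to inspect or log before it is sent to the server.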

Changed in version 3.9: Apply this collection’s read concern to pipelines containing the $out stage when connected to MongoDB >= 4.2. Added support for the $merge pipeline stage. Aggregations that write always use read preference PRIMARY.

Changed in version 3.6: Added the session parameter. Added the maxAwaitTimeMS option. Deprecated the useCursor option.

Changed in version 3.4: Apply this collection’s write concern automatically to this operation when connected to MongoDB >= 3.4. Support the collation option.

Changed in version 3.0: The aggregate() method always returns a CommandCursor. The pipeline argument must be a list.

Changed in version 2.7: When the cursor option is used, return CommandCursor instead of Cursor.

Changed in version 2.6: Added cursor support.

New in version 2.3.

See also

Aggregation Examples

  • aggregate_raw_batches(pipeline, **kwargs)
  • Perform an aggregation and retrieve batches of raw BSON.

Similar to the aggregate() method but returns a RawBatchCursor.

This example demonstrates how to work with raw batches, but in practice raw batches should be passed to an external library that can decode BSON into another data type, rather than used with PyMongo’s bson module.

    >>> import bson
    >>> cursor = db.test.aggregate_raw_batches([
    ...     {'$project': {'x': {'$multiply': [2, '$x']}}}])
    >>> for batch in cursor:
    ...     print(bson.decode_all(batch))

Note

aggregate_raw_batches does not support sessions or auto encryption.
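When handing raw batches to another library, it can help to split a batch into individual documents first. The sketch below assumes each batch is a byte string of back-to-back BSON documents (the form bson.decode_all consumes in the example above) and relies only on the BSON rule that every document starts with its total size as a little-endian int32. `split_raw_batch` is a hypothetical helper, not part of PyMongo.

```python
import struct


def split_raw_batch(batch):
    """Split concatenated BSON documents into a list of per-document
    byte strings without decoding any fields."""
    documents = []
    offset = 0
    while offset < len(batch):
        if offset + 4 > len(batch):
            raise ValueError("truncated BSON size prefix")
        # The size counts the 4-byte prefix itself and the trailing NUL.
        (size,) = struct.unpack_from("<i", batch, offset)
        if size < 5 or offset + size > len(batch):
            raise ValueError("truncated or corrupt BSON document")
        documents.append(bytes(batch[offset:offset + size]))
        offset += size
    return documents


empty_doc = b"\x05\x00\x00\x00\x00"                         # {}
x_is_1 = b"\x0c\x00\x00\x00\x10x\x00\x01\x00\x00\x00\x00"  # {"x": 1} as int32
print(len(split_raw_batch(empty_doc + x_is_1)))
```

The two hand-written byte strings stand in for a batch returned by the cursor; each resulting slice is a complete BSON document that a decoder can process on its own.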

New in version 3.6.

  • watch(pipeline=None, full_document=None, resume_after=None, max_await_time_ms=None, batch_size=None, collation=None, start_at_operation_time=None, session=None, start_after=None)
  • Watch changes on this collection.

Performs an aggregation with an implicit initial $changeStream stage and returns a CollectionChangeStream cursor which iterates over changes on this collection.

Introduced in MongoDB 3.6.

    with db.collection.watch() as stream:
        for change in stream:
            print(change)

The CollectionChangeStream iterable blocks until the next change document is returned or an error is raised. If the next() method encounters a network error when retrieving a batch from the server, it will automatically attempt to recreate the cursor such that no change events are missed. Any error encountered during the resume attempt indicates there may be an outage and will be raised.

    import logging
    import pymongo

    try:
        with db.collection.watch(
                [{'$match': {'operationType': 'insert'}}]) as stream:
            for insert_change in stream:
                print(insert_change)
    except pymongo.errors.PyMongoError:
        # The ChangeStream encountered an unrecoverable error or the
        # resume attempt failed to recreate the cursor.
        logging.error('...')

For a precise description of the resume process see the change streams specification.

Note

Using this helper method is preferred to directly calling aggregate() with a $changeStream stage, for the purpose of supporting resumability.

Warning

This Collection’s read_concern must be ReadConcern("majority") in order to use the $changeStream stage.

Parameters:

  • pipeline (optional): A list of aggregation pipeline stages to append to an initial $changeStream stage. Not all pipeline stages are valid after a $changeStream stage, see the MongoDB documentation on change streams for the supported stages.
  • full_document (optional): The fullDocument to pass as an option to the $changeStream stage. Allowed values: 'updateLookup'. When set to 'updateLookup', the change notification for partial updates will include both a delta describing the changes to the document, as well as a copy of the entire document that was changed from some time after the change occurred.
  • resume_after (optional): A resume token. If provided, the change stream will start returning changes that occur directly after the operation specified in the resume token. A resume token is the _id value of a change document.
  • max_await_time_ms (optional): The maximum time in milliseconds for the server to wait for changes before responding to a getMore operation.
  • batch_size (optional): The maximum number of documents to return per batch.
  • collation (optional): The Collation to use for the aggregation.
  • start_at_operation_time (optional): If provided, the resulting change stream will only return changes that occurred at or after the specified Timestamp. Requires MongoDB >= 4.0.
  • session (optional): a ClientSession.
  • start_after (optional): The same as resume_after except that start_after can resume notifications after an invalidate event. This option and resume_after are mutually exclusive.

Returns:

A CollectionChangeStream cursor.
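Because a resume token is the _id value of a change document, an application can remember the last token it processed and pass it back as resume_after after a restart. A minimal sketch, assuming a `collection` object with the watch() method documented above; `follow_changes` and the `handle` callback are hypothetical names.

```python
def follow_changes(collection, handle, resume_token=None):
    """Process change documents and return the last resume token seen.

    With a real collection the loop blocks waiting for changes, so this
    function returns only once the stream is closed.
    """
    kwargs = {}
    if resume_token is not None:
        kwargs["resume_after"] = resume_token
    with collection.watch(**kwargs) as stream:
        for change in stream:
            handle(change)
            # Remember the token only after the change was handled.
            resume_token = change["_id"]
    return resume_token


# Usage with a real collection (token persistence is up to the caller):
# token = follow_changes(db.collection, print, resume_token=saved_token)
```

Updating the token after `handle` returns means a crash inside the handler leads to that change being delivered again on resume rather than skipped.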

Changed in version 3.9: Added the start_after parameter.

Changed in version 3.7: Added the start_at_operation_time parameter.

New in version 3.6.

See also

The MongoDB documentation on changeStreams

  • find(filter=None, projection=None, skip=0, limit=0, no_cursor_timeout=False, cursor_type=CursorType.NON_TAILABLE, sort=None, allow_partial_results=False, oplog_replay=False, modifiers=None, batch_size=0, manipulate=True, collation=None, hint=None, max_scan=None, max_time_ms=None, max=None, min=None, return_key=False, show_record_id=False, snapshot=False, comment=None, session=None)
  • Query the database.

The filter argument is a prototype document that all results must match. For example:

    >>> db.test.find({"hello": "world"})

only matches documents that have a key "hello" with value "world". Matches can have other keys in addition to "hello". The projection argument is used to specify a subset of fields that should be included in the result documents. By limiting results to a certain subset of fields you can cut down on network traffic and decoding time.

Raises TypeError if any of the arguments are of improper type. Returns an instance of Cursor corresponding to this query.

The find() method obeys the read_preference of this Collection.
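The "prototype document" idea can be illustrated without a server. The sketch below is a deliberately simplified, in-memory stand-in for the matching rule described above: it handles plain equality on top-level keys only, while real filters also support query operators, dotted paths, and array matching. `matches` is a hypothetical helper.

```python
def matches(document, prototype):
    """Return True if every key in `prototype` is present in `document`
    with an equal value. Extra keys in `document` are allowed."""
    missing = object()  # sentinel that compares unequal to any stored value
    return all(document.get(key, missing) == value
               for key, value in prototype.items())


docs = [
    {"hello": "world", "n": 1},
    {"hello": "there"},
    {"n": 2},
]
print([d for d in docs if matches(d, {"hello": "world"})])
```

Only the first document matches, and it keeps its extra "n" key; an empty prototype matches every document, mirroring an empty filter.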

Parameters:

  1. - _filter_ (optional): a SON object specifying elements whichmust be present for a document to be included in theresult set
  2. - _projection_ (optional): a list of field names that should bereturned in the result set or a dict specifying the fieldsto include or exclude. If _projection_ is a list _id willalways be returned. Use a dict to exclude fields fromthe result (e.g. projection={‘_id’: False}).
  3. - _session_ (optional): a[<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  4. - _skip_ (optional): the number of documents to omit (fromthe start of the result set) when returning the results
  5. - _limit_ (optional): the maximum number of results toreturn. A limit of 0 (the default) is equivalent to setting nolimit.
  6. - _no_cursor_timeout_ (optional): if False (the default), anyreturned cursor is closed by the server after 10 minutes ofinactivity. If set to True, the returned cursor will nevertime out on the server. Care should be taken to ensure thatcursors with no_cursor_timeout turned on are properly closed.
  7. - _cursor_type_ (optional): the type of cursor to return. The validoptions are defined by [<code>CursorType</code>]($11aa48d96c71b56e.md#pymongo.cursor.CursorType):
  8. - [<code>NON_TAILABLE</code>]($11aa48d96c71b56e.md#pymongo.cursor.CursorType.NON_TAILABLE) - the result ofthis find call will return a standard cursor over the result set.
  9. - [<code>TAILABLE</code>]($11aa48d96c71b56e.md#pymongo.cursor.CursorType.TAILABLE) - the result of thisfind call will be a tailable cursor - tailable cursors are onlyfor use with capped collections. They are not closed when thelast data is retrieved but are kept open and the cursor locationmarks the final document position. If more data is receivediteration of the cursor will continue from the last documentreceived. For details, see the [tailable cursor documentation](http://www.mongodb.org/display/DOCS/Tailable+Cursors).
  10. - [<code>TAILABLE_AWAIT</code>]($11aa48d96c71b56e.md#pymongo.cursor.CursorType.TAILABLE_AWAIT) - the resultof this find call will be a tailable cursor with the await flagset. The server will wait for a few seconds after returning thefull result set so that it can capture and return additional dataadded during the query.
  11. - [<code>EXHAUST</code>]($11aa48d96c71b56e.md#pymongo.cursor.CursorType.EXHAUST) - the result of thisfind call will be an exhaust cursor. MongoDB will stream batchedresults to the client without waiting for the client to requesteach batch, reducing latency. See notes on compatibility below.
  12. - _sort_ (optional): a list of (key, direction) pairsspecifying the sort order for this query. See[<code>sort()</code>]($11aa48d96c71b56e.md#pymongo.cursor.Cursor.sort) for details.
  13. - _allow_partial_results_ (optional): if True, mongos will returnpartial results if some shards are down instead of returning anerror.
  14. - _oplog_replay_ (optional): If True, set the oplogReplay queryflag.
  15. - _batch_size_ (optional): Limits the number of documents returned ina single batch.
  16. - _manipulate_ (optional): **DEPRECATED** - If True (the default),apply any outgoing SON manipulators before returning.
  17. - _collation_ (optional): An instance of[<code>Collation</code>]($f10fec00031f6158.md#pymongo.collation.Collation). This option is only supportedon MongoDB 3.4 and above.
  18. - _return_key_ (optional): If True, return only the index keys ineach document.
  19. - _show_record_id_ (optional): If True, adds a field <code>$recordId</code> ineach document with the storage engines internal record identifier.
  20. - _snapshot_ (optional): **DEPRECATED** - If True, prevents the cursor from returning a document more than once because of an intervening write operation.
  21. - _hint_ (optional): An index, in the same format as passed to [<code>create_index()</code>](https://api.mongodb.com/python/current/api/pymongo/#pymongo.collection.Collection.create_index) (e.g. <code>[(&#39;field&#39;, ASCENDING)]</code>). Pass this as an alternative to calling [<code>hint()</code>]($11aa48d96c71b56e.md#pymongo.cursor.Cursor.hint) on the cursor to tell Mongo the proper index to use for the query.
  22. - _max_time_ms_ (optional): Specifies a time limit for a query operation. If the specified time is exceeded, the operation will be aborted and [<code>ExecutionTimeout</code>]($b888fa676d6aeac5.md#pymongo.errors.ExecutionTimeout) is raised. Pass this as an alternative to calling [<code>max_time_ms()</code>]($11aa48d96c71b56e.md#pymongo.cursor.Cursor.max_time_ms) on the cursor.
  23. - _max_scan_ (optional): **DEPRECATED** - The maximum number of documents to scan. Pass this as an alternative to calling [<code>max_scan()</code>]($11aa48d96c71b56e.md#pymongo.cursor.Cursor.max_scan) on the cursor.
  24. - _min_ (optional): A list of field, limit pairs specifying the inclusive lower bound for all keys of a specific index in order. Pass this as an alternative to calling [<code>min()</code>]($11aa48d96c71b56e.md#pymongo.cursor.Cursor.min) on the cursor. <code>hint</code> must also be passed to ensure the query utilizes the correct index.
  25. - _max_ (optional): A list of field, limit pairs specifying the exclusive upper bound for all keys of a specific index in order. Pass this as an alternative to calling [<code>max()</code>]($11aa48d96c71b56e.md#pymongo.cursor.Cursor.max) on the cursor. <code>hint</code> must also be passed to ensure the query utilizes the correct index.
  26. - _comment_ (optional): A string to attach to the query to help interpret and trace the operation in the server logs and in profile data. Pass this as an alternative to calling [<code>comment()</code>]($11aa48d96c71b56e.md#pymongo.cursor.Cursor.comment) on the cursor.
  27. - _modifiers_ (optional): **DEPRECATED** - A dict specifying additional MongoDB query modifiers. Use the keyword arguments listed above instead.
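
As a rough illustration of the min, max, and hint options, the sketch below builds all three keyword arguments together so that the bounds always name the same fields as the hinted index. The helper name `index_range_options` and the `people` / `age` names are hypothetical, and `ASCENDING` is redefined locally with the same value as `pymongo.ASCENDING` so the snippet runs without a server.

```python
ASCENDING = 1  # same value as pymongo.ASCENDING


def index_range_options(index, lower, upper):
    """Build hint/min/max keyword arguments for find() (hypothetical helper).

    index: list of (field, direction) pairs, as passed to create_index().
    lower: inclusive lower bound, one value per indexed field.
    upper: exclusive upper bound, one value per indexed field.
    """
    fields = [field for field, _direction in index]
    if not (len(fields) == len(lower) == len(upper)):
        raise ValueError("one bound is required per indexed field")
    return {
        "hint": index,
        "min": list(zip(fields, lower)),
        "max": list(zip(fields, upper)),
    }


options = index_range_options([("age", ASCENDING)], [18], [65])
# With a real collection this could be used as:
#     cursor = db.people.find({}, **options)
```

Keeping the three options in one place avoids passing min or max without the matching hint.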

Note

There are a number of caveats to using EXHAUST as cursor_type:

  1. - The _limit_ option cannot be used with an exhaust cursor.
  2. - Exhaust cursors are not supported by mongos and cannot be used with a sharded cluster.
  3. - A [<code>Cursor</code>]($11aa48d96c71b56e.md#pymongo.cursor.Cursor) instance created with the [<code>EXHAUST</code>]($11aa48d96c71b56e.md#pymongo.cursor.CursorType.EXHAUST) cursor_type requires an exclusive <code>socket</code> connection to MongoDB. If the [<code>Cursor</code>]($11aa48d96c71b56e.md#pymongo.cursor.Cursor) is discarded without being completely iterated, the underlying <code>socket</code> connection will be closed and discarded without being returned to the connection pool.

Changed in version 3.7: Deprecated the snapshot option, which is deprecated in MongoDB 3.6 and removed in MongoDB 4.0. Deprecated the max_scan option. Support for this option is deprecated in MongoDB 4.0. Use max_time_ms instead to limit server side execution time.

Changed in version 3.6: Added session parameter.

Changed in version 3.5: Added the options return_key, show_record_id, snapshot, hint, max_time_ms, max_scan, min, max, and comment. Deprecated the option modifiers.

Changed in version 3.4: Support the collation option.

Changed in version 3.0: Changed the parameter names spec, fields, timeout, and partial to filter, projection, no_cursor_timeout, and allow_partial_results respectively. Added the cursor_type, oplog_replay, and modifiers options. Removed the network_timeout, read_preference, tag_sets, secondary_acceptable_latency_ms, max_scan, snapshot, tailable, await_data, exhaust, as_class, and slave_okay parameters. Removed the compile_re option: PyMongo now always represents BSON regular expressions as Regex objects. Use try_compile() to attempt to convert from a BSON regular expression to a Python regular expression object. Soft deprecated the manipulate option.

Changed in version 2.7: Added compile_re option. If set to False, PyMongo represented BSON regular expressions as Regex objects instead of attempting to compile BSON regular expressions as Python native regular expressions, thus preventing errors for some incompatible patterns, see PYTHON-500.

New in version 2.3: The tag_sets and secondary_acceptable_latency_ms parameters.

See also

The MongoDB documentation on

find

  • find_raw_batches(filter=None, projection=None, skip=0, limit=0, no_cursor_timeout=False, cursor_type=CursorType.NON_TAILABLE, sort=None, allow_partial_results=False, oplog_replay=False, modifiers=None, batch_size=0, manipulate=True, collation=None, hint=None, max_scan=None, max_time_ms=None, max=None, min=None, return_key=False, show_record_id=False, snapshot=False, comment=None)
  • Query the database and retrieve batches of raw BSON.

Similar to the find() method but returns a RawBatchCursor.

This example demonstrates how to work with raw batches, but in practice raw batches should be passed to an external library that can decode BSON into another data type, rather than used with PyMongo’s bson module.

    >>> import bson
    >>> cursor = db.test.find_raw_batches()
    >>> for batch in cursor:
    ...     print(bson.decode_all(batch))

Note

find_raw_batches does not support sessions or autoencryption.

New in version 3.6.

  • find_one(filter=None, *args, **kwargs)
  • Get a single document from the database.

All arguments to find() are also valid arguments for find_one(), although any limit argument will be ignored. Returns a single document, or None if no matching document is found.

The find_one() method obeys the read_preference of this Collection.

Parameters:

  1. - _filter_ (optional): a dictionary specifying the query to be performed OR any other type to be used as the value for a query for "_id".
  2. - _*args_ (optional): any additional positional arguments are the same as the arguments to find().
  3. - _**kwargs_ (optional): any additional keyword arguments are the same as the arguments to find().

    >>> collection.find_one(max_time_ms=100)
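
The filter parameter of find_one() accepts either a query dictionary or a bare value that is treated as an "_id" lookup. The sketch below models that rule in plain Python. The function name `normalize_find_one_filter` is hypothetical, and the code illustrates the documented behavior rather than reproducing PyMongo's source.

```python
from collections.abc import Mapping


def normalize_find_one_filter(filter=None):
    """Model how a find_one() filter argument is interpreted.

    A mapping is used as the query unchanged; any other non-None value
    is wrapped as a query on "_id".
    """
    if filter is not None and not isinstance(filter, Mapping):
        return {"_id": filter}
    return filter


by_id = normalize_find_one_filter(42)          # bare value becomes an _id query
by_query = normalize_find_one_filter({"x": 1})  # mapping passes through
no_filter = normalize_find_one_filter()         # no filter matches any document
```

Under this reading, `collection.find_one(42)` and `collection.find_one({"_id": 42})` describe the same query.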
  • find_one_and_delete(filter, projection=None, sort=None, session=None, **kwargs)
  • Finds a single document and deletes it, returning the document.

    >>> db.test.count_documents({'x': 1})
    2
    >>> db.test.find_one_and_delete({'x': 1})
    {u'x': 1, u'_id': ObjectId('54f4e12bfba5220aa4d6dee8')}
    >>> db.test.count_documents({'x': 1})
    1

If multiple documents match filter, a sort can be applied.

    >>> for doc in db.test.find({'x': 1}):
    ...     print(doc)
    ...
    {u'x': 1, u'_id': 0}
    {u'x': 1, u'_id': 1}
    {u'x': 1, u'_id': 2}
    >>> db.test.find_one_and_delete(
    ...     {'x': 1}, sort=[('_id', pymongo.DESCENDING)])
    {u'x': 1, u'_id': 2}

The projection option can be used to limit the fields returned.

    >>> db.test.find_one_and_delete({'x': 1}, projection={'_id': False})
    {u'x': 1}

Parameters:

  1. - _filter_: A query that matches the document to delete.
  2. - _projection_ (optional): a list of field names that should be returned in the result document or a mapping specifying the fields to include or exclude. If _projection_ is a list, "_id" will always be returned. Use a mapping to exclude fields from the result (e.g. projection={'_id': False}).
  3. - _sort_ (optional): a list of (key, direction) pairs specifying the sort order for the query. If multiple documents match the query, they are sorted and the first is deleted.
  4. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  5. - _**kwargs_ (optional): additional command arguments can be passed as keyword arguments (for example maxTimeMS can be used with recent server versions).

Changed in version 3.6: Added session parameter.

Changed in version 3.2: Respects write concern.

Warning

Starting in PyMongo 3.2, this command uses the WriteConcern of this Collection when connected to MongoDB >= 3.2. Note that using an elevated write concern with this command may be slower compared to using the default write concern.

Changed in version 3.4: Added the collation option.

New in version 3.0.

  • find_one_and_replace(filter, replacement, projection=None, sort=None, return_document=ReturnDocument.BEFORE, session=None, **kwargs)
  • Finds a single document and replaces it, returning either the original or the replaced document.

The find_one_and_replace() method differs from find_one_and_update() by replacing the document matched by filter, rather than modifying the existing document.

    >>> for doc in db.test.find({}):
    ...     print(doc)
    ...
    {u'x': 1, u'_id': 0}
    {u'x': 1, u'_id': 1}
    {u'x': 1, u'_id': 2}
    >>> db.test.find_one_and_replace({'x': 1}, {'y': 1})
    {u'x': 1, u'_id': 0}
    >>> for doc in db.test.find({}):
    ...     print(doc)
    ...
    {u'y': 1, u'_id': 0}
    {u'x': 1, u'_id': 1}
    {u'x': 1, u'_id': 2}

Parameters:

  1. - _filter_: A query that matches the document to replace.
  2. - _replacement_: The replacement document.
  3. - _projection_ (optional): A list of field names that should be returned in the result document or a mapping specifying the fields to include or exclude. If _projection_ is a list, "_id" will always be returned. Use a mapping to exclude fields from the result (e.g. projection={'_id': False}).
  4. - _sort_ (optional): a list of (key, direction) pairs specifying the sort order for the query. If multiple documents match the query, they are sorted and the first is replaced.
  5. - _upsert_ (optional): When <code>True</code>, inserts a new document if no document matches the query. Defaults to <code>False</code>.
  6. - _return_document_: If [<code>ReturnDocument.BEFORE</code>](https://api.mongodb.com/python/current/api/pymongo/#pymongo.collection.ReturnDocument.BEFORE) (the default), returns the original document before it was replaced, or <code>None</code> if no document matches. If [<code>ReturnDocument.AFTER</code>](https://api.mongodb.com/python/current/api/pymongo/#pymongo.collection.ReturnDocument.AFTER), returns the replaced or inserted document.
  7. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  8. - _**kwargs_ (optional): additional command arguments can be passed as keyword arguments (for example maxTimeMS can be used with recent server versions).

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Added the collation option.

Changed in version 3.2: Respects write concern.

Warning

Starting in PyMongo 3.2, this command uses the WriteConcern of this Collection when connected to MongoDB >= 3.2. Note that using an elevated write concern with this command may be slower compared to using the default write concern.

New in version 3.0.

  • find_one_and_update(filter, update, projection=None, sort=None, return_document=ReturnDocument.BEFORE, array_filters=None, session=None, **kwargs)
  • Finds a single document and updates it, returning either the original or the updated document.

    >>> db.test.find_one_and_update(
    ...     {'_id': 665}, {'$inc': {'count': 1}, '$set': {'done': True}})
    {u'_id': 665, u'done': False, u'count': 25}

Returns None if no document matches the filter.

    >>> db.test.find_one_and_update(
    ...     {'_exists': False}, {'$inc': {'count': 1}})

When the filter matches, by default find_one_and_update() returns the original version of the document before the update was applied. To return the updated (or inserted in the case of upsert) version of the document instead, use the _return_document_ option.

    >>> from pymongo import ReturnDocument
    >>> db.example.find_one_and_update(
    ...     {'_id': 'userid'},
    ...     {'$inc': {'seq': 1}},
    ...     return_document=ReturnDocument.AFTER)
    {u'_id': u'userid', u'seq': 1}

You can limit the fields returned with the projection option.

    >>> db.example.find_one_and_update(
    ...     {'_id': 'userid'},
    ...     {'$inc': {'seq': 1}},
    ...     projection={'seq': True, '_id': False},
    ...     return_document=ReturnDocument.AFTER)
    {u'seq': 2}

The upsert option can be used to create the document if it doesn’t already exist.

    >>> db.example.delete_many({}).deleted_count
    1
    >>> db.example.find_one_and_update(
    ...     {'_id': 'userid'},
    ...     {'$inc': {'seq': 1}},
    ...     projection={'seq': True, '_id': False},
    ...     upsert=True,
    ...     return_document=ReturnDocument.AFTER)
    {u'seq': 1}

If multiple documents match filter, a sort can be applied.

    >>> for doc in db.test.find({'done': True}):
    ...     print(doc)
    ...
    {u'_id': 665, u'done': True, u'result': {u'count': 26}}
    {u'_id': 701, u'done': True, u'result': {u'count': 17}}
    >>> db.test.find_one_and_update(
    ...     {'done': True},
    ...     {'$set': {'final': True}},
    ...     sort=[('_id', pymongo.DESCENDING)])
    {u'_id': 701, u'done': True, u'result': {u'count': 17}}

Parameters:

  1. - _filter_: A query that matches the document to update.
  2. - _update_: The update operations to apply.
  3. - _projection_ (optional): A list of field names that should be returned in the result document or a mapping specifying the fields to include or exclude. If _projection_ is a list, "_id" will always be returned. Use a dict to exclude fields from the result (e.g. projection={'_id': False}).
  4. - _sort_ (optional): a list of (key, direction) pairs specifying the sort order for the query. If multiple documents match the query, they are sorted and the first is updated.
  5. - _upsert_ (optional): When <code>True</code>, inserts a new document if no document matches the query. Defaults to <code>False</code>.
  6. - _return_document_: If [<code>ReturnDocument.BEFORE</code>](https://api.mongodb.com/python/current/api/pymongo/#pymongo.collection.ReturnDocument.BEFORE) (the default), returns the original document before it was updated. If [<code>ReturnDocument.AFTER</code>](https://api.mongodb.com/python/current/api/pymongo/#pymongo.collection.ReturnDocument.AFTER), returns the updated or inserted document.
  7. - _array_filters_ (optional): A list of filters specifying which array elements an update should apply. Requires MongoDB 3.6+.
  8. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  9. - _**kwargs_ (optional): additional command arguments can be passed as keyword arguments (for example maxTimeMS can be used with recent server versions).
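
The array_filters option is listed above without a usage example, so here is a minimal sketch of how the update document and the filter list fit together. It relies on MongoDB's filtered positional operator `$[<identifier>]`. The helper name `grade_bonus_update`, the `students` collection, and the `grades` field are hypothetical.

```python
def grade_bonus_update(threshold, bonus):
    """Build (update, array_filters) for find_one_and_update() (hypothetical).

    The identifier "g" in the update path "grades.$[g]" is bound by the
    array filter with the same name, so only array elements greater than
    or equal to `threshold` are incremented.
    """
    update = {"$inc": {"grades.$[g]": bonus}}
    array_filters = [{"g": {"$gte": threshold}}]
    return update, array_filters


update, array_filters = grade_bonus_update(threshold=90, bonus=5)
# With a real collection (MongoDB 3.6+) this could be used as:
#     db.students.find_one_and_update(
#         {"_id": 1}, update, array_filters=array_filters)
```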

Changed in version 3.9: Added the ability to accept a pipeline as the update.

Changed in version 3.6: Added the array_filters and session options.

Changed in version 3.4: Added the collation option.

Changed in version 3.2: Respects write concern.

Warning

Starting in PyMongo 3.2, this command uses the WriteConcern of this Collection when connected to MongoDB >= 3.2. Note that using an elevated write concern with this command may be slower compared to using the default write concern.

New in version 3.0.

  • count_documents(filter, session=None, **kwargs)
  • Count the number of documents in this collection.

Note

For a fast count of the total documents in a collection see estimated_document_count().

The count_documents() method is supported in a transaction.

All optional parameters should be passed as keyword arguments to this method. Valid options include:

  • skip (int): The number of matching documents to skip before returning results.
  • limit (int): The maximum number of documents to count. Must be a positive integer. If not provided, no limit is imposed.
  • maxTimeMS (int): The maximum amount of time to allow this operation to run, in milliseconds.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.
  • hint (string or list of tuples): The index to use. Specify either the index name as a string or the index specification as a list of tuples (e.g. [('a', pymongo.ASCENDING), ('b', pymongo.ASCENDING)]). This option is only supported on MongoDB 3.6 and above.

The count_documents() method obeys the read_preference of this Collection.

Note

When migrating from count() to count_documents() the following query operators must be replaced:

| Operator | Replacement |
| --- | --- |
| $where | $expr |
| $near | $geoWithin with $center |
| $nearSphere | $geoWithin with $centerSphere |

$expr requires MongoDB 3.6+
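
The operator replacements listed for migrating from count() can be sketched as plain query documents. The helper names, the `loc`, `spent`, and `budget` fields, and the sample coordinates below are hypothetical. The sketch also assumes the original $near or $nearSphere query had a maximum distance that can serve as the radius.

```python
def within_center(field, x, y, radius):
    """Filter for documents within `radius` of (x, y) on a flat plane.

    Stands in for a $near query when counting.
    """
    return {field: {"$geoWithin": {"$center": [[x, y], radius]}}}


def within_center_sphere(field, lon, lat, radius_radians):
    """Filter for documents within `radius_radians` of (lon, lat) on a sphere.

    Stands in for a $nearSphere query when counting.
    """
    return {field: {"$geoWithin": {"$centerSphere": [[lon, lat], radius_radians]}}}


# A $where comparison between two fields, rewritten with $expr:
where_style = {"$where": "this.spent > this.budget"}
expr_style = {"$expr": {"$gt": ["$spent", "$budget"]}}

flat_filter = within_center("loc", -73.97, 40.77, 0.05)
sphere_filter = within_center_sphere("loc", -73.97, 40.77, 0.001)
# With a real collection: db.places.count_documents(flat_filter)
```

Unlike $near, $geoWithin does not sort results by distance, which does not matter when only a count is needed.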

Parameters:

  1. - _filter_ (required): A query document that selects which documents to count in the collection. Can be an empty document to count all documents.
  2. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  3. - _**kwargs_ (optional): See list of options above.

New in version 3.7.

  • estimated_document_count(**kwargs)
  • Get an estimate of the number of documents in this collection using collection metadata.

The estimated_document_count() method is not supported in a transaction.

All optional parameters should be passed as keyword arguments to this method. Valid options include:

  • maxTimeMS (int): The maximum amount of time to allow this operation to run, in milliseconds.

Parameters:

  1. - _**kwargs_ (optional): See list of options above.

New in version 3.7.

  • distinct(key, filter=None, session=None, **kwargs)
  • Get a list of distinct values for key among all documents in this collection.

Raises TypeError if key is not an instance of basestring (str in python 3).

All optional distinct parameters should be passed as keyword arguments to this method. Valid options include:

  • maxTimeMS (int): The maximum amount of time to allow the distinct command to run, in milliseconds.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.

The distinct() method obeys the read_preference of this Collection.

Parameters:

  1. - _key_: name of the field for which we want to get the distinct values
  2. - _filter_ (optional): A query document that specifies the documents from which to retrieve the distinct values.
  3. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  4. - _**kwargs_ (optional): See list of options above.

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Support the collation option.

  • create_index(keys, session=None, **kwargs)
  • Creates an index on this collection.

Takes either a single key or a list of (key, direction) pairs. The key(s) must be an instance of basestring (str in python 3), and the direction(s) must be one of (ASCENDING, DESCENDING, GEO2D, GEOHAYSTACK, GEOSPHERE, HASHED, TEXT).

To create a single key ascending index on the key 'mike' we just use a string argument:

    >>> my_collection.create_index("mike")

For a compound index on 'mike' descending and 'eliot' ascending we need to use a list of tuples:

    >>> my_collection.create_index([("mike", pymongo.DESCENDING),
    ...                             ("eliot", pymongo.ASCENDING)])

All optional index creation parameters should be passed as keyword arguments to this method. For example:

    >>> my_collection.create_index([("mike", pymongo.DESCENDING)],
    ...                            background=True)

Valid options include, but are not limited to:

  • name: custom name to use for this index - if none is given, a name will be generated.
  • unique: if True creates a uniqueness constraint on the index.
  • background: if True this index should be created in the background.
  • sparse: if True, omit from the index any documents that lack the indexed field.
  • bucketSize: for use with geoHaystack indexes. Number of documents to group together within a certain proximity to a given longitude and latitude.
  • min: minimum value for keys in a GEO2D index.
  • max: maximum value for keys in a GEO2D index.
  • expireAfterSeconds: <int> Used to create an expiring (TTL) collection. MongoDB will automatically delete documents from this collection after <int> seconds. The indexed field must be a UTC datetime or the data will not expire.
  • partialFilterExpression: A document that specifies a filter for a partial index. Requires server version >= 3.2.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.
  • wildcardProjection: Allows users to include or exclude specific field paths from a wildcard index using the {"$**": 1} key pattern. Requires server version >= 4.2.

See the MongoDB documentation for a full list of supported options by server version.
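
When no name option is given, a name is generated for the index. The sample outputs elsewhere in this reference ('x_1', 'goodbye_-1') suggest that each field is joined to its direction with underscores. The sketch below models that pattern together with the single-key shorthand. The function names are hypothetical, the naming rule is inferred from those examples rather than taken from PyMongo's source, and the constants are redefined locally with the same values as `pymongo.ASCENDING` and `pymongo.DESCENDING`.

```python
ASCENDING = 1    # same value as pymongo.ASCENDING
DESCENDING = -1  # same value as pymongo.DESCENDING


def normalize_keys(keys):
    """Expand a single key into the list-of-pairs form (ascending)."""
    if isinstance(keys, str):
        return [(keys, ASCENDING)]
    return list(keys)


def default_index_name(keys):
    """Model of a generated index name: field_direction joined by '_'."""
    return "_".join(
        "%s_%s" % (field, direction)
        for field, direction in normalize_keys(keys)
    )


single = default_index_name("x")
compound = default_index_name([("mike", DESCENDING), ("eliot", ASCENDING)])
```

Knowing the expected name is useful when an index has to be dropped by name later.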

Warning

dropDups is not supported by MongoDB 3.0 or newer. The option is silently ignored by the server and unique index builds using the option will fail if a duplicate value is detected.

Note

The write_concern of this collection is automatically applied to this operation when using MongoDB >= 3.4.

Parameters:

  1. - _keys_: a single key or a list of (key, direction) pairs specifying the index to create
  2. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  3. - _**kwargs_ (optional): any additional index creation options (see the above list) should be passed as keyword arguments

Changed in version 3.6: Added session parameter. Added support for passing maxTimeMS in kwargs.

Changed in version 3.4: Apply this collection’s write concern automatically to this operation when connected to MongoDB >= 3.4. Support the collation option.

Changed in version 3.2: Added partialFilterExpression to support partial indexes.

Changed in version 3.0: Renamed key_or_list to keys. Removed the cache_for option. create_index() no longer caches index names. Removed support for the drop_dups and bucket_size aliases.

See also

The MongoDB documentation on

indexes

  • create_indexes(indexes, session=None, **kwargs)
  • Create one or more indexes on this collection.

    >>> from pymongo import IndexModel, ASCENDING, DESCENDING
    >>> index1 = IndexModel([("hello", DESCENDING),
    ...                      ("world", ASCENDING)], name="hello_world")
    >>> index2 = IndexModel([("goodbye", DESCENDING)])
    >>> db.test.create_indexes([index1, index2])
    ["hello_world", "goodbye_-1"]

Parameters:

  1. - _indexes_: A list of [<code>IndexModel</code>]($a074d549d722ba31.md#pymongo.operations.IndexModel) instances.
  2. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  3. - _**kwargs_ (optional): optional arguments to the createIndexes command (like maxTimeMS) can be passed as keyword arguments.

Note

create_indexes uses the createIndexes command introduced in MongoDB 2.6 and cannot be used with earlier versions.

Note

The write_concern of this collection is automatically applied to this operation when using MongoDB >= 3.4.

Changed in version 3.6: Added session parameter. Added support for arbitrary keyword arguments.

Changed in version 3.4: Apply this collection’s write concern automatically to this operation when connected to MongoDB >= 3.4.

New in version 3.0.

  • drop_index(index_or_name, session=None, **kwargs)
  • Drops the specified index on this collection.

Can be used on non-existent collections or collections with no indexes. Raises OperationFailure on an error (e.g. trying to drop an index that does not exist). index_or_name can be either an index name (as returned by create_index), or an index specifier (as passed to create_index). An index specifier should be a list of (key, direction) pairs. Raises TypeError if index is not an instance of (str, unicode, list).

Warning

If a custom name was used on index creation (by passing the name parameter to create_index() or ensure_index()), the index must be dropped by name.

Parameters:

  1. - _index_or_name_: index (or name of index) to drop
  2. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  3. - _**kwargs_ (optional): optional arguments to the dropIndexes command (like maxTimeMS) can be passed as keyword arguments.

Note

The write_concern of this collection is automatically applied to this operation when using MongoDB >= 3.4.

Changed in version 3.6: Added session parameter. Added support for arbitrary keyword arguments.

Changed in version 3.4: Apply this collection’s write concern automatically to this operation when connected to MongoDB >= 3.4.

  • drop_indexes(session=None, **kwargs)
  • Drops all indexes on this collection.

Can be used on non-existent collections or collections with no indexes. Raises OperationFailure on an error.

Parameters:

  1. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  2. - _**kwargs_ (optional): optional arguments to the dropIndexes command (like maxTimeMS) can be passed as keyword arguments.

Note

The write_concern of this collection is automatically applied to this operation when using MongoDB >= 3.4.

Changed in version 3.6: Added session parameter. Added support for arbitrary keyword arguments.

Changed in version 3.4: Apply this collection’s write concern automatically to this operation when connected to MongoDB >= 3.4.

  • reindex(session=None, **kwargs)
  • Rebuilds all indexes on this collection.

Parameters:

  1. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  2. - _**kwargs_ (optional): optional arguments to the reIndex command (like maxTimeMS) can be passed as keyword arguments.

Warning

reindex blocks all other operations (indexes are built in the foreground) and will be slow for large collections.

Changed in version 3.6: Added session parameter. Added support for arbitrary keyword arguments.

Changed in version 3.4: Apply this collection’s write concern automatically to this operation when connected to MongoDB >= 3.4.

Changed in version 3.5: We no longer apply this collection’s write concern to this operation. MongoDB 3.4 silently ignored the write concern. MongoDB 3.6+ returns an error if we include the write concern.

  • list_indexes(session=None)
  • Get a cursor over the index documents for this collection.

    >>> for index in db.test.list_indexes():
    ...     print(index)
    ...
    SON([(u'v', 1), (u'key', SON([(u'_id', 1)])),
         (u'name', u'_id_'), (u'ns', u'test.test')])

Parameters:

  1. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).

Returns:

An instance of CommandCursor.

Changed in version 3.6: Added session parameter.

New in version 3.0.

  • index_information(session=None)
  • Get information on this collection’s indexes.

Returns a dictionary where the keys are index names (as returned by create_index()) and the values are dictionaries containing information about each index. The dictionary is guaranteed to contain at least a single key, "key", which is a list of (key, direction) pairs specifying the index (as passed to create_index()). It will also contain any other metadata about the indexes, except for the "ns" and "name" keys, which are cleaned. Example output might look like this:

    >>> db.test.create_index("x", unique=True)
    u'x_1'
    >>> db.test.index_information()
    {u'_id_': {u'key': [(u'_id', 1)]},
     u'x_1': {u'unique': True, u'key': [(u'x', 1)]}}
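
Because index_information() returns an ordinary dictionary, it can be inspected with plain Python. The sketch below filters a copy of the sample output for indexes that carry an explicit "unique" flag. The helper name `unique_indexes` is hypothetical, and the sample dictionary is typed in by hand so the snippet runs without a server.

```python
def unique_indexes(index_info):
    """Return {index name: key pairs} for indexes flagged as unique.

    Only the "key" entry is guaranteed to be present, so every other
    piece of metadata is read with dict.get().
    """
    return {
        name: spec["key"]
        for name, spec in index_info.items()
        if spec.get("unique")
    }


# Hand-copied from the example output; a real call would be
#     info = db.test.index_information()
info = {
    "_id_": {"key": [("_id", 1)]},
    "x_1": {"unique": True, "key": [("x", 1)]},
}
flagged = unique_indexes(info)
```

In this sample the "_id_" entry carries no "unique" flag, so the helper leaves it out.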

Parameters:

  1. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).

Changed in version 3.6: Added session parameter.

  • drop(session=None)
  • Alias for drop_collection().

Parameters:

  1. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).

The following two calls are equivalent:

    >>> db.foo.drop()
    >>> db.drop_collection("foo")

Changed in version 3.7: drop() now respects this Collection’s write_concern.

Changed in version 3.6: Added session parameter.

  • rename(new_name, session=None, **kwargs)
  • Rename this collection.

If operating in auth mode, client must be authorized as an admin to perform this operation. Raises TypeError if new_name is not an instance of basestring (str in python 3). Raises InvalidName if new_name is not a valid collection name.

Parameters:

  1. - _new_name_: new name for this collection
  2. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  3. - _**kwargs_ (optional): additional arguments to the rename command may be passed as keyword arguments to this helper method (e.g. <code>dropTarget=True</code>)

Note

The write_concern of this collection is automatically applied to this operation when using MongoDB >= 3.4.

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Apply this collection’s write concern automatically to this operation when connected to MongoDB >= 3.4.

  • options(session=None)
  • Get the options set on this collection.

Returns a dictionary of options and their values - see create_collection() for more information on the possible options. Returns an empty dictionary if the collection has not been created yet.

Parameters:

  1. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).

Changed in version 3.6: Added session parameter.

  • map_reduce(map, reduce, out, full_response=False, session=None, **kwargs)
  • Perform a map/reduce operation on this collection.

If full_response is False (default) returns a Collection instance containing the results of the operation. Otherwise, returns the full response from the server to the map reduce command.

Parameters:

  1. - _map_: map function (as a JavaScript string)
  2. - _reduce_: reduce function (as a JavaScript string)
  3. - _out_: output collection name or out object (dict). See the map reduce command documentation for available options. Note: out options are order sensitive. SON can be used to specify multiple options, e.g. <code>SON([('replace', &lt;collection name&gt;), ('db', &lt;database name&gt;)])</code>
  4. - _full_response_ (optional): if True, return full response to this command - otherwise just return the result collection
  5. - _session_ (optional): a ClientSession.
  6. - _**kwargs_ (optional): additional arguments to the map reduce command may be passed as keyword arguments to this helper method, e.g.:

    >>> db.test.map_reduce(map, reduce, "myresults", limit=2)

Note

The map_reduce() method does not obey the read_preference of this Collection. To run mapReduce on a secondary use the inline_map_reduce() method instead.

Note

The write_concern of this collection is automatically applied to this operation (if the output is not inline) when using MongoDB >= 3.4.

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Apply this collection’s write concern automatically to this operation when connected to MongoDB >= 3.4.

See also

Aggregation Examples

Changed in version 3.4: Added the collation option.

Changed in version 2.2: Removed deprecated arguments: merge_output and reduce_output

See also

The MongoDB documentation on

mapreduce

  • inline_map_reduce(map, reduce, full_response=False, session=None, **kwargs)
  • Perform an inline map/reduce operation on this collection.

Perform the map/reduce operation on the server in RAM. A result collection is not created. The result set is returned as a list of documents.

If full_response is False (default) returns the result documents in a list. Otherwise, returns the full response from the server to the map reduce command.

The inline_map_reduce() method obeys the read_preference of this Collection.

Parameters:

  1. - _map_: map function (as a JavaScript string)
  2. - _reduce_: reduce function (as a JavaScript string)
  3. - _full_response_ (optional): if True, return full response to this command - otherwise just return the list of result documents
  4. - _session_ (optional): a ClientSession.
  5. - _**kwargs_ (optional): additional arguments to the map reduce command may be passed as keyword arguments to this helper method, e.g.:

    >>> db.test.inline_map_reduce(map, reduce, limit=2)

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Added the collation option.

  • parallel_scan(num_cursors, session=None, **kwargs)
  • DEPRECATED: Scan this entire collection in parallel.

Returns a list of up to num_cursors cursors that can be iterated concurrently. As long as the collection is not modified during scanning, each document appears once in one of the cursors' result sets.

For example, to process each document in a collection using some thread-safe process_document() function:

    >>> import threading
    >>>
    >>> def process_cursor(cursor):
    ...     for document in cursor:
    ...         # Some thread-safe processing function:
    ...         process_document(document)
    ...
    >>> # Get up to 4 cursors.
    >>> cursors = collection.parallel_scan(4)
    >>> threads = [
    ...     threading.Thread(target=process_cursor, args=(cursor,))
    ...     for cursor in cursors]
    >>>
    >>> for thread in threads:
    ...     thread.start()
    ...
    >>> for thread in threads:
    ...     thread.join()
    ...
    >>> # All documents have now been processed.

The parallel_scan() method obeys the read_preference of this Collection.

Parameters:

  1. - _num_cursors_: the number of cursors to return
  2. - _session_ (optional): a [<code>ClientSession</code>]($9cd063bf36ed4635.md#pymongo.client_session.ClientSession).
  3. - _**kwargs_: additional options for the parallelCollectionScan command can be passed as keyword arguments.

Note

Requires server version >= 2.5.5.

Changed in version 3.7: Deprecated.

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Added back support for arbitrary keyword arguments. MongoDB 3.4 adds support for maxTimeMS as an option to the parallelCollectionScan command.

Changed in version 3.0: Removed support for arbitrary keyword arguments, since the parallelCollectionScan command has no optional arguments.

  • initializeunordered_bulk_op(_bypass_document_validation=False)
  • DEPRECATED - Initialize an unordered batch of write operations.

Operations will be performed on the server in arbitrary order, possibly in parallel. All operations will be attempted.

Parameters:

  • bypass_document_validation (optional): If True, allows the write to opt out of document-level validation. Default is False.

Returns a BulkOperationBuilder instance.

See Unordered Bulk Write Operations for examples.

Note

bypass_document_validation requires server version >= 3.2.

Changed in version 3.5: Deprecated. Use bulk_write() instead.

Changed in version 3.2: Added bypass_document_validation support.

New in version 2.7.

  • initialize_ordered_bulk_op(bypass_document_validation=False)
  • DEPRECATED - Initialize an ordered batch of write operations.

Operations will be performed on the server serially, in the order provided. If an error occurs, all remaining operations are aborted.

Parameters:

  • bypass_document_validation (optional): If True, allows the write to opt out of document-level validation. Default is False.

Returns a BulkOperationBuilder instance.

See Ordered Bulk Write Operations for examples.

Note

bypass_document_validation requires server version >= 3.2.

Changed in version 3.5: Deprecated. Use bulk_write() instead.

Changed in version 3.2: Added bypass_document_validation support.

New in version 2.7.
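Because both builder methods are deprecated in favor of bulk_write(), the following minimal sketch shows how the ordered and unordered behaviors map onto its ordered flag. run_bulk is a hypothetical helper name, and requests is assumed to be a list of PyMongo write operation objects (for example pymongo.InsertOne or pymongo.DeleteOne) built by the caller.

```python
def run_bulk(collection, requests, ordered=True,
             bypass_document_validation=False):
    """Hypothetical helper: send a batch of writes via bulk_write().

    ordered=True mirrors initialize_ordered_bulk_op(): operations run
    serially and the batch stops at the first error.
    ordered=False mirrors initialize_unordered_bulk_op(): the server may
    run operations in any order and attempts all of them.
    """
    return collection.bulk_write(
        requests,
        ordered=ordered,
        bypass_document_validation=bypass_document_validation,
    )
```

With a real Collection, the return value is a BulkWriteResult. Passing ordered=False is the likely replacement for code that used the unordered builder.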

  • group(key, condition, initial, reduce, finalize=None, **kwargs)
  • Perform a query similar to an SQL group by operation.

DEPRECATED - The group command was deprecated in MongoDB 3.4. The group() method is deprecated and will be removed in PyMongo 4.0. Use aggregate() with the $group stage or map_reduce() instead.

Changed in version 3.5: Deprecated the group method.

Changed in version 3.4: Added the collation option.

Changed in version 2.2: Removed deprecated argument: command
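As an illustration of the suggested aggregate() replacement, the sketch below builds a pipeline for the most common group() use case: counting documents per key. group_count_pipeline is a hypothetical helper, and it assumes the key fields are top-level field names (no dotted paths).

```python
def group_count_pipeline(key_fields, condition=None):
    """Hypothetical helper: build a $match/$group pipeline that counts
    documents per distinct combination of key_fields."""
    pipeline = []
    if condition:
        # Plays the role of group()'s `condition` argument.
        pipeline.append({"$match": condition})
    # Each key field becomes part of the compound group _id.
    group_id = {field: "$" + field for field in key_fields}
    pipeline.append({"$group": {"_id": group_id, "count": {"$sum": 1}}})
    return pipeline
```

The resulting list can be passed to aggregate(), for example collection.aggregate(group_count_pipeline(["status"], {"active": True})).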

  • count(filter=None, session=None, **kwargs)
  • DEPRECATED - Get the number of documents in this collection.

The count() method is deprecated and not supported in a transaction. Please use count_documents() or estimated_document_count() instead.

All optional count parameters should be passed as keyword arguments to this method. Valid options include:

  • skip (int): The number of matching documents to skip before returning results.
  • limit (int): The maximum number of documents to count. A limit of 0 (the default) is equivalent to setting no limit.
  • maxTimeMS (int): The maximum amount of time to allow the count command to run, in milliseconds.
  • collation (optional): An instance of Collation. This option is only supported on MongoDB 3.4 and above.
  • hint (string or list of tuples): The index to use. Specify either the index name as a string or the index specification as a list of tuples (e.g. [('a', pymongo.ASCENDING), ('b', pymongo.ASCENDING)]).

The count() method obeys the read_preference of this Collection.

Note

When migrating from count() to count_documents(), the following query operators must be replaced:

| Operator | Replacement |
| --- | --- |
| $where | $expr |
| $near | $geoWithin with $center |
| $nearSphere | $geoWithin with $centerSphere |

$expr requires MongoDB 3.6+
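To make the $near replacement concrete, here is a minimal sketch that rewrites one clause. near_to_geo_within is a hypothetical helper, and it assumes the clause uses legacy coordinate pairs in the form {"$near": [x, y], "$maxDistance": d}; GeoJSON $near clauses and $nearSphere (whose $centerSphere radius is expressed in radians) are not handled.

```python
def near_to_geo_within(near_clause):
    """Hypothetical helper: rewrite a legacy-coordinate $near clause as an
    equivalent $geoWithin/$center clause for use with count_documents().

    Assumes near_clause looks like {"$near": [x, y], "$maxDistance": d}.
    """
    center = list(near_clause["$near"])
    radius = near_clause["$maxDistance"]
    # $center takes [[x, y], radius] in the same units as the coordinates.
    return {"$geoWithin": {"$center": [center, radius]}}
```

Unlike $near, $geoWithin does not sort by distance, which does not matter when only counting.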

Parameters:

  • filter (optional): A query document that selects which documents to count in the collection.
  • session (optional): a ClientSession.
  • **kwargs (optional): See list of options above.

Changed in version 3.7: Deprecated.

Changed in version 3.6: Added session parameter.

Changed in version 3.4: Support the collation option.
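One possible way to migrate a count() call site is sketched below. count_compat is a hypothetical helper; it assumes that an approximate, metadata-based count is acceptable when no filter or options are given, which is why it falls back to estimated_document_count() in that case.

```python
def count_compat(collection, filter=None, **options):
    """Hypothetical helper: route a legacy count() call to its replacement.

    Without a filter or options, use the fast metadata-based estimate.
    Otherwise run an exact count; options such as skip, limit, maxTimeMS,
    collation and hint are forwarded as keyword arguments.
    """
    if not filter and not options:
        return collection.estimated_document_count()
    return collection.count_documents(filter or {}, **options)
```

If an exact total is required even without a filter, calling count_documents({}) directly is the safer choice.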

  • insert(doc_or_docs, manipulate=True, check_keys=True, continue_on_error=False, **kwargs)
  • Insert a document(s) into this collection.

DEPRECATED - Use insert_one() or insert_many() instead.

Changed in version 3.0: Removed the safe parameter. Pass w=0 for unacknowledged write operations.
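A minimal sketch of choosing between the two replacements follows. insert_compat is a hypothetical helper that returns the inserted _id value(s); it does not reproduce the manipulate, check_keys, or continue_on_error options.

```python
from collections.abc import Mapping


def insert_compat(collection, doc_or_docs):
    """Hypothetical helper: dispatch a legacy insert() call to
    insert_one() or insert_many() and return the inserted _id value(s)."""
    if isinstance(doc_or_docs, Mapping):
        # A single document: returns one _id.
        return collection.insert_one(doc_or_docs).inserted_id
    # An iterable of documents: returns a list of _ids.
    return collection.insert_many(list(doc_or_docs)).inserted_ids
```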

  • save(to_save, manipulate=True, check_keys=True, **kwargs)
  • Save a document in this collection.

DEPRECATED - Use insert_one() or replace_one() instead.

Changed in version 3.0: Removed the safe parameter. Pass w=0 for unacknowledged write operations.
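The choice between insert_one() and replace_one() depends on whether the document already carries an _id. The sketch below shows one way to express that; save_compat is a hypothetical helper, and the manipulate and check_keys options are not reproduced.

```python
def save_compat(collection, to_save):
    """Hypothetical helper: emulate save() with the non-deprecated API.

    A document with an _id replaces any existing document with that _id
    (or is inserted if none exists); a document without one is inserted.
    Returns the document's _id.
    """
    if "_id" in to_save:
        collection.replace_one({"_id": to_save["_id"]}, to_save, upsert=True)
        return to_save["_id"]
    return collection.insert_one(to_save).inserted_id
```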

  • update(spec, document, upsert=False, manipulate=False, multi=False, check_keys=True, **kwargs)
  • Update a document(s) in this collection.

DEPRECATED - Use replace_one(), update_one(), or update_many() instead.

Changed in version 3.0: Removed the safe parameter. Pass w=0 for unacknowledged write operations.
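Which replacement applies depends on the shape of the document argument and on multi. The following sketch captures that decision; update_compat is a hypothetical helper, and it treats a document whose keys all start with "$" as an operator update and anything else as a full replacement.

```python
def update_compat(collection, spec, document, upsert=False, multi=False):
    """Hypothetical helper: dispatch a legacy update() call."""
    uses_operators = bool(document) and all(
        key.startswith("$") for key in document
    )
    if not uses_operators:
        # A plain document replaces the matched document wholesale.
        if multi:
            raise ValueError("multi=True requires update operators")
        return collection.replace_one(spec, document, upsert=upsert)
    if multi:
        return collection.update_many(spec, document, upsert=upsert)
    return collection.update_one(spec, document, upsert=upsert)
```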

  • remove(spec_or_id=None, multi=True, **kwargs)
  • Remove a document(s) from this collection.

DEPRECATED - Use delete_one() or delete_many() instead.

Changed in version 3.0: Removed the safe parameter. Pass w=0 for unacknowledged write operations.
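A minimal sketch of the corresponding migration is shown below. remove_compat is a hypothetical helper; it assumes that a non-mapping spec_or_id is an _id value and that None means "match every document", following the signature above.

```python
from collections.abc import Mapping


def remove_compat(collection, spec_or_id=None, multi=True):
    """Hypothetical helper: dispatch a legacy remove() call to
    delete_many() or delete_one()."""
    if spec_or_id is None:
        spec = {}  # No filter: matches every document.
    elif isinstance(spec_or_id, Mapping):
        spec = spec_or_id
    else:
        spec = {"_id": spec_or_id}  # A bare value is treated as an _id.
    if multi:
        return collection.delete_many(spec)
    return collection.delete_one(spec)
```

Note that the default multi=True with no filter deletes every document in the collection, so call sites that relied on the defaults deserve a careful review.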

  • find_and_modify(query={}, update=None, upsert=False, sort=None, full_response=False, manipulate=False, **kwargs)
  • Update and return an object.

DEPRECATED - Use find_one_and_delete(), find_one_and_replace(), or find_one_and_update() instead.
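The sketch below shows one way to pick among the three replacements. find_and_modify_compat is a hypothetical helper; the remove flag is an assumption standing in for the legacy remove keyword option, and any extra keyword arguments (for example sort, upsert, or return_document=ReturnDocument.AFTER) are simply forwarded, so they must be valid for the method that ends up being called.

```python
def find_and_modify_compat(collection, query, update=None, remove=False,
                           **kwargs):
    """Hypothetical helper: dispatch a legacy find_and_modify() call."""
    if remove:
        return collection.find_one_and_delete(query, **kwargs)
    if update is None:
        raise ValueError("either update or remove=True is required")
    if update and all(key.startswith("$") for key in update):
        # Operator document such as {"$set": {...}}.
        return collection.find_one_and_update(query, update, **kwargs)
    # Plain document: replace the matched document wholesale.
    return collection.find_one_and_replace(query, update, **kwargs)
```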

  • ensure_index(key_or_list, cache_for=300, **kwargs)
  • DEPRECATED - Ensures that an index exists on this collection.

Changed in version 3.0: DEPRECATED