Distributed Queries

Read Operations to Replica Sets

By default, clients read from a replica set's primary; however, clients can specify a read preference to direct read operations to other members. For example, clients can configure read preferences to read from secondaries or from the nearest member to:

reduce latency in multi-data-center deployments,
improve read throughput by distributing high read volumes (relative to write volume),
perform backup operations, and/or
allow reads until a new primary is elected.

Read operations from secondary members of replica sets may not reflect the current state of the primary. Read preferences that direct read operations to different servers may result in non-monotonic reads.
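The effect can be illustrated with a toy model. The sketch below is not MongoDB code: the `Member` class and the helper functions are hypothetical names, and each member holds only a single version counter. It assumes a secondary that has replicated just the first of three writes, and shows that a client whose reads alternate between members can observe an older version after a newer one.

```python
class Member:
    """Toy replica set member holding a single versioned value."""

    def __init__(self, name):
        self.name = name
        self.version = 0  # version of the last write applied on this member


def read_versions(members, order):
    """Read from the members in the given order and return the versions seen."""
    return [members[i].version for i in order]


def is_monotonic(versions):
    """True if no read observed an older version than an earlier read."""
    return all(a <= b for a, b in zip(versions, versions[1:]))


primary = Member("primary")
lagging_secondary = Member("secondary")

# Three writes have reached the primary; the secondary has applied only one.
primary.version = 3
lagging_secondary.version = 1

members = [primary, lagging_secondary]

# A client whose reads alternate between the two members.
seen = read_versions(members, [0, 1, 0])
print(seen, is_monotonic(seen))  # [3, 1, 3] False

# A client that always reads from the primary.
seen_primary = read_versions(members, [0, 0, 0])
print(seen_primary, is_monotonic(seen_primary))  # [3, 3, 3] True
```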

Changed in version 3.6: Starting in MongoDB 3.6, clients can use causally consistent sessions, which provide various guarantees, including monotonic reads.

You can configure the read preference on a per-connection or per-operation basis. For more information on read preference or on the read preference modes, see Read Preference and Read Preference Modes.
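As a minimal sketch of the per-connection case, the snippet below builds a connection string that carries the `readPreference` and `replicaSet` URI options. The host names and the replica set name `rs0` are placeholders, and `build_uri` is a hypothetical helper, not part of any driver. Per-operation settings go through driver-specific APIs and are not shown here.

```python
from urllib.parse import urlencode

# The five read preference mode names.
READ_PREFERENCE_MODES = {
    "primary",
    "primaryPreferred",
    "secondary",
    "secondaryPreferred",
    "nearest",
}


def build_uri(hosts, replica_set, read_preference):
    """Build a connection string with a per-connection read preference."""
    if read_preference not in READ_PREFERENCE_MODES:
        raise ValueError(f"unknown read preference mode: {read_preference}")
    query = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return "mongodb://" + ",".join(hosts) + "/?" + query


# Placeholder hosts and replica set name, chosen for illustration only.
uri = build_uri(
    ["db0.example.net:27017", "db1.example.net:27017"],
    "rs0",
    "secondaryPreferred",
)
print(uri)
```

A driver client created from this string would apply the chosen mode to every read on that connection unless an individual operation overrides it.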

Write Operations on Replica Sets

In replica sets, all write operations go to the set's primary. The primary applies the write operation and records the operations on the primary's operation log or oplog. The oplog is a reproducible sequence of operations to the data set. Secondary members of the set continuously replicate the oplog and apply the operations to themselves in an asynchronous process.
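The flow described above can be sketched as a toy model. This is an illustration under simplifying assumptions, not the server's replication protocol: the `Primary` and `Secondary` classes are hypothetical, the oplog is a plain list of key/value set operations, and "asynchronous" is modelled by letting the secondary apply entries in separate steps.

```python
class Primary:
    """Toy primary: applies each write, then records it in its oplog."""

    def __init__(self):
        self.data = {}
        self.oplog = []  # ordered list of (key, value) set operations

    def write(self, key, value):
        self.data[key] = value           # apply the write
        self.oplog.append((key, value))  # record it for replication


class Secondary:
    """Toy secondary: replays oplog entries it has not applied yet."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of oplog entries already applied

    def replicate(self, oplog, batch=None):
        """Apply up to `batch` unapplied entries (all of them if batch is None)."""
        end = len(oplog) if batch is None else min(len(oplog), self.applied + batch)
        for key, value in oplog[self.applied:end]:
            self.data[key] = value
        self.applied = end


primary = Primary()
secondary = Secondary()

primary.write("x", 1)
primary.write("y", 2)
primary.write("x", 3)

# The secondary has caught up on only the first two operations.
secondary.replicate(primary.oplog, batch=2)
behind = dict(secondary.data)
print(behind)  # {'x': 1, 'y': 2}

# After replaying the rest of the oplog, both members hold the same data.
secondary.replicate(primary.oplog)
print(secondary.data == primary.data)  # True
```

Because the oplog is replayed in order, the secondary reaches the primary's state once it has applied every entry; until then, reads from it can return older values.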

For more information on replica sets and write operations, see Replication and Write Concern.

Read Operations to Sharded Clusters

Sharded clusters allow you to partition a dataset among a cluster of mongod instances in a way that is nearly transparent to the application. For an overview of sharded clusters, see the Sharding section of this manual.

For a sharded cluster, applications issue operations to one of the mongos instances associated with the cluster.

Read operations on sharded clusters are most efficient when directed to a specific shard. Queries to sharded collections should include the collection's shard key. When a query includes a shard key, the mongos can use cluster metadata from the config database to route the queries to shards.

If a query does not include the shard key, the mongos must direct the query to all shards in the cluster. These scatter gather queries can be inefficient. On larger clusters, scatter gather queries are unfeasible for routine operations.
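The difference between a targeted query and a scatter gather query can be sketched with a toy router. The `ToyRouter` class, the `user_id` shard key, the split points, and the shard names are all hypothetical, and the sketch handles only equality matches on a single-field shard key. It is not how mongos is implemented.

```python
import bisect


class ToyRouter:
    """Hypothetical stand-in for a router holding range-to-shard metadata."""

    def __init__(self, shard_key, split_points, shards):
        # split_points are the sorted boundaries between adjacent ranges,
        # so N split points describe N + 1 ranges, one per shard here.
        if len(shards) != len(split_points) + 1:
            raise ValueError("need exactly one shard per range")
        self.shard_key = shard_key
        self.split_points = split_points
        self.shards = shards

    def target_shards(self, query):
        """Return the shards that must receive this equality query."""
        if self.shard_key in query:
            # Targeted: look up the range that contains the shard key value.
            idx = bisect.bisect_right(self.split_points, query[self.shard_key])
            return [self.shards[idx]]
        # No shard key: every shard has to be asked (scatter gather).
        return list(self.shards)


router = ToyRouter("user_id", [1000, 2000], ["shardA", "shardB", "shardC"])

targeted = router.target_shards({"user_id": 1500})
broadcast = router.target_shards({"status": "active"})
print(targeted)   # ['shardB']
print(broadcast)  # ['shardA', 'shardB', 'shardC']
```

In this model the cost of a query without the shard key grows with the number of shards, while a query with the shard key always touches one shard.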

For replica set shards, read operations from secondary members of replica sets may not reflect the current state of the primary. Read preferences that direct read operations to different servers may result in non-monotonic reads.

Note

Starting in MongoDB 3.6,

Clients can use causally consistent sessions, which provide various guarantees, including monotonic reads.
All members of a shard replica set, not just the primary, maintain the chunk metadata. This prevents reads from the secondaries from returning orphaned data, provided the read does not use read concern "available". In earlier versions, reads from secondaries, regardless of the read concern, could return orphaned documents.

For more information on read operations in sharded clusters, see the mongos and Shard Keys sections.

Write Operations on Sharded Clusters

For sharded collections in a sharded cluster, the mongos directs write operations from applications to the shards that are responsible for the specific portion of the data set. The mongos uses the cluster metadata from the config database to route the write operation to the appropriate shards.

MongoDB partitions data in a sharded collection into ranges, called chunks, based on the values of the shard key. Then, MongoDB distributes these chunks to shards. The shard key determines the distribution of chunks to shards. This can affect the performance of write operations in the cluster.

Important

Update operations that affect a single document must include the shard key or the _id field. Updates that affect multiple documents are more efficient in some situations if they include the shard key, but they can be broadcast to all shards.

If the value of the shard key increases or decreases with every insert, all insert operations target a single shard. As a result, the capacity of a single shard becomes the limit for the insert capacity of the sharded cluster.
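A small simulation makes the single-shard bottleneck concrete. It assumes a fixed range layout with hypothetical split points and shard names, and it deliberately ignores chunk splitting and migration, so it models only where each insert lands at the moment it is routed.

```python
import bisect
from collections import Counter

# Hypothetical range layout: three split points define four ranges,
# each owned by one shard.
SPLIT_POINTS = [1000, 2000, 3000]
SHARDS = ["shardA", "shardB", "shardC", "shardD"]


def shard_for(key):
    """Return the shard that owns the range containing this shard key value."""
    return SHARDS[bisect.bisect_right(SPLIT_POINTS, key)]


def insert_distribution(keys):
    """Count how many inserts each shard receives for the given key values."""
    return Counter(shard_for(k) for k in keys)


# Shard key values that increase with every insert all fall into the last range.
increasing = insert_distribution(range(3000, 3100))
print(increasing)  # Counter({'shardD': 100})

# Shard key values spread over the key space reach every shard.
spread = insert_distribution(range(0, 4000, 40))
print(sorted(spread.items()))
```

With increasing keys, one shard absorbs all 100 inserts in this model, whereas the spread keys put 25 inserts on each of the four shards.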

For more information, see Sharding and Bulk Write Operations.