Read Preference Reference

Read preference describes how MongoDB clients route read operations to the members of a replica set.

Figure: Read operations to a replica set. The default read preference routes the read to the primary. A read preference of nearest routes the read to the nearest member.

By default, an application directs its read operations to the primary member in a replica set (i.e. read preference mode “primary”). However, clients can specify a read preference to send read operations to secondaries.

| Read Preference Mode | Description |
|----------------------|-------------|
| primary | Default mode. All operations read from the current replica set primary. Multi-document transactions that contain read operations must use read preference primary. All operations in a given transaction must route to the same member. |
| primaryPreferred | In most situations, operations read from the primary, but if it is unavailable, operations read from secondary members. |
| secondary | All operations read from the secondary members of the replica set. |
| secondaryPreferred | In most situations, operations read from secondary members, but if no secondary members are available, operations read from the primary. |
| nearest | Operations read from the member of the replica set with the least network latency, irrespective of the member’s type. |

Note

The read preference does not affect the visibility of data; i.e., the read preference does not affect whether clients can see data before it is made durable. For information on read isolation level in MongoDB, see Read Isolation, Consistency, and Recency.

Read preference also does not affect causal consistency. The causal consistency guarantees provided by causally consistent sessions for read operations with "majority" read concern and write operations with "majority" write concern hold across all members of the MongoDB deployment.

Read Preference Modes

primary

All read operations use only the current replica set primary [3]. This is the default read mode. If the primary is unavailable, read operations produce an error or throw an exception.

The primary read preference mode is not compatible with read preference modes that use tag sets or maxStalenessSeconds. If you specify tag sets or a maxStalenessSeconds value with primary, the driver will produce an error.

Multi-document transactions that contain read operations must use read preference primary. All operations in a given transaction must route to the same member.

primaryPreferred

In most situations, operations read from the primary member of the set. However, if the primary is unavailable, as is the case during failover situations, operations read from secondary members that satisfy the read preference’s maxStalenessSeconds and tag sets.

When the primaryPreferred read preference includes a maxStalenessSeconds value and there is no primary from which to read, the client estimates how stale each secondary is by comparing the secondary’s last write to that of the secondary with the most recent write. The client then directs the read operation to a secondary whose estimated lag is less than or equal to maxStalenessSeconds.

When the read preference includes a tag set (i.e. a list of tag specifications) and there is no primary from which to read, the client attempts to find secondary members with matching tags (trying the tag specifications in order until a match is found). If matching secondaries are found, the client selects a random secondary from the nearest group of matching secondaries. If no secondaries have matching tags, the read operation produces an error.

When the read preference includes a maxStalenessSeconds value and a tag set, the client filters by staleness first and then by the specified tags.

Read operations using the primaryPreferred mode may return stale data. Use the maxStalenessSeconds option to avoid reading from secondaries that the client estimates are overly stale.

secondary

Operations read only from the secondary members of the set. If no secondaries are available, then this read operation produces an error or exception.

Most replica sets have at least one secondary, but there are situations where there may be no available secondary. For example, a replica set with a primary, a secondary, and an arbiter may not have any secondaries if a member is in recovering state or unavailable.

When the secondary read preference includes a maxStalenessSeconds value, the client estimates how stale each secondary is by comparing the secondary’s last write to that of the primary. The client then directs the read operation to a secondary whose estimated lag is less than or equal to maxStalenessSeconds. If there is no primary, the client uses the secondary with the most recent write for the comparison.

When the read preference includes a tag set (i.e. a list of tag specifications), the client attempts to find secondary members with matching tags (trying the tag specifications in order until a match is found). If matching secondaries are found, the client selects a random secondary from the nearest group of matching secondaries. If no secondaries have matching tags, the read operation produces an error.

When the read preference includes a maxStalenessSeconds value and a tag set, the client filters by staleness first and then by the specified tags.

Read operations using the secondary mode may return stale data. Use the maxStalenessSeconds option to avoid reading from secondaries that the client estimates are overly stale.
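
The staleness comparison described for this mode can be sketched in plain JavaScript. This is a simplified illustration rather than driver code: the member records and their type and lastWriteDate fields are hypothetical, and real drivers also factor the heartbeat interval into the estimate, which is omitted here.

```javascript
// Hypothetical member records; lastWriteDate is in milliseconds.
// Returns the secondaries whose estimated lag is within maxStalenessSeconds.
function filterByStaleness(members, maxStalenessSeconds) {
  const primary = members.find((m) => m.type === "primary");
  const secondaries = members.filter((m) => m.type === "secondary");
  if (secondaries.length === 0) return [];

  // Reference point: the primary's last write if a primary is known,
  // otherwise the most recent write seen on any secondary.
  const referenceWrite = primary
    ? primary.lastWriteDate
    : Math.max(...secondaries.map((s) => s.lastWriteDate));

  return secondaries.filter(
    (s) => (referenceWrite - s.lastWriteDate) / 1000 <= maxStalenessSeconds
  );
}

const replicaSet = [
  { host: "a", type: "primary", lastWriteDate: 1000000 },
  { host: "b", type: "secondary", lastWriteDate: 990000 }, // 10 s behind the primary
  { host: "c", type: "secondary", lastWriteDate: 905000 }, // 95 s behind the primary
];

// With a primary, only "b" is within 90 seconds.
const withPrimary = filterByStaleness(replicaSet, 90).map((m) => m.host);
// Without a primary, "b" becomes the reference, so "c" is only 85 s behind.
const withoutPrimary = filterByStaleness(replicaSet.slice(1), 90).map((m) => m.host);
```

Note how the same secondary can pass or fail the filter depending on whether a primary is available as the reference point.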

secondaryPreferred

In most situations, operations read from secondary members, but in situations where the set consists of a single primary (and no other members), the read operation will use the replica set’s primary.

When the secondaryPreferred read preference includes a maxStalenessSeconds value, the client estimates how stale each secondary is by comparing the secondary’s last write to that of the primary. The client then directs the read operation to a secondary whose estimated lag is less than or equal to maxStalenessSeconds. If there is no primary, the client uses the secondary with the most recent write for the comparison. If there are no secondaries with estimated lag less than or equal to maxStalenessSeconds, the client directs the read operation to the replica set’s primary.

When the read preference includes a tag set (i.e. a list of tag specifications), the client attempts to find secondary members with matching tags (trying the tag specifications in order until a match is found). If matching secondaries are found, the client selects a random secondary from the nearest group of matching secondaries. If no secondaries have matching tags, the client ignores tags and reads from the primary.

When the read preference includes a maxStalenessSeconds value and a tag set, the client filters by staleness first and then by the specified tags.

Read operations using the secondaryPreferred mode may return stale data. Use the maxStalenessSeconds option to avoid reading from secondaries that the client estimates are overly stale.

nearest

The driver reads from a member whose network latency falls within the acceptable latency window. Reads in the nearest mode do not consider whether a member is a primary or secondary when routing read operations: primaries and secondaries are treated equivalently. The read preference member selection documentation describes the process in detail.

Set this mode to minimize the effect of network latency on read operations without preference for current or stale data.

When the read preference includes a maxStalenessSeconds value, the client estimates how stale each secondary is by comparing the secondary’s last write to that of the primary, if available, or to the secondary with the most recent write if there is no primary. The client will then filter out any secondary whose estimated lag is greater than maxStalenessSeconds and randomly direct the read to a remaining member (primary or secondary) whose network latency falls within the acceptable latency window.

If you specify a tag set, the client attempts to find a replica set member that matches the specified tag sets and directs reads to an arbitrary member from among the nearest group.

When the read preference includes a maxStalenessSeconds value and a tag set, the client filters by staleness first and then by the specified tags. From the remaining mongod instances, the client then randomly directs the read to an instance that falls within the acceptable latency window. The read preference member selection documentation describes the process in detail.

Read operations using the nearest mode may return stale data. Use the maxStalenessSeconds option to avoid reading from secondaries that the client estimates are overly stale.

Note

All operations read from a member of the nearest group of the replica set that matches the specified read preference mode. The nearest mode differs in that it prefers low latency reads over a member’s primary or secondary status.

For nearest, the client assembles a list of acceptable hosts based on maxStalenessSeconds and tag sets and then narrows that list to the host with the shortest ping time and all other members of the set that are within the “local threshold,” or acceptable latency. See Read Preference for Replica Sets for more information.
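
The narrowing step for nearest can be illustrated with a small JavaScript sketch. It assumes the candidates have already passed the staleness and tag filters; the pingMs field and both function names are hypothetical, and the 15 millisecond default is the threshold mentioned under Minimize Latency.

```javascript
// Keep the fastest member plus every member within localThresholdMs of it.
function withinLatencyWindow(candidates, localThresholdMs = 15) {
  if (candidates.length === 0) return [];
  const fastestPing = Math.min(...candidates.map((m) => m.pingMs));
  return candidates.filter((m) => m.pingMs <= fastestPing + localThresholdMs);
}

// Pick one member at random from the latency window.
function pickNearest(candidates, localThresholdMs = 15) {
  const eligible = withinLatencyWindow(candidates, localThresholdMs);
  if (eligible.length === 0) throw new Error("no eligible member");
  return eligible[Math.floor(Math.random() * eligible.length)];
}

const candidates = [
  { host: "a", type: "primary", pingMs: 12 },
  { host: "b", type: "secondary", pingMs: 5 },
  { host: "c", type: "secondary", pingMs: 40 },
];

// The window spans 5 ms to 20 ms, so "a" and "b" qualify and "c" does not.
const windowHosts = withinLatencyWindow(candidates).map((m) => m.host);
const chosen = pickNearest(candidates);
```

Because the final choice is random within the window, nearest does not always return the single lowest-latency member, and member type plays no part in the decision.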

Tag Set

If a replica set member or members are associated with tags, you can specify a tag set (array of tag specification documents) in the read preference to target those members.

To configure a member with tags, set members[n].tags to a document that contains the tag name and value pairs. The value of the tags must be a string.

    { "<tag1>": "<string1>", "<tag2>": "<string2>", ... }

Then, you can include a tag set in the read preference to target tagged members. A tag set is an array of tag specification documents, where each tag specification document contains one or more tag/value pairs.

    [ { "<tag1>": "<string1>", "<tag2>": "<string2>", ... }, ... ]

To find replica set members, MongoDB tries each document in succession until a match is found. See Order of Tag Matching for details.

For example, if a secondary member has the following members[n].tags:

    { "region": "South", "datacenter": "A" }

Then, the following tag sets can direct read operations to the aforementioned secondary (or other members with the same tags):

    [ { "region": "South", "datacenter": "A" }, { } ] // Find members with both tag values. If none are found, read from any eligible member.
    [ { "region": "South" }, { "datacenter": "A" }, { } ] // Find members with the specified region tag. Only if not found, then find members with the specified datacenter tag. If none are found, read from any eligible member.
    [ { "datacenter": "A" }, { "region": "South" }, { } ] // Find members with the specified datacenter tag. Only if not found, then find members with the specified region tag. If none are found, read from any eligible member.
    [ { "region": "South" }, { } ] // Find members with the specified region tag value. If none are found, read from any eligible member.
    [ { "datacenter": "A" }, { } ] // Find members with the specified datacenter tag value. If none are found, read from any eligible member.
    [ { } ] // Find any eligible member.

Order of Tag Matching

If the tag set lists multiple documents, MongoDB tries each document in succession until a match is found. Once a match is found, that tag specification document is used to find all eligible matching members, and the remaining tag specification documents are ignored. If no members match any of the tag specification documents, the read operation returns with an error.

Tip

To avoid an error if no members match any of the tag specifications, you can add an empty document { } as the last element of the tag set to read from any eligible member.

For example, consider the following tag set with three tag specification documents:

    [ { "region": "South", "datacenter": "A" }, { "rack": "rack-1" }, { } ]

First, MongoDB tries to find members tagged with both "region": "South" and "datacenter": "A".

    { "region": "South", "datacenter": "A" }

  • If a member is found, the remaining tag specification documents are not considered. Instead, MongoDB uses this tag specification document to find all eligible members.

  • Else, MongoDB tries to find members with the tags specified in the second document:

        { "rack": "rack-1" }

      • If a member is found, the remaining tag specification document is not considered. Instead, MongoDB uses this tag specification document to find all eligible members.

      • Else, the third document is considered:

            { }

        The empty document matches any eligible member.
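
The matching order walked through above can be sketched in JavaScript. This is an illustrative model, not driver code: the member records and the function names matchesSpec and selectByTagSet are hypothetical.

```javascript
// A member matches a tag specification when it carries every tag/value pair
// in that specification. The empty specification { } matches any member.
function matchesSpec(member, spec) {
  const tags = member.tags || {};
  return Object.entries(spec).every(([name, value]) => tags[name] === value);
}

// Try each specification in order; the first one that matches any member
// decides the result, and the remaining specifications are ignored.
function selectByTagSet(members, tagSet) {
  for (const spec of tagSet) {
    const matched = members.filter((m) => matchesSpec(m, spec));
    if (matched.length > 0) return matched;
  }
  throw new Error("no member matches any tag specification");
}

const taggedMembers = [
  { host: "m1", tags: { region: "South", datacenter: "B", rack: "rack-1" } },
  { host: "m2", tags: { region: "North", datacenter: "A", rack: "rack-1" } },
  { host: "m3", tags: { region: "North", datacenter: "B", rack: "rack-2" } },
];
const tagSet = [{ region: "South", datacenter: "A" }, { rack: "rack-1" }, {}];

// No member has both region South and datacenter A, so the second
// specification decides: the members on rack-1.
const matchedHosts = selectByTagSet(taggedMembers, tagSet).map((m) => m.host);
```

Dropping the trailing empty document from a tag set turns the fallback into an error, which is the behavior the Tip above is meant to avoid.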

Tag Set and Read Preference Modes

Tags are not compatible with mode primary and, in general, only apply when selecting a secondary member of a set for a read operation. However, the nearest read mode, when combined with a tag set, selects the matching member with the lowest network latency. This member may be a primary or secondary.

| Mode | Notes |
|------|-------|
| primaryPreferred | Specified tag set only applies if selecting eligible secondaries. |
| secondary | Specified tag set always applies. |
| secondaryPreferred | Specified tag set only applies if selecting eligible secondaries. |
| nearest | Specified tag set applies whether selecting either primary or eligible secondaries. |
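
The interaction between modes and tag sets can be modeled with a short JavaScript sketch. It is a simplification under stated assumptions: member records and the helper names firstTagMatch and candidatesFor are hypothetical, staleness and latency filtering are left out, and an empty result stands in for the error a driver would report.

```javascript
// Return the members matched by the first tag specification that matches any.
function firstTagMatch(pool, tagSet) {
  for (const spec of tagSet) {
    const matched = pool.filter((m) =>
      Object.entries(spec).every(([k, v]) => (m.tags || {})[k] === v)
    );
    if (matched.length > 0) return matched;
  }
  return [];
}

// Candidate members for a read, by mode. tagSet is optional.
function candidatesFor(mode, members, tagSet) {
  const primaries = members.filter((m) => m.type === "primary");
  const secondaries = members.filter((m) => m.type === "secondary");
  const tags = tagSet === undefined ? [{}] : tagSet;

  switch (mode) {
    case "primary":
      if (tagSet !== undefined) throw new Error("primary mode does not accept tag sets");
      return primaries;
    case "primaryPreferred": // tags apply only when falling back to secondaries
      return primaries.length > 0 ? primaries : firstTagMatch(secondaries, tags);
    case "secondary": // tags always apply
      return firstTagMatch(secondaries, tags);
    case "secondaryPreferred": { // tags are ignored when falling back to the primary
      const matched = firstTagMatch(secondaries, tags);
      return matched.length > 0 ? matched : primaries;
    }
    case "nearest": // tags apply to the primary and secondaries alike
      return firstTagMatch(members, tags);
    default:
      throw new Error(`unknown read preference mode: ${mode}`);
  }
}

const setMembers = [
  { host: "p", type: "primary", tags: { dc: "east" } },
  { host: "s1", type: "secondary", tags: { dc: "west" } },
  { host: "s2", type: "secondary", tags: { dc: "east" } },
];
const hosts = (list) => list.map((m) => m.host);

const preferPrimary = hosts(candidatesFor("primaryPreferred", setMembers, [{ dc: "west" }]));
const nearestEast = hosts(candidatesFor("nearest", setMembers, [{ dc: "east" }]));
```

In this model, preferPrimary contains only the primary even though its tag is east, while nearestEast contains both the primary and the east secondary.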

For information on the interaction between the modes and tag sets, refer to the specific read preference mode documentation.

For information on configuring tag sets, see the Configure Replica Set Tag Sets tutorial.

Configure Read Preference

When using a MongoDB driver, you can specify the read preference when connecting to the replica set or sharded cluster. For example, see connection string. You can also specify the read preference at a more granular level. For details, see your driver’s API documentation.

When using the mongo shell, see cursor.readPref() and Mongo.setReadPref(). For example:

    db.collection.find({}).readPref( "secondary", [ { "region": "South" } ] )
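
For the connection string route, the following JavaScript sketch assembles a URI that carries the same read preference. The helper name readPreferenceUri, the host names, and the replica set name rs0 are hypothetical; readPreference, readPreferenceTags, and maxStalenessSeconds are connection string options. The sketch assumes tag names and values need no percent-encoding.

```javascript
// Build a connection string with read preference options.
// Each tag specification document becomes one readPreferenceTags parameter,
// in order; an empty document becomes an empty readPreferenceTags value.
function readPreferenceUri(hosts, replicaSet, mode, tagSet = [], maxStalenessSeconds) {
  const params = [`replicaSet=${replicaSet}`, `readPreference=${mode}`];
  for (const spec of tagSet) {
    const pairs = Object.entries(spec)
      .map(([name, value]) => `${name}:${value}`)
      .join(",");
    params.push(`readPreferenceTags=${pairs}`);
  }
  if (maxStalenessSeconds !== undefined) {
    params.push(`maxStalenessSeconds=${maxStalenessSeconds}`);
  }
  return `mongodb://${hosts.join(",")}/?${params.join("&")}`;
}

const uri = readPreferenceUri(
  ["db0.example.com:27017", "db1.example.com:27017"], // hypothetical hosts
  "rs0",
  "secondary",
  [{ region: "South" }, {}],
  120
);
```

Setting the read preference in the connection string applies it to every operation on that client, whereas cursor.readPref() affects only the individual query.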

Use Cases

Depending on the requirements of an application, you can configure different applications to use different read preferences, or use different read preferences for different queries in the same application. Consider the following applications for different read preference strategies.

Transactions

Multi-document transactions that contain read operations must use read preference primary. All operations in a given transaction must route to the same member.

Maximize Consistency

To avoid stale reads, use primary read preference and "majority" readConcern. If the primary is unavailable, e.g. during elections or when a majority of the replica set is not accessible, read operations using primary read preference produce an error or throw an exception.

In some circumstances, it may be possible for a replica set to temporarily have two primaries; however, only one primary will be capable of confirming writes with the "majority" write concern.

  • A partial network partition may segregate a primary (P_old) into a partition with a minority of the nodes, while the other side of the partition contains a majority of nodes. The partition with the majority will elect a new primary (P_new), but for a brief period, the old primary (P_old) may still continue to serve reads and writes, as it has not yet detected that it can only see a minority of nodes in the replica set. During this period, if the old primary (P_old) is still visible to clients as a primary, reads from this primary may reflect stale data.
  • A primary (P_old) may become unresponsive, which will trigger an election, and a new primary (P_new) can be elected, serving reads and writes. If the unresponsive primary (P_old) starts responding again, two primaries will be visible for a brief period. The brief period will end when P_old steps down. However, during the brief period, clients might read from the old primary P_old, which can provide stale data.

To increase consistency, you can disable automatic failover; however, disabling automatic failover sacrifices availability.

Maximize Availability

To permit read operations when possible, use primaryPreferred. When there’s a primary you will get consistent reads [3], but if there is no primary you can still query secondaries. However, when using this read mode, consider the situation described in secondary vs secondaryPreferred.

Minimize Latency

To always read from a low-latency node, use nearest. The driver or mongos will read from the nearest member and those no more than 15 milliseconds [1] further away than the nearest member.

nearest does not guarantee consistency. If the nearest member to your application server is a secondary with some replication lag, queries could return stale data. nearest only reflects network distance and does not reflect I/O or CPU load.

[1] This threshold is configurable. See localPingThresholdMs for mongos or your driver documentation for the appropriate setting.

Query From Geographically Distributed Members

If the members of a replica set are geographically distributed, you can create replica set tags that reflect the location of each instance and then configure your application to query the members nearby.

For example, if members in “east” and “west” data centers are tagged {'dc': 'east'} and {'dc': 'west'}, your application servers in the east data center can read from nearby members with the following read preference:

    db.collection.find().readPref('nearest', [ { 'dc': 'east' } ])

Although nearest already favors members with low network latency, including the tag makes the choice more predictable.

secondary vs secondaryPreferred

For specific dedicated queries (e.g. ETL, reporting), you may shift the read load from the primary by using the secondary read preference mode. For this use case, the secondary mode is preferable to the secondaryPreferred mode because secondaryPreferred risks the following situation: if all secondaries are unavailable and your replica set has enough arbiters [2] to prevent the primary from stepping down, then the primary will receive all traffic from the clients. If the primary is unable to handle this load, the queries will compete with the writes. For this reason, use read preference secondary to distribute these specific dedicated queries instead of secondaryPreferred.

[2] In general, avoid deploying more than one arbiter per replica set.

Additional Considerations

For aggregation pipeline operations, you must run on the primary if the pipeline includes either the $out stage or the $merge stage.

For mapReduce operations, only “inline” mapReduce operations that do not write data support read preference. Otherwise, mapReduce operations must run on the primary members.

[3] In some circumstances, two nodes in a replica set may transiently believe that they are the primary, but at most, one of them will be able to complete writes with { w: "majority" } write concern. The node that can complete { w: "majority" } writes is the current primary, and the other node is a former primary that has not yet recognized its demotion, typically due to a network partition. When this occurs, clients that connect to the former primary may observe stale data despite having requested read preference primary, and new writes to the former primary will eventually roll back.