Multikey Index Bounds

The bounds of an index scan define the portions of an index to searchduring a query. When multiple predicates over an index exist, MongoDBwill attempt to combine the bounds for these predicates, by eitherintersection or compounding, in order to produce a scan withsmaller bounds.

Intersect Bounds for Multikey Index

Bounds intersection refers to a logical conjunction (i.e. AND) ofmultiple bounds. For instance, given two bounds [ [ 3, Infinity ] ]and [ [ -Infinity, 6 ] ], the intersection of the bounds results in[ [ 3, 6 ] ].

Given an indexed array field, consider aquery that specifies multiple predicates on the array and can use amultikey index. MongoDB can intersectmultikey index bounds if an$elemMatch joins the predicates.

For example, a collection survey contains documents with a fielditem and an array field ratings:

  1. { _id: 1, item: "ABC", ratings: [ 2, 9 ] }
  2. { _id: 2, item: "XYZ", ratings: [ 4, 3 ] }

Create a multikey index on the ratingsarray:

  1. db.survey.createIndex( { ratings: 1 } )

The following query uses $elemMatch to require that the arraycontains at least one single element that matches both conditions:

  1. db.survey.find( { ratings : { $elemMatch: { $gte: 3, $lte: 6 } } } )

Taking the predicates separately:

  • the bounds for the greater than or equal to 3 predicate (i.e. $gte:3) are [ [ 3, Infinity ] ];
  • the bounds for the less than or equal to 6 predicate (i.e. $lte:6) are [ [ -Infinity, 6 ] ].

Because the query uses $elemMatch to join these predicates,MongoDB can intersect the bounds to:

  1. ratings: [ [ 3, 6 ] ]

If the query does not join the conditions on the array field with$elemMatch, MongoDB cannot intersect the multikey indexbounds. Consider the following query:

  1. db.survey.find( { ratings : { $gte: 3, $lte: 6 } } )

The query searches the ratings array for at least one elementgreater than or equal to 3 and at least one element less than or equalto 6. Because a single element does not need to meet both criteria,MongoDB does not intersect the bounds and uses either [ [ 3,Infinity ] ] or [ [ -Infinity, 6 ] ]. MongoDB makes no guaranteeas to which of these two bounds it chooses.

Compound Bounds for Multikey Index

Compounding bounds refers to using bounds for multiple keys ofcompound index. For instance, given acompound index { a: 1, b: 1 } with bounds on field a of [ [3, Infinity ] ] and bounds on field b of [ [ -Infinity, 6 ]], compounding the bounds results in the use of both bounds:

  1. { a: [ [ 3, Infinity ] ], b: [ [ -Infinity, 6 ] ] }

If MongoDB cannot compound the two bounds, MongoDB always constrainsthe index scan by the bound on its leading field, in this case, a:[ [ 3, Infinity ] ].

Compound Index on an Array Field

Consider a compound multikey index; i.e. a compound index where one of the indexed fields is an array. Forexample, a collection survey contains documents with a fielditem and an array field ratings:

  1. { _id: 1, item: "ABC", ratings: [ 2, 9 ] }
  2. { _id: 2, item: "XYZ", ratings: [ 4, 3 ] }

Create a compound index on the itemfield and the ratings field:

  1. db.survey.createIndex( { item: 1, ratings: 1 } )

The following query specifies a condition on both keys of the index:

  1. db.survey.find( { item: "XYZ", ratings: { $gte: 3 } } )

Taking the predicates separately:

  • the bounds for the item: "XYZ" predicate are [ [ "XYZ", "XYZ" ] ];
  • the bounds for the ratings: { $gte: 3 } predicate are [ [ 3,Infinity ] ].

MongoDB can compound the two bounds to use the combined bounds of:

  1. { item: [ [ "XYZ", "XYZ" ] ], ratings: [ [ 3, Infinity ] ] }

Range Queries on a Scalar Indexed Field (WiredTiger)

Changed in version 3.4: For the WiredTiger and In-Memory storage engines only,

Starting in MongoDB 3.4, for multikey indexes created using MongoDB3.4 or later, MongoDB keeps track of which indexed field or fields cause anindex to be a multikey index. Tracking this information allows theMongoDB query engine to use tighter index bounds.

The aforementioned compound index is onthe scalar field [1] item and the array field ratings:

  1. db.survey.createIndex( { item: 1, ratings: 1 } )

For the WiredTiger and the In-Memory storage engines, if a queryoperation specifies multiple predicates on the indexed scalar field(s)of a compound multikey index created in MongoDB 3.4 or later, MongoDBwill intersect the bounds for the field.

For example, the following operation specifies a range query on thescalar field as well as a range query on the array field:

  1. db.survey.find( {
  2. item: { $gte: "L", $lte: "Z"}, ratings : { $elemMatch: { $gte: 3, $lte: 6 } }
  3. } )

MongoDB will intersect the bounds for item to [ [ "L", "Z" ] ]and ratings to [[3.0, 6.0]] to use the combined bounds of:

  1. "item" : [ [ "L", "Z" ] ], "ratings" : [ [3.0, 6.0] ]

For another example, consider where the scalar fields belong to a nested document.For instance, a collection survey contains the following documents:

  1. { _id: 1, item: { name: "ABC", manufactured: 2016 }, ratings: [ 2, 9 ] }
  2. { _id: 2, item: { name: "XYZ", manufactured: 2013 }, ratings: [ 4, 3 ] }

Create a compound multikey index on the scalar fields "item.name","item.manufactured", and the array field ratings :

  1. db.survey.createIndex( { "item.name": 1, "item.manufactured": 1, ratings: 1 } )

Consider the following operation that specifies query predicates on thescalar fields:

  1. db.survey.find( {
  2. "item.name": "L" ,
  3. "item.manufactured": 2012
  4. } )

For this query, MongoDB can use the combined bounds of:

  1. "item.name" : [ ["L", "L"] ], "item.manufactured" : [ [2012.0, 2012.0] ]

Earlier versions of MongoDB cannot combine these bounds forthe scalar fields.

[1]A scalar field is a field whose value is neither a documentnor an array; e.g. a field whose value is a string or aninteger is a scalar field.A scalar field can be a field nested in a document, as long as thefield itself is not an array or a document. For example, in thedocument { a: { b: { c: 5, d: 5 } } }, c and d arescalar fields where as a and b are not.

Compound Index on Fields from an Array of Embedded Documents

If an array contains embedded documents, to index on fields containedin the embedded documents, use the dotted field name in the index specification. For instance,given the following array of embedded documents:

  1. ratings: [ { score: 2, by: "mn" }, { score: 9, by: "anon" } ]

The dotted field name for the score field is "ratings.score".

Compound Bounds of Non-array Field and Field from an Array

Consider a collection survey2 contains documents with a fielditem and an array field ratings:

  1. {
  2. _id: 1,
  3. item: "ABC",
  4. ratings: [ { score: 2, by: "mn" }, { score: 9, by: "anon" } ]
  5. }
  6. {
  7. _id: 2,
  8. item: "XYZ",
  9. ratings: [ { score: 5, by: "anon" }, { score: 7, by: "wv" } ]
  10. }

Create a compound index on the non-arrayfield item as well as two fields from an array ratings.score andratings.by:

  1. db.survey2.createIndex( { "item": 1, "ratings.score": 1, "ratings.by": 1 } )

The following query specifies a condition on all three fields:

  1. db.survey2.find( { item: "XYZ", "ratings.score": { $lte: 5 }, "ratings.by": "anon" } )

Taking the predicates separately:

  • the bounds for the item: "XYZ" predicate are [ [ "XYZ", "XYZ" ] ];
  • the bounds for the score: { $lte: 5 } predicate are [ [ -Infinity, 5] ];
  • the bounds for the by: "anon" predicate are [ "anon", "anon" ].

MongoDB can compound the bounds for the item key with either thebounds for "ratings.score" or the bounds for "ratings.by",depending upon the query predicates and the index key values. MongoDBmakes no guarantee as to which bounds it compounds with the itemfield. For instance, MongoDB will either choose to compound theitem bounds with the "ratings.score" bounds:

  1. {
  2. "item" : [ [ "XYZ", "XYZ" ] ],
  3. "ratings.score" : [ [ -Infinity, 5 ] ],
  4. "ratings.by" : [ [ MinKey, MaxKey ] ]
  5. }

Or, MongoDB may choose to compound the item bounds with"ratings.by" bounds:

  1. {
  2. "item" : [ [ "XYZ", "XYZ" ] ],
  3. "ratings.score" : [ [ MinKey, MaxKey ] ],
  4. "ratings.by" : [ [ "anon", "anon" ] ]
  5. }

However, to compound the bounds for "ratings.score" with the boundsfor "ratings.by", the query must use $elemMatch. SeeCompound Bounds of Index Fields from an Array for more information.

Compound Bounds of Index Fields from an Array

To compound together the bounds for index keys from the same array:

  • the index keys must share the same field path up to but excluding thefield names, and
  • the query must specify predicates on the fields using$elemMatch on that path.

For a field in an embedded document, the dotted field name, such as "a.b.c.d", is the field path ford. To compound the bounds for index keys from the same array, the$elemMatch must be on the path up to but excluding the fieldname itself; i.e. "a.b.c".

For instance, create a compound index onthe ratings.score and the ratings.by fields:

  1. db.survey2.createIndex( { "ratings.score": 1, "ratings.by": 1 } )

The fields "ratings.score" and "ratings.by" share the fieldpath ratings. The following query uses $elemMatch on thefield ratings to require that the array contains at least onesingle element that matches both conditions:

  1. db.survey2.find( { ratings: { $elemMatch: { score: { $lte: 5 }, by: "anon" } } } )

Taking the predicates separately:

  • the bounds for the score: { $lte: 5 } predicate is [ -Infinity, 5];
  • the bounds for the by: "anon" predicate is [ "anon", "anon" ].

MongoDB can compound the two bounds to use the combined bounds of:

  1. { "ratings.score" : [ [ -Infinity, 5 ] ], "ratings.by" : [ [ "anon", "anon" ] ] }

Query Without $elemMatch

If the query does not join the conditions on the indexed array fieldswith $elemMatch, MongoDB cannot compound their bounds.Consider the following query:

  1. db.survey2.find( { "ratings.score": { $lte: 5 }, "ratings.by": "anon" } )

Because a single embedded document in the array does not need to meetboth criteria, MongoDB does not compound the bounds. When using acompound index, if MongoDB cannot constrain all the fields of theindex, MongoDB always constrains the leading field of the index, inthis case "ratings.score":

  1. {
  2. "ratings.score": [ [ -Infinity, 5 ] ],
  3. "ratings.by": [ [ MinKey, MaxKey ] ]
  4. }

$elemMatch on Incomplete Path

If the query does not specify $elemMatch on the path of theembedded fields, up to but excluding the field names, MongoDBcannot compound the bounds of index keys from the same array.

For example, a collection survey3 contains documents with a fielditem and an array field ratings:

  1. {
  2. _id: 1,
  3. item: "ABC",
  4. ratings: [ { scores: [ { q1: 2, q2: 4 }, { q1: 3, q2: 8 } ], loc: "A" },
  5. { scores: [ { q1: 2, q2: 5 } ], loc: "B" } ]
  6. }
  7. {
  8. _id: 2,
  9. item: "XYZ",
  10. ratings: [ { scores: [ { q1: 7 }, { q1: 2, q2: 8 } ], loc: "B" } ]
  11. }

Create a compound index on theratings.scores.q1 and the ratings.scores.q2 fields:

  1. db.survey3.createIndex( { "ratings.scores.q1": 1, "ratings.scores.q2": 1 } )

The fields "ratings.scores.q1" and "ratings.scores.q2" share thefield path "ratings.scores" and the $elemMatch must be onthat path.

The following query, however, uses an $elemMatch but not onthe required path:

  1. db.survey3.find( { ratings: { $elemMatch: { 'scores.q1': 2, 'scores.q2': 8 } } } )

As such, MongoDB cannot compound the bounds, and the"ratings.scores.q2" field will be unconstrained during the indexscan. To compound the bounds, the query must use $elemMatch onthe path "ratings.scores":

  1. db.survey3.find( { 'ratings.scores': { $elemMatch: { 'q1': 2, 'q2': 8 } } } )