$indexOfBytes (aggregation)

Definition

  • $indexOfBytes

New in version 3.4.

Searches a string for an occurrence of a substring and returns the UTF-8 byte index (zero-based) of the first occurrence. If the substring is not found, returns -1.

$indexOfBytes has the following operator expression syntax:

  { $indexOfBytes: [ <string expression>, <substring expression>, <start>, <end> ] }

Operands and their descriptions:

  • <string expression>: Can be any valid expression as long as it resolves to a string. For more information on expressions, see Expressions. If the string expression resolves to a value of null or refers to a field that is missing, $indexOfBytes returns null. If the string expression resolves to any other non-string value, $indexOfBytes returns an error.
  • <substring expression>: Can be any valid expression as long as it resolves to a string. For more information on expressions, see Expressions.
  • <start>: Optional. An integral number that specifies the starting index position for the search. Can be any valid expression that resolves to a non-negative integral number.
  • <end>: Optional. An integral number that specifies the ending index position for the search. Can be any valid expression that resolves to a non-negative integral number. If you specify an <end> index value, you should also specify a <start> index value. The operands are positional, so a single index value in the third position is used as the <start> value, not as the <end> value.

Behavior

  • If <string expression> is null, $indexOfBytes returns null.
  • If $indexOfBytes is called on a field that doesn’t exist in the document, $indexOfBytes returns null.
  • If <string expression> is not a string and not null, $indexOfBytes returns an error.
  • If <substring expression> is null, $indexOfBytes returns an error.
  • If <start> or <end> is a negative number, $indexOfBytes returns an error.
  • If <start> is a number greater than <end>, $indexOfBytes returns -1.
  • If <start> is a number greater than the byte length of the string, $indexOfBytes returns -1.
  • If <start> or <end> is given a value that is not an integer, $indexOfBytes returns an error.
  • If the <substring expression> is found multiple times within the <string expression>, then $indexOfBytes returns the index of the first <substring expression> found.
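The rules above can be sketched as a small local emulation in Node.js. This is only an illustration, not the server's implementation: indexOfBytes is a hypothetical helper name, the error types are arbitrary choices, and the sketch assumes the <end> bound is exclusive (a match must lie entirely before it).

```javascript
// Local emulation of the documented rules; NOT the server's implementation.
// indexOfBytes is a hypothetical helper name used only for this sketch.
function indexOfBytes(str, sub, start = 0, end = undefined) {
  // A null or missing string operand yields null.
  if (str === null || str === undefined) return null;
  if (typeof str !== "string") throw new TypeError("string operand must be a string or null");
  // A null (or otherwise non-string) substring operand is an error.
  if (typeof sub !== "string") throw new TypeError("substring operand must be a string");

  const haystack = Buffer.from(str, "utf8");
  const needle = Buffer.from(sub, "utf8");
  if (end === undefined) end = haystack.length;

  for (const bound of [start, end]) {
    if (!Number.isInteger(bound)) throw new TypeError("start and end must be integers");
    if (bound < 0) throw new RangeError("start and end must be non-negative");
  }

  // A start past the end bound or past the string's byte length finds nothing.
  if (start > end || start > haystack.length) return -1;

  // Search only the bytes in [start, end), then shift back to a whole-string index.
  const slice = haystack.subarray(start, Math.min(end, haystack.length));
  const pos = slice.indexOf(needle);
  return pos === -1 ? -1 : start + pos;
}

console.log(indexOfBytes("foo.bar.fi", ".", 5)); // 7
console.log(indexOfBytes("vanilla", "ll", 0, 2)); // -1
console.log(indexOfBytes(null, "foo")); // null
```

Working on a Buffer rather than on the JavaScript string keeps every index in UTF-8 bytes, which is the unit the operator reports.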

Some short examples to highlight different behavior:

Example                                       Results
{ $indexOfBytes: [ "cafeteria", "e" ] }       3
{ $indexOfBytes: [ "cafétéria", "é" ] }       3
{ $indexOfBytes: [ "cafétéria", "e" ] }       -1
{ $indexOfBytes: [ "cafétéria", "t" ] }       5
{ $indexOfBytes: [ "foo.bar.fi", ".", 5 ] }   7
{ $indexOfBytes: [ "vanilla", "ll", 0, 2 ] }  -1
{ $indexOfBytes: [ "vanilla", "ll", -1 ] }    error
{ $indexOfBytes: [ "vanilla", "ll", 12 ] }    -1
{ $indexOfBytes: [ "vanilla", "ll", 5, 2 ] }  -1
{ $indexOfBytes: [ "vanilla", "nilla", 3 ] }  -1
{ $indexOfBytes: [ null, "foo" ] }            null
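The "cafétéria" rows show that a byte index can differ from a character index. A minimal Node.js sketch of the difference, assuming the é is the precomposed character U+00E9 (two bytes in UTF-8); it uses only the built-in String and Buffer APIs and does not call MongoDB:

```javascript
// "cafétéria", written with escapes so the é is unambiguously U+00E9.
const s = "caf\u00e9t\u00e9ria";

// String.prototype.indexOf counts UTF-16 code units, so each é is one position.
const charIndex = s.indexOf("t");

// Buffer.prototype.indexOf counts UTF-8 bytes, so each é (0xC3 0xA9) is two positions.
const byteIndex = Buffer.from(s, "utf8").indexOf("t");

console.log(charIndex, byteIndex); // 4 5
console.log(s.length, Buffer.byteLength(s, "utf8")); // 9 11
```

When a character-based position is needed instead, the $indexOfCP operator listed under "See also" reports code point indexes.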

Examples

Consider an inventory collection with the following documents:

  { "_id" : 1, "item" : "foo" }
  { "_id" : 2, "item" : "fóofoo" }
  { "_id" : 3, "item" : "the foo bar" }
  { "_id" : 4, "item" : "hello world fóo" }
  { "_id" : 5, "item" : null }
  { "_id" : 6, "amount" : 3 }

The following operation uses the $indexOfBytes operator to retrieve the byte index at which the string foo first occurs in each item:

  db.inventory.aggregate(
    [
      {
        $project:
          {
            byteLocation: { $indexOfBytes: [ "$item", "foo" ] }
          }
      }
    ]
  )

The operation returns the following results:

  { "_id" : 1, "byteLocation" : 0 }
  { "_id" : 2, "byteLocation" : 4 }
  { "_id" : 3, "byteLocation" : 4 }
  { "_id" : 4, "byteLocation" : -1 }
  { "_id" : 5, "byteLocation" : null }
  { "_id" : 6, "byteLocation" : null }
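These results can be cross-checked without a database. The following Node.js sketch recomputes the byte locations for the same six documents with Buffer, assuming each ó is the precomposed character U+00F3 (two bytes in UTF-8). It only mirrors the documented behavior and is not how the server evaluates the pipeline.

```javascript
// The inventory documents, with ó written as an escape so it is unambiguously U+00F3.
const inventory = [
  { _id: 1, item: "foo" },
  { _id: 2, item: "f\u00f3ofoo" },          // "fóofoo"
  { _id: 3, item: "the foo bar" },
  { _id: 4, item: "hello world f\u00f3o" }, // "hello world fóo"
  { _id: 5, item: null },
  { _id: 6, amount: 3 },
];

const results = inventory.map((doc) => ({
  _id: doc._id,
  // A null or missing item maps to null, mirroring the operator's documented behavior.
  byteLocation: doc.item == null ? null : Buffer.from(doc.item, "utf8").indexOf("foo"),
}));

console.log(results);
```

Document 2 yields 4 rather than 3 because the two-byte ó shifts the first foo by one byte, and document 4 yields -1 because fóo is not the same byte sequence as foo.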

See also

$indexOfCP and $indexOfArray