$graphLookup (aggregation)

Changed in version 3.4.

Definition

  • $graphLookup
  • Performs a recursive search on a collection, with options for restricting the search by recursion depth and query filter.

The $graphLookup search process is summarized below:

  • Input documents flow into the $graphLookup stage of an aggregation operation.

  • $graphLookup targets the search to the collection designated by the from parameter (see below for the full list of search parameters).

  • For each input document, the search begins with the value designated by startWith.

  • $graphLookup matches the startWith value against the field designated by connectToField in other documents in the from collection.

  • For each matching document, $graphLookup takes the value of the connectFromField and checks every document in the from collection for a matching connectToField value. For each match, $graphLookup adds the matching document in the from collection to an array field named by the as parameter.

This step continues recursively until no more matching documents are found, or until the operation reaches a recursion depth specified by the maxDepth parameter. $graphLookup then appends the array field to the input document. $graphLookup returns results after completing its search on all input documents.
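The search process described above can be sketched in plain JavaScript. This is only an in-memory illustration of the traversal, not MongoDB's implementation: the function name graphLookupSketch, the sampleEmployees array, and the choice to pass startWith as a function are all assumptions made for the example, and null or missing values are simply skipped.

```javascript
// Hypothetical in-memory model of the $graphLookup traversal (not MongoDB's code).
function graphLookupSketch(inputDoc, fromDocs, opts) {
  const { startWith, connectFromField, connectToField, as, maxDepth = Infinity } = opts;
  // Treat scalars as one-element arrays and skip null/missing values.
  const asArray = (v) => (v === undefined || v === null ? [] : Array.isArray(v) ? v : [v]);

  const visited = new Set(); // indexes of documents already added to the result
  const results = [];
  let frontier = asArray(startWith(inputDoc)); // values to match at the current depth

  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next = [];
    fromDocs.forEach((doc, i) => {
      if (visited.has(i)) return;
      // A document matches when its connectToField holds one of the frontier values.
      if (asArray(doc[connectToField]).some((t) => frontier.includes(t))) {
        visited.add(i);
        results.push(doc);
        // Its connectFromField values drive the next level of the search.
        next.push(...asArray(doc[connectFromField]));
      }
    });
    frontier = next;
  }
  // Append the collected documents to the input document under the "as" name.
  return { ...inputDoc, [as]: results };
}

// Sample data (assumed for illustration).
const sampleEmployees = [
  { _id: 1, name: "Dev" },
  { _id: 2, name: "Eliot", reportsTo: "Dev" },
  { _id: 3, name: "Ron", reportsTo: "Eliot" },
  { _id: 5, name: "Asya", reportsTo: "Ron" },
];

const asyaResult = graphLookupSketch(sampleEmployees[3], sampleEmployees, {
  startWith: (doc) => doc.reportsTo,
  connectFromField: "reportsTo",
  connectToField: "name",
  as: "reportingHierarchy",
});
console.log(asyaResult.reportingHierarchy.map((d) => d.name).join(", "));
// prints: Ron, Eliot, Dev
```

The visited set is what lets the loop terminate on cyclic data: a document is added at most once, so the frontier eventually becomes empty.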

$graphLookup has the following prototype form:

  {
     $graphLookup: {
        from: <collection>,
        startWith: <expression>,
        connectFromField: <string>,
        connectToField: <string>,
        as: <string>,
        maxDepth: <number>,
        depthField: <string>,
        restrictSearchWithMatch: <document>
     }
  }

$graphLookup takes a document with the following fields:

Field and description:

  • from: Target collection for the $graphLookup operation to search, recursively matching the connectFromField to the connectToField. The from collection cannot be sharded and must be in the same database as any other collections used in the operation. For information, see Sharded Collections.

  • startWith: Expression that specifies the value of the connectFromField with which to start the recursive search. Optionally, startWith may be an array of values, each of which is individually followed through the traversal process.

  • connectFromField: Field name whose value $graphLookup uses to recursively match against the connectToField of other documents in the collection. If the value is an array, each element is individually followed through the traversal process.

  • connectToField: Field name in other documents against which to match the value of the field specified by the connectFromField parameter.

  • as: Name of the array field added to each output document. Contains the documents traversed in the $graphLookup stage to reach the document.

Note

Documents returned in the as field are not guaranteed to be in any order.

  • maxDepth: Optional. Non-negative integral number specifying the maximum recursion depth.

  • depthField: Optional. Name of the field to add to each traversed document in the search path. The value of this field is the recursion depth for the document, represented as a NumberLong. Recursion depth value starts at zero, so the first lookup corresponds to zero depth.

  • restrictSearchWithMatch: Optional. A document specifying additional conditions for the recursive search. The syntax is identical to query filter syntax.

Note

You cannot use any aggregation expression in this filter. For example, a query document such as

  { lastName: { $ne: "$lastName" } }

will not work in this context to find documents in which the lastName value is different from the lastName value of the input document, because "$lastName" will act as a string literal, not a field path.
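One possible workaround, sketched here under assumptions, is to compare against the input document in a later stage, where aggregation expressions are evaluated. The pipeline below filters the as array with $filter after the traversal has finished. Note that this does not prune the search itself, so it is not equivalent to restrictSearchWithMatch. The collection name people, the friends and lastName fields, and the network array name are hypothetical.

```javascript
// Hypothetical pipeline: traverse first, then filter the result array with an
// aggregation expression that can see the input document's lastName field.
const differentLastNamePipeline = [
  {
    $graphLookup: {
      from: "people",              // assumed collection name
      startWith: "$friends",       // assumed field holding connected names
      connectFromField: "friends",
      connectToField: "name",
      as: "network"
    }
  },
  {
    $addFields: {
      network: {
        $filter: {
          input: "$network",
          as: "person",
          // "$$person.lastName" is the traversed document's value;
          // "$lastName" is the input document's value.
          cond: { $ne: [ "$$person.lastName", "$lastName" ] }
        }
      }
    }
  }
];

// In the mongo shell this would be run as:
//   db.people.aggregate(differentLastNamePipeline)
```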

Considerations

Sharded Collections

The collection specified in from cannot be sharded. However, the collection on which you run the aggregate() method can be sharded. That is, in the following:

  db.collection.aggregate([
     { $graphLookup: { from: "fromCollection", ... } }
  ])

  • The collection can be sharded.
  • The fromCollection cannot be sharded.

To join multiple sharded collections, consider:

  • Modifying client applications to perform manual lookups instead of using the $graphLookup aggregation stage.
  • If possible, using an embedded data model that removes the need to join collections.
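
A manual lookup in client code might look like the following sketch. It walks the graph breadth-first and issues one find() per depth level. The helper name manualGraphLookup and the added depth property are hypothetical; the collection argument is assumed to expose find(filter).toArray() returning a promise, as the MongoDB Node.js driver does.

```javascript
// Hypothetical client-side replacement for $graphLookup: one query per depth level.
async function manualGraphLookup(collection, startValues, connectFromField, connectToField, maxDepth = Infinity) {
  const seen = new Set(); // string form of _id values already collected
  const found = [];
  // [].concat wraps a scalar in an array and leaves an array as it is.
  let frontier = [].concat(startValues ?? []);

  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    // Fetch every document whose connectToField matches a frontier value.
    const docs = await collection.find({ [connectToField]: { $in: frontier } }).toArray();
    frontier = [];
    for (const doc of docs) {
      const key = String(doc._id);
      if (seen.has(key)) continue; // already visited, do not follow it again
      seen.add(key);
      found.push({ ...doc, depth });
      frontier.push(...[].concat(doc[connectFromField] ?? []));
    }
  }
  return found;
}

// Example call (assumes an airports collection handle from the Node.js driver):
//   const reachable = await manualGraphLookup(airports, "JFK", "connects", "airport", 2);
```

Each level costs one round trip to the server, so this approach trades the single-stage convenience of $graphLookup for the ability to query a sharded collection.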

Max Depth

Setting the maxDepth field to 0 is equivalent to a non-recursive $graphLookup search stage.
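
As an illustration, the two stage definitions below are intended to find the same thing: the single document whose name matches the input document's reportsTo value. This is a sketch that assumes an employees collection with name and reportsTo fields; the variable names and the manager array name are hypothetical, and behavior for missing or null reportsTo values may differ between the two stages and has not been verified here.

```javascript
// Non-recursive $graphLookup: only the depth 0 match is returned.
const directManagerViaGraphLookup = {
  $graphLookup: {
    from: "employees",
    startWith: "$reportsTo",
    connectFromField: "reportsTo",
    connectToField: "name",
    as: "manager",
    maxDepth: 0
  }
};

// A comparable single-level join written with $lookup.
const directManagerViaLookup = {
  $lookup: {
    from: "employees",
    localField: "reportsTo",
    foreignField: "name",
    as: "manager"
  }
};

// In the mongo shell either stage could be run as:
//   db.employees.aggregate([ directManagerViaGraphLookup ])
```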

Memory

The $graphLookup stage must stay within the 100 megabyte memory limit. If allowDiskUse: true is specified for the aggregate() operation, the $graphLookup stage ignores the option. If there are other stages in the aggregate() operation, the allowDiskUse: true option is in effect for these other stages.

See aggregation pipeline limitations for more information.

Views and Collation

If performing an aggregation that involves multiple views, such as with $lookup or $graphLookup, the views must have the same collation.

Examples

Within a Single Collection

A collection named employees has the following documents:

  { "_id" : 1, "name" : "Dev" }
  { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
  { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
  { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
  { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }
  { "_id" : 6, "name" : "Dan", "reportsTo" : "Andrew" }

The following $graphLookup operation recursively matches on the reportsTo and name fields in the employees collection, returning the reporting hierarchy for each person:

  db.employees.aggregate( [
     {
        $graphLookup: {
           from: "employees",
           startWith: "$reportsTo",
           connectFromField: "reportsTo",
           connectToField: "name",
           as: "reportingHierarchy"
        }
     }
  ] )

The operation returns the following:

  {
     "_id" : 1,
     "name" : "Dev",
     "reportingHierarchy" : [ ]
  }
  {
     "_id" : 2,
     "name" : "Eliot",
     "reportsTo" : "Dev",
     "reportingHierarchy" : [
        { "_id" : 1, "name" : "Dev" }
     ]
  }
  {
     "_id" : 3,
     "name" : "Ron",
     "reportsTo" : "Eliot",
     "reportingHierarchy" : [
        { "_id" : 1, "name" : "Dev" },
        { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
     ]
  }
  {
     "_id" : 4,
     "name" : "Andrew",
     "reportsTo" : "Eliot",
     "reportingHierarchy" : [
        { "_id" : 1, "name" : "Dev" },
        { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
     ]
  }
  {
     "_id" : 5,
     "name" : "Asya",
     "reportsTo" : "Ron",
     "reportingHierarchy" : [
        { "_id" : 1, "name" : "Dev" },
        { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
        { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
     ]
  }
  {
     "_id" : 6,
     "name" : "Dan",
     "reportsTo" : "Andrew",
     "reportingHierarchy" : [
        { "_id" : 1, "name" : "Dev" },
        { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" },
        { "_id" : 4, "name" : "Andrew", "reportsTo" : "Eliot" }
     ]
  }

The following table provides a traversal path for the document { "_id" : 5, "name" : "Asya", "reportsTo" : "Ron" }:

Start value: The reportsTo value of the document:
  { "reportsTo" : "Ron" }
Depth 0:
  { "_id" : 3, "name" : "Ron", "reportsTo" : "Eliot" }
Depth 1:
  { "_id" : 2, "name" : "Eliot", "reportsTo" : "Dev" }
Depth 2:
  { "_id" : 1, "name" : "Dev" }

The output generates the hierarchy Asya -> Ron -> Eliot -> Dev.
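
Because documents in the as array are not guaranteed to be in any order, a client that wants to print this hierarchy can add a depthField to the stage and sort on it. The sketch below assumes the stage was run with depthField: "level" (a hypothetical name) and uses a hand-written result document in which the depth values are plain numbers rather than NumberLong values. The helper name hierarchyPath is also an assumption.

```javascript
// Hypothetical helper: order the traversed documents by depth and join their names.
function hierarchyPath(doc, asField, depthField) {
  const ordered = [...doc[asField]].sort(
    (a, b) => Number(a[depthField]) - Number(b[depthField])
  );
  return [doc.name, ...ordered.map((d) => d.name)].join(" -> ");
}

// Hand-written result for Asya, as it might look with depthField: "level".
const asyaWithLevels = {
  _id: 5,
  name: "Asya",
  reportsTo: "Ron",
  reportingHierarchy: [
    { _id: 1, name: "Dev", level: 2 },
    { _id: 2, name: "Eliot", reportsTo: "Dev", level: 1 },
    { _id: 3, name: "Ron", reportsTo: "Eliot", level: 0 },
  ],
};

console.log(hierarchyPath(asyaWithLevels, "reportingHierarchy", "level"));
// prints: Asya -> Ron -> Eliot -> Dev
```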

Across Multiple Collections

Like $lookup, $graphLookup can access another collection in the same database.

In the following example, a database contains two collections:

  • A collection airports with the following documents:

    { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
    { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }
    { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
    { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }
    { "_id" : 4, "airport" : "LHR", "connects" : [ "PWM" ] }

  • A collection travelers with the following documents:

    { "_id" : 1, "name" : "Dev", "nearestAirport" : "JFK" }
    { "_id" : 2, "name" : "Eliot", "nearestAirport" : "JFK" }
    { "_id" : 3, "name" : "Jeff", "nearestAirport" : "BOS" }
For each document in the travelers collection, the following aggregation operation looks up the nearestAirport value in the airports collection and recursively matches the connects field to the airport field. The operation specifies a maximum recursion depth of 2.

  db.travelers.aggregate( [
     {
        $graphLookup: {
           from: "airports",
           startWith: "$nearestAirport",
           connectFromField: "connects",
           connectToField: "airport",
           maxDepth: 2,
           depthField: "numConnections",
           as: "destinations"
        }
     }
  ] )

The operation returns the following results:

  {
     "_id" : 1,
     "name" : "Dev",
     "nearestAirport" : "JFK",
     "destinations" : [
        { "_id" : 3,
          "airport" : "PWM",
          "connects" : [ "BOS", "LHR" ],
          "numConnections" : NumberLong(2) },
        { "_id" : 2,
          "airport" : "ORD",
          "connects" : [ "JFK" ],
          "numConnections" : NumberLong(1) },
        { "_id" : 1,
          "airport" : "BOS",
          "connects" : [ "JFK", "PWM" ],
          "numConnections" : NumberLong(1) },
        { "_id" : 0,
          "airport" : "JFK",
          "connects" : [ "BOS", "ORD" ],
          "numConnections" : NumberLong(0) }
     ]
  }
  {
     "_id" : 2,
     "name" : "Eliot",
     "nearestAirport" : "JFK",
     "destinations" : [
        { "_id" : 3,
          "airport" : "PWM",
          "connects" : [ "BOS", "LHR" ],
          "numConnections" : NumberLong(2) },
        { "_id" : 2,
          "airport" : "ORD",
          "connects" : [ "JFK" ],
          "numConnections" : NumberLong(1) },
        { "_id" : 1,
          "airport" : "BOS",
          "connects" : [ "JFK", "PWM" ],
          "numConnections" : NumberLong(1) },
        { "_id" : 0,
          "airport" : "JFK",
          "connects" : [ "BOS", "ORD" ],
          "numConnections" : NumberLong(0) }
     ]
  }
  {
     "_id" : 3,
     "name" : "Jeff",
     "nearestAirport" : "BOS",
     "destinations" : [
        { "_id" : 2,
          "airport" : "ORD",
          "connects" : [ "JFK" ],
          "numConnections" : NumberLong(2) },
        { "_id" : 3,
          "airport" : "PWM",
          "connects" : [ "BOS", "LHR" ],
          "numConnections" : NumberLong(1) },
        { "_id" : 4,
          "airport" : "LHR",
          "connects" : [ "PWM" ],
          "numConnections" : NumberLong(2) },
        { "_id" : 0,
          "airport" : "JFK",
          "connects" : [ "BOS", "ORD" ],
          "numConnections" : NumberLong(1) },
        { "_id" : 1,
          "airport" : "BOS",
          "connects" : [ "JFK", "PWM" ],
          "numConnections" : NumberLong(0) }
     ]
  }

The following table provides a traversal path for the recursive search, up to depth 2, where the starting airport is JFK:

Start value: The nearestAirport value from the travelers collection:
  { "nearestAirport" : "JFK" }
Depth 0:
  { "_id" : 0, "airport" : "JFK", "connects" : [ "BOS", "ORD" ] }
Depth 1:
  { "_id" : 1, "airport" : "BOS", "connects" : [ "JFK", "PWM" ] }
  { "_id" : 2, "airport" : "ORD", "connects" : [ "JFK" ] }
Depth 2:
  { "_id" : 3, "airport" : "PWM", "connects" : [ "BOS", "LHR" ] }
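
A client can rebuild this depth-by-depth view from a result document by grouping the destinations array on the numConnections field. The sketch below is one way to do that; the helper name groupByDepth is hypothetical, and the sample array copies the destinations returned for the traveler Dev, with the depth values written as plain numbers instead of NumberLong values.

```javascript
// Hypothetical helper: bucket traversed documents by their recorded depth.
function groupByDepth(docs, depthField, labelField) {
  const groups = {};
  for (const doc of docs) {
    const depth = Number(doc[depthField]);
    if (!groups[depth]) groups[depth] = [];
    groups[depth].push(doc[labelField]);
  }
  return groups;
}

// The destinations returned for Dev (starting airport JFK).
const devDestinations = [
  { _id: 3, airport: "PWM", connects: ["BOS", "LHR"], numConnections: 2 },
  { _id: 2, airport: "ORD", connects: ["JFK"], numConnections: 1 },
  { _id: 1, airport: "BOS", connects: ["JFK", "PWM"], numConnections: 1 },
  { _id: 0, airport: "JFK", connects: ["BOS", "ORD"], numConnections: 0 },
];

const airportsByDepth = groupByDepth(devDestinations, "numConnections", "airport");
// airportsByDepth[0] holds JFK, airportsByDepth[1] holds ORD and BOS,
// and airportsByDepth[2] holds PWM.
```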

With a Query Filter

The following example uses a collection with a set of documents containing names of people along with arrays of their friends and their hobbies. An aggregation operation finds one particular person and traverses her network of connections to find people who list golf among their hobbies.

A collection named people contains the following documents:

  {
     "_id" : 1,
     "name" : "Tanya Jordan",
     "friends" : [ "Shirley Soto", "Terry Hawkins", "Carole Hale" ],
     "hobbies" : [ "tennis", "unicycling", "golf" ]
  }
  {
     "_id" : 2,
     "name" : "Carole Hale",
     "friends" : [ "Joseph Dennis", "Tanya Jordan", "Terry Hawkins" ],
     "hobbies" : [ "archery", "golf", "woodworking" ]
  }
  {
     "_id" : 3,
     "name" : "Terry Hawkins",
     "friends" : [ "Tanya Jordan", "Carole Hale", "Angelo Ward" ],
     "hobbies" : [ "knitting", "frisbee" ]
  }
  {
     "_id" : 4,
     "name" : "Joseph Dennis",
     "friends" : [ "Angelo Ward", "Carole Hale" ],
     "hobbies" : [ "tennis", "golf", "topiary" ]
  }
  {
     "_id" : 5,
     "name" : "Angelo Ward",
     "friends" : [ "Terry Hawkins", "Shirley Soto", "Joseph Dennis" ],
     "hobbies" : [ "travel", "ceramics", "golf" ]
  }
  {
     "_id" : 6,
     "name" : "Shirley Soto",
     "friends" : [ "Angelo Ward", "Tanya Jordan", "Carole Hale" ],
     "hobbies" : [ "frisbee", "set theory" ]
  }

The following aggregation operation uses three stages:

  • $match matches on documents with a name field containing the string "Tanya Jordan". Returns one output document.
  • $graphLookup connects the output document’s friends field with the name field of other documents in the collection to traverse Tanya Jordan's network of connections. This stage uses the restrictSearchWithMatch parameter to find only documents in which the hobbies array contains golf. Returns one output document.
  • $project shapes the output document. The names listed in connections who play golf are taken from the name field of the documents listed in the input document’s golfers array.

  db.people.aggregate( [
     { $match: { "name": "Tanya Jordan" } },
     { $graphLookup: {
          from: "people",
          startWith: "$friends",
          connectFromField: "friends",
          connectToField: "name",
          as: "golfers",
          restrictSearchWithMatch: { "hobbies" : "golf" }
       }
     },
     { $project: {
          "name": 1,
          "friends": 1,
          "connections who play golf": "$golfers.name"
       }
     }
  ] )

The operation returns the following document:

  {
     "_id" : 1,
     "name" : "Tanya Jordan",
     "friends" : [
        "Shirley Soto",
        "Terry Hawkins",
        "Carole Hale"
     ],
     "connections who play golf" : [
        "Joseph Dennis",
        "Tanya Jordan",
        "Angelo Ward",
        "Carole Hale"
     ]
  }

Additional Resource

Webinar: Working with Graph Data in MongoDB