Model Tree Structures with Materialized Paths

Overview

This page describes a data model that describes a tree-likestructure in MongoDB documents by storing full relationship pathsbetween documents.

Pattern

The Materialized Paths pattern stores each tree node in a document;in addition to the tree node, document stores as a string the id(s) ofthe node’s ancestors or path. Although the Materialized Paths patternrequires additional steps of working with strings and regularexpressions, the pattern also provides more flexibility in working withthe path, such as finding nodes by partial paths.

Consider the following hierarchy of categories:

Tree data model for a sample hierarchy of categories.

The following example models the tree using Materialized Paths,storing the path in the field path; the path string uses the comma, as a delimiter:

  1. db.categories.insert( { _id: "Books", path: null } )
  2. db.categories.insert( { _id: "Programming", path: ",Books," } )
  3. db.categories.insert( { _id: "Databases", path: ",Books,Programming," } )
  4. db.categories.insert( { _id: "Languages", path: ",Books,Programming," } )
  5. db.categories.insert( { _id: "MongoDB", path: ",Books,Programming,Databases," } )
  6. db.categories.insert( { _id: "dbm", path: ",Books,Programming,Databases," } )
  • You can query to retrieve the whole tree, sorting by the fieldpath:
  1. db.categories.find().sort( { path: 1 } )
  • You can use regular expressions on the path field to find thedescendants of Programming:
  1. db.categories.find( { path: /,Programming,/ } )
  • You can also retrieve the descendants of Books where theBooks is also at the topmost level of the hierarchy:
  1. db.categories.find( { path: /^,Books,/ } )
  • To create an index on the field path use the followinginvocation:
  1. db.categories.createIndex( { path: 1 } )

This index may improve performance depending on the query:

  • For queries from the root Books sub-tree (e.g. /^,Books,/or /^,Books,Programming,/), an index on the path fieldimproves the query performance significantly.

  • For queries of sub-trees where the path from the root is notprovided in the query (e.g. /,Databases,/), or similar queriesof sub-trees, where the node might be in the middle of the indexedstring, the query must inspect the entire index.

For these queries an index may provide some performanceimprovement if the index is significantly smaller than theentire collection.