$group (aggregation)

Definition

  • $group
  • Groups input documents by the specified _id expression and foreach distinct grouping, outputs a document. The _id field of eachoutput document contains the unique group by value. The outputdocuments can also contain computed fields that hold the values ofsome accumulator expression.

Note

$group does not order its output documents.

The $group stage has the following prototype form:

  1. {
  2. $group:
  3. {
  4. _id: <expression>, // Group By Expression
  5. <field1>: { <accumulator1> : <expression1> },
  6. ...
  7. }
  8. }

FieldDescriptionid_Required. If you specify an id value of null, or anyother constant value, the $group stage calculatesaccumulated values for all the input documents as a whole.See example of Group by Null.field_Optional. Computed using theaccumulator operators.

The _id and the accumulator operatorscan accept any valid expression. For more information onexpressions, see Expressions.

Considerations

Accumulator Operator

The <accumulator> operator must be one of the following accumulatoroperators:

NameDescription
$addToSetReturns an array of unique expression values for eachgroup. Order of the array elements is undefined.
$avgReturns an average of numerical values. Ignores non-numeric values.
$firstReturns a value from the first document for each group. Orderis only defined if the documents are in a defined order.
$lastReturns a value from the last document for each group. Orderis only defined if the documents are in a defined order.
$maxReturns the highest expression value for each group.
$mergeObjectsReturns a document created by combining the input documentsfor each group.
$minReturns the lowest expression value for each group.
$pushReturns an array of expression values for each group.
$stdDevPopReturns the population standard deviation of the input values.
$stdDevSampReturns the sample standard deviation of the input values.
$sumReturns a sum of numerical values. Ignores non-numeric values.

$group Operator and Memory

The $group stage has a limit of 100 megabytes of RAM. Bydefault, if the stage exceeds this limit, $group returns anerror. To allow for the handling of large datasets, set theallowDiskUse option totrue. This flag enables $group operations to write totemporary files. For more information, see thedb.collection.aggregate() method and theaggregate command.

Examples

Count the Number of Documents in a Collection

From the mongo shell, create a sample collection namedsales with the following documents:

  1. db.sales.insertMany([
  2. { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
  3. { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
  4. { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
  5. { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
  6. { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
  7. { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
  8. { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
  9. { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
  10. ])

The following aggregation operation uses the $group stageto count the number of documents in the sales collection:

  1. db.sales.aggregate( [
  2. {
  3. $group: {
  4. _id: null,
  5. count: { $sum: 1 }
  6. }
  7. }
  8. ] )

The operation returns the following result:

  1. { "_id" : null, "count" : 8 }

This aggregation operation is equivalent to the following SQL statement:

  1. SELECT COUNT(*) AS count FROM sales

See also

Retrieve Distinct Values

The following aggregation operation uses the $group stageto retrieve the distinct item values from the sales collection:

  1. db.sales.aggregate( [ { $group : { _id : "$item" } } ] )

The operation returns the following result:

  1. { "_id" : "abc" }
  2. { "_id" : "jkl" }
  3. { "_id" : "def" }
  4. { "_id" : "xyz" }

Group by Item Having

The following aggregation operation groups documents by the itemfield, calculating the total sale amount per item and returning onlythe items with total sale amount greater than or equal to 100:

  1. db.sales.aggregate(
  2. [
  3. // First Stage
  4. {
  5. $group :
  6. {
  7. _id : "$item",
  8. totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } }
  9. }
  10. },
  11. // Second Stage
  12. {
  13. $match: { "totalSaleAmount": { $gte: 100 } }
  14. }
  15. ]
  16. )
  • First Stage:
  • The $group stage groups the documents by item toretrieve the distinct item values. This stage returns thetotalSaleAmount for each item.
  • Second Stage:
  • The $match stage filters the resulting documents to onlyreturn items with a totalSaleAmount greater than or equal to 100.

The operation returns the following result:

  1. { "_id" : "abc", "totalSaleAmount" : NumberDecimal("170") }
  2. { "_id" : "xyz", "totalSaleAmount" : NumberDecimal("150") }
  3. { "_id" : "def", "totalSaleAmount" : NumberDecimal("112.5") }

This aggregation operation is equivalent to the following SQL statement:

  1. SELECT item,
  2. Sum(( price * quantity )) AS totalSaleAmount
  3. FROM sales
  4. GROUP BY item
  5. HAVING totalSaleAmount >= 100

See also

Calculate Count, Sum, and Average

From the mongo shell, create a sample collection namedsales with the following documents:

  1. db.sales.insertMany([
  2. { "_id" : 1, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("2"), "date" : ISODate("2014-03-01T08:00:00Z") },
  3. { "_id" : 2, "item" : "jkl", "price" : NumberDecimal("20"), "quantity" : NumberInt("1"), "date" : ISODate("2014-03-01T09:00:00Z") },
  4. { "_id" : 3, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt( "10"), "date" : ISODate("2014-03-15T09:00:00Z") },
  5. { "_id" : 4, "item" : "xyz", "price" : NumberDecimal("5"), "quantity" : NumberInt("20") , "date" : ISODate("2014-04-04T11:21:39.736Z") },
  6. { "_id" : 5, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("10") , "date" : ISODate("2014-04-04T21:23:13.331Z") },
  7. { "_id" : 6, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("5" ) , "date" : ISODate("2015-06-04T05:08:13Z") },
  8. { "_id" : 7, "item" : "def", "price" : NumberDecimal("7.5"), "quantity": NumberInt("10") , "date" : ISODate("2015-09-10T08:43:00Z") },
  9. { "_id" : 8, "item" : "abc", "price" : NumberDecimal("10"), "quantity" : NumberInt("5" ) , "date" : ISODate("2016-02-06T20:20:13Z") },
  10. ])

Group by Day of the Year

The following pipeline calculates the total sales amount, average salesquantity, and sale count for each day in the year 2014:

  1. db.sales.aggregate([
  2. // First Stage
  3. {
  4. $match : { "date": { $gte: new ISODate("2014-01-01"), $lt: new ISODate("2015-01-01") } }
  5. },
  6. // Second Stage
  7. {
  8. $group : {
  9. _id : { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
  10. totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
  11. averageQuantity: { $avg: "$quantity" },
  12. count: { $sum: 1 }
  13. }
  14. },
  15. // Third Stage
  16. {
  17. $sort : { totalSaleAmount: -1 }
  18. }
  19. ])
  • First Stage:
  • The $match stage filters the documents to only passdocuments from the year 2014 to the next stage.
  • Second Stage:
  • The $group stage groups the documents by date andcalculates the total sale amount, average quantity, and total count of thedocuments in each group.
  • Third Stage:
  • The $sort stage sorts the results by the totalsale amount for each group in descending order.

The operation returns the following results:

  1. { "_id" : "2014-04-04", "totalSaleAmount" : NumberDecimal("200"), "averageQuantity" : 15, "count" : 2 }
  2. { "_id" : "2014-03-15", "totalSaleAmount" : NumberDecimal("50"), "averageQuantity" : 10, "count" : 1 }
  3. { "_id" : "2014-03-01", "totalSaleAmount" : NumberDecimal("40"), "averageQuantity" : 1.5, "count" : 2 }

This aggregation operation is equivalent to the following SQL statement:

  1. SELECT date,
  2. Sum(( price * quantity )) AS totalSaleAmount,
  3. Avg(quantity) AS averageQuantity,
  4. Count(*) AS Count
  5. FROM sales
  6. GROUP BY Date(date)
  7. ORDER BY totalSaleAmount DESC

See also

Group by null

The following aggregation operation specifies a group id ofnull, calculating the total sale amount, average quantity, and count of_all documents in the collection.

  1. db.sales.aggregate([
  2. {
  3. $group : {
  4. _id : null,
  5. totalSaleAmount: { $sum: { $multiply: [ "$price", "$quantity" ] } },
  6. averageQuantity: { $avg: "$quantity" },
  7. count: { $sum: 1 }
  8. }
  9. }
  10. ])

The operation returns the following result:

  1. {
  2. "_id" : null,
  3. "totalSaleAmount" : NumberDecimal("452.5"),
  4. "averageQuantity" : 7.875,
  5. "count" : 8
  6. }

This aggregation operation is equivalent to the following SQL statement:

  1. SELECT Sum(price * quantity) AS totalSaleAmount,
  2. Avg(quantity) AS averageQuantity,
  3. Count(*) AS Count
  4. FROM sales

See also

Pivot Data

From the mongo shell, create a sample collection namedbooks with the following documents:

  1. db.books.insertMany([
  2. { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  3. { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  4. { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 },
  5. { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  6. { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
  7. ])

Group title by author

The following aggregation operation pivots the data in the bookscollection to have titles grouped by authors.

  1. db.books.aggregate([
  2. { $group : { _id : "$author", books: { $push: "$title" } } }
  3. ])

The operation returns the following documents:

  1. { "_id" : "Homer", "books" : [ "The Odyssey", "Iliad" ] }
  2. { "_id" : "Dante", "books" : [ "The Banquet", "Divine Comedy", "Eclogues" ] }

Group Documents by author

The following aggregation operation groups documents by author:

  1. db.books.aggregate([
  2. // First Stage
  3. {
  4. $group : { _id : "$author", books: { $push: "$$ROOT" } }
  5. },
  6. // Second Stage
  7. {
  8. $addFields:
  9. {
  10. totalCopies : { $sum: "$books.copies" }
  11. }
  12. }
  13. ])
  • First Stage:
  • $group uses the $$ROOTsystem variable to group the entire documents by authors. This stagepasses the following documents to the next stage:
  1. { "_id" : "Homer",
  2. "books" :
  3. [
  4. { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  5. { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
  6. ]
  7. },
  8. { "_id" : "Dante",
  9. "books" :
  10. [
  11. { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  12. { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  13. { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
  14. ]
  15. }
  • Second Stage:
  • $addFields adds a field to the output containingthe total copies of books for each author.

Note

The resulting documents must not exceed theBSON Document Size limit of 16 megabytes.

The operation returns the following documents:

  1. {
  2. "_id" : "Homer",
  3. "books" :
  4. [
  5. { "_id" : 7000, "title" : "The Odyssey", "author" : "Homer", "copies" : 10 },
  6. { "_id" : 7020, "title" : "Iliad", "author" : "Homer", "copies" : 10 }
  7. ],
  8. "totalCopies" : 20
  9. }
  10.  
  11. {
  12. "_id" : "Dante",
  13. "books" :
  14. [
  15. { "_id" : 8751, "title" : "The Banquet", "author" : "Dante", "copies" : 2 },
  16. { "_id" : 8752, "title" : "Divine Comedy", "author" : "Dante", "copies" : 1 },
  17. { "_id" : 8645, "title" : "Eclogues", "author" : "Dante", "copies" : 2 }
  18. ],
  19. "totalCopies" : 5
  20. }

See also

Additional Resources

The Aggregation with the Zip Code Data Settutorial provides an extensive example of the $groupoperator in a common use case.