Aggregate

An aggregate is a document of type core.aggregate containing attributes:

  • input-type - string containing name of an input type, i.e. persisted collection type
  • output-type - string containing name of an output type, i.e. created aggregated collection type, AGT
  • parameters - (optional) object containing default parameters used in aggregation pipeline
  • pipeline - array of a pipeline stages forming aggregation pipeline which takes input type documents and produces output type documents

An aggregate introduces virtual (non-persisted) collection into the system based on a series of transformations applied to a persisted collection.

Stages

First stage in the pipeline receives input-type documents as its input documents. A stage performs transformation using specified operators and produces output documents. Second and every following stage receives output documents of a previous stage as its input documents. Last stage output documents are used as documents for output-type collection.

Pipeline stages have the following syntax: { "<STAGE_NAME>": <STAGE_EXPRESSION> }.

Supported stages:

Stage Description
group combines documents by the given expression and produces accumulated fields
filter takes documents which pass the given condition and drops the rest
sample takes specified number of documents randomly and drops the rest
sort arranges documents by the given fields
skip drops leading specified number of documents and takes the rest
limit takes leading specified number of documents and drops the rest
unwind takes the given array field and for each item in the array creates one document
project creates fields using expressions and existing document fields
output defines document attributes and relationships

Group

Syntax: { "group": { "id": <EXPRESSION>, <FIELD_NAME_1>: { <ACCUMULATOR> : <EXPRESSION> } ... } }.

Optional property id defines criteria by which groups are formed. When not defined operator performs accumulation against one group containing all documents.

Other properties are treated as accumulator fields.

Accumulation operators:

Operator Description
$avg calculates average of numeric values in a group
$first takes first document value in a group
$last takes last document value in a group
$max takes highest value in a group
$min takes lowest value in a group
$push creates array of values in a group
$addToSet creates array of distinct values in a group
$stdDevPop calculates population standard deviation of values in a group
$stdDevSamp calculates sample standard deviation of values in a group
$sum calculates addition of values in a group
$mergeObjects takes object document values and merges them into one new object

Accumulator expression should return value needed by the operator. In most cases field path expression are used for accumulator expressions. Valid field path is a value of type string starting with $ followed by property names joined by . (dot), e.g. $order.customer.name.

Example:

Previous stage returns product review documents where each contains an attribute score and a relationship product. To create statistics per product containing number of reviews, average, minimum and maximum score, following expression can be used:

{
  "group": {
    "id": "$product",
    "count": { "$sum": 1 },
    "average": { "$avg": "$score" },
    "minimum": { "$min": "$score" },
    "maximum": { "$max": "$score" }
  }
}

Filter

Syntax: { "filter": { <COMPARISON_EXPRESSION> } }.

Filter stage supports boolean operators and comparison operators. When expression result is true, document is kept in the pipeline, otherwise is omitted.

Example:

Previous stage returns documents containing attribute score. Taking only documents which have value of attribute score greater than 300 can be achieved using expression:

{
  "filter": {
    "score": { "$gt": 300 }
  }
}

Sample

Syntax: { "sample": { "size": <SIZE> } }.

Property size must be a pozitive integer which defines number of documents to take.

Sample stage will always take randomly documents when pipeline is executed. However, due to internal caching mechanisms, when previous stage returns documents that have not changed, sample stage can also return the same documents and not randomly take new ones.

Example:

Previous stage returns 1000 documents. To take at random 50 documents, following expression can be used:

{
  "sample": {
    "size": 50
  }
}

Sort

Syntax: { "sort": <ATTRIBUTE_NAMES> }, where ATTRIBUTE_NAMES is a comma-separated list of attributes to sort by. Default sort order is ascending. Prefix attribute name with - (minus) to sort in descending order.

Example:

Previous stage returns documents with attributes name and rank. To sort by attribute name in ascending order and then by attribute rank in descending order, following expression can be used:

{
  "sort": "name,-rank"
}

Limit

Syntax: { "limit": <LIMIT> }.

Limit stage expects pozitive integer value which specifies maximum number of leading documents to take.

Example:

Previous stage returns sorted documents where only first 100 are needed. To drop the rest, following expression can be used:

{
  "limit": 100
}

Unwind

Shorthand syntax: { "unwind": <FIELD_PATH_EXPRESSION> }.

Unwind stage expects field path expression pointing to an array field. For each item in the array, stage will produce one document containing all fields including the ID. All fields will have the same value except array field which will no longer be array, but just one item from the array.

Full syntax:

{
  "unwind": {
    "path": <FIELD_PATH_EXPRESSION>,
    "includeArrayIndex": <FIELD_NAME>,
    "preserveNullAndEmptyArrays": <BOOLEAN>
  }
}

Parameters:

  • path - field path expression pointing to an array to unwind
  • (optional) includeArrayIndex - adds an index of an item in the unwinded array to the document under specified field name
  • (optional) preserveNullAndEmptyArrays - by default unwinding null or an empty array would not produce a single document. Setting this parameter to true whould make unwind produce a single document for null or empty array

Note

Stage can create documents with the same identifiers. Pipeline has to ensure each document has an unique ID in the output collection. Therefore, unwind stage should not be the last transformation stage.

Example using shorthand syntax

A previous stage returns following 2 documents:

[
  { "id": 1, "items": [ "a", "b" ] },
  { "id": 2, "items": [ 100, 200, 300 ] }
]

Applying unwind operation to a field items, like:

{
  "unwind": "$items"
}

will produce following 5 documents:

[
  { "id": 1, "items": "a" },
  { "id": 1, "items": "b" },
  { "id": 2, "items": 100 },
  { "id": 2, "items": 200 },
  { "id": 2, "items": 300 }
]

Example using full syntax

A previous stage returns following 5 documents:

[
  { "id": 1, "items": [ "a", "b" ] },
  { "id": 2, "items": [ "x" ] }
  { "id": 3, "items": [] },
  { "id": 4, "items": null },
  { "id": 5 }
]

Applying unwind operation to a field items including unwind index and preserving null and empty arrays, like:

{
  "unwind": {
    "path": "$items",
    "includeArrayIndex": "unwindIndex",
    "preserveNullAndEmptyArrays": true
  }
}

will produce following 6 documents:

[
  { "id": 1, "items": "a", "unwindIndex": 0 },
  { "id": 1, "items": "b", "unwindIndex": 1 },
  { "id": 2, "items": "x", "unwindIndex": 0 }
  { "id": 3, "unwindIndex": null },
  { "id": 4, "items": null, "unwindIndex": null },
  { "id": 5, "unwindIndex": null }
]

Project

Syntax: { "project": { <FIELD_NAME_1>: <EXPRESSION> ... } }.

Project stage tranforms a set of document fields into a new set of fields by keeping or removing existing fields and calculating new fields from the existing ones. Stage expects object where property names are field names, while property values are expressions resolving field value.

Expression can be a simple field path expression, e.g. $product.category.name, expression from expressions or a complex accumulator expression.

Accumulator expression in project stage comes in two flavors:

Syntax: { <ACCUMULATOR>: <EXPRESSION> }, where expression should resolve to an array of numeric values. Accumulator will take all numeric values from array and perform the operation.

Syntax: { <ACCUMULATOR>: [ <EXPRESSION_1>, <EXPRESSION_2> ... ] }, where expressions should resolve to a numeric value. Accumulator will process given array ignoring non-numeric values and perform the operation.

Accumulation operators:

Operator Description
$avg calculates average of numeric values in an array
$max takes highest value in an array
$min takes lowest value in an array
$stdDevPop calculates population standard deviation of values in an array
$stdDevSamp calculates sample standard deviation of values in an array
$sum calculates addition of values in an array

Example:

A previous stage returns following 5 documents:

[
  { "id": 1, "prices": [ 100, 200 ] },
  { "id": 2, "prices": [ 100 ] }
  { "id": 3, "prices": [ 500, 1000 ] },
  { "id": 4, "prices": [] },
  { "id": 5 }
]

Applying various accumulator operators to a prices field like:

{
  "project": {
    "average": { "$avg": "$prices" },
    "minimum": { "$min": "$prices" },
    "maximum": { "$max": "$prices" },
    "total": { "$sum": "$prices" }
  }
}

produces following 5 documents:

[
  { "id": 1, "average": 100.0, "minimum": 100, "maximum": 200, "total": 300 },
  { "id": 2, "average": 100.0, "minimum": 100, "maximum": 100, "total": 100 },
  { "id": 3, "average": 750.0, "minimum": 500, "maximum": 1000, "total": 1500 },
  { "id": 4, "average": null, "minimum": null, "maximum": null, "total": 0 },
  { "id": 5, "average": null, "minimum": null, "maximum": null, "total": 0 }
]

Output

Syntax: { "output": { "attributes": <ATTRIBUTES>, "relationships": <RELATIONSHIPS> } }.

Output stage expects object containing properties attributes and/or relationships. Both should have a value of type array containg field names. Output stage must be the last stage in the pipeline. It defines how document should be presented on the API endpoint. Pipelines without output stage will use all document fields as attributes.

Example:

Previous stage returns documents with numeric fields count, average, minimum and maximum which should be attributes in the API response. Documents also contains field product which is of type object containing properties type and id which should be a relationship in the API response. Following output stage defines attributes and relationships:

{
  "output": {
    "attributes": [ "count", "average", "minimum", "maximum" ],
    "relationships": [ "product" ]
  }
}

Expressions

Expression operators can accept one or many arguments.

Syntax for single argument operator: { <OPERATOR>: <EXPRESSION> }.

Syntax for multiple argument operator: { <OPERATOR>: [ <EXPRESSION_1>, <EXPRESSION_2> ... ] }.

Arithmetic operators

Depending on an arithemtic operation, arithmetic operator can accept one or many arguments.

Arithmetic operators:

Operator Accepts Returns
$abs one numeric value absolute value of a number
$add any number of numeric values addition of a given numeric values. Operator can accept single date value, treating other numeric values as miliseconds to add to a date producing new date as a result.
$ceil one numeric value least integer greater than or equal to a given number
$divide two numeric values division of a first by second number
$exp one numeric value natural exponent for the given value
$floor one numeric value greatest integer less than or equal to a given number
$ln one non-negative numeric value natural logarithm for the given value
$log two numeric values logarithm of a first using second as a base
$log10 one non-negative numeric value logarithm of the given value using value 10 as a base
$mod two numeric values remainder of a division of a first by second number
$multiply any number of numeric values multiplication of a given numeric values
$pow two numeric values exponentiation using first as a base and second as a exponent
$sqrt one non-negative numeric value square root of the given numeric value
$subtract two numeric values/dates or date and a numeric value subtraction of second from the first. When both values are numbers it returns numeric difference. When both values are dates it returns difference in miliseconds. When first is date and second is a number it returns date using second to substract first in miliseconds.
$trunc one numeric value truncation of the given numeric value to an integer value

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "price": 100.0, "vat": 1.17, "discount": 0.8 },
  { "id": 2, "price": 120.0, "vat": 1.17, "discount": 0.95 }
]

Calculating total price value multiplying base price by VAT and discount like:

{
  "project": {
    "total": { "$multiply": [ "$price", "$vat", "$discount" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "total": 93.6 },
  { "id": 2, "total": 133.38 }
]

Array operators

Array operators perform transformation operations based on one or many arrays.

Array operators:

Operator Accepts Returns
$arrayElemAt one array and one integer item from the given array using second argument as an item index in the array. When index is positive counting is performed from the start of an array and when negative from the end of an array
$concatArrays any number of arrays one new array containing items from all given arrays
$in one value and one array boolean value indicating whether first argument value is contained inside the given array
$range two or three integers sequence of integer values using first argument as initial value up to but not including second argument value where third argument is used as incrementing value which defaults to 1 when not specified
$reverseArray one array new array with the given array items in the reversed order
$size one array total item count
$slice one array and one or two integers subset of the given array taking second argument as starting item index (defaults to 0) and third as number of items to take. When second argument is non-negative counting goes from start of the array and when negative from end of the array.

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "accepted": [ "Alice", "John" ], "rejected": [ "Ann" ] },
  { "id": 2, "accepted": [ "Bob", "Mark" ], "rejected": [] }
]

Merging accepted and rejected fields into one field responded can be done using $concatArrays operator like:

{
  "project": {
    "responded": { "$concatArrays": [ "$accepted", "$rejected" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "responded": [ "Alice", "John", "Ann" ] },
  { "id": 2, "responded": [ "Bob", "Mark" ] }
]

Boolean operators

Boolean operators combine one or many logical expressions.

Boolean operators:

Operator Accepts Returns
$and any number of expressions true if all arguments are truthful, otherwise false
$not one expression true if argument is not truthful, otherwise false
$or any number of expressions true if any argument is truthful, otherwise false

Example:

A previous stage returns following 5 documents:

[
  { "id": 1, "person": "Alice", "responded": [ "Alice", "John" ] },
  { "id": 1, "person": "John", "responded": [ "Alice", "John" ] },
  { "id": 1, "person": "Ann", "responded": [ "Alice", "John" ] },
  { "id": 2, "person": "Bob", "responded": [ "Mark" ] },
  { "id": 2, "person": "Mark", "responded": [ "Mark" ] },
]

Previous stage unwraped invited people into a person field while keeping people who responded in a responded field. To keep a person field and introduce waitingForResponse field, following project stage can be used:

{
  "project": {
    "person": "$person",
    "waitingForResponse": { "$not": { "$in": [ "$person", "$responded" ] } }
  }
}

Stage will produce following 5 documents:

[
  { "id": 1, "person": "Alice", "waitingForResponse": false },
  { "id": 1, "person": "John", "waitingForResponse": false },
  { "id": 1, "person": "Ann", "waitingForResponse": true },
  { "id": 2, "person": "Bob", "waitingForResponse": true },
  { "id": 2, "person": "Mark", "waitingForResponse": false },
]

Comparison operators

Comparison operators are mainly used in filter stage to produce logical expressions. However, in project stage can be also used to store expression value in a field.

Filter stage expects document field to be specified against comparison operator using syntax: { <FIELD_NAME>: { <OPERATOR>: <EXPRESSION> }.

Comparison operators available in filter stage:

Operator Accepts Returns
$eq one expression true when field value is equivalent to expression result, otherwise false
$gt one expression true when field value is greater than expression result, otherwise false
$gte one expression true when field value is greater than or equivalent to expression result, otherwise false
$in any number of values true if field values is equivalent to any of the given value, otherwise false
$lt one expression true when field value is less than expression result, otherwise false
$lte one expression true when field value is less than or equivalent to expression result, otherwise false
$ne one expression false when field value is equivalent to expression result, otherwise true
$nin any number of values false if field values is equivalent to any of the given value, otherwise true

Example:

A previous stage returns following 3 documents:

[
  { "id": 1, "rejectedCount": 0 },
  { "id": 2, "rejectedCount": 0 },
  { "id": 3, "rejectedCount": 1 }
]

Take documents where rejectedCount is 1 or greater, like:

{
  "filter": {
    "rejectedCount": { "$gte": 1 }
  }
}

Filter stage produces following document:

[
  { "id": 3, "rejectedCount": 1 }
]

Comparison operators available in project stage:

Operator Accepts Returns
$cmp two expressions 1 when first expression result is greater than second, -1 when first expression result is less than second, otherwise 0 (they are equivalent)
$eq two expressions true when expression results are equivalent, otherwise false
$gt two expressions true when first expression result is greater than second, otherwise false
$gte two expressions true when first expression result is greater than or equivalent to second, otherwise false
$lt two expressions true when first expression result is less than second, otherwise false
$lte two expressions true when first expression result is less than or equivalent to second, otherwise false
$ne two expressions false when expression results are equivalent, otherwise true

Example:

A previous stage returns following 3 documents:

[
  { "id": 1, "rejectedCount": 0 },
  { "id": 2, "rejectedCount": 0 },
  { "id": 3, "rejectedCount": 1 }
]

Introduce field hasRejected containing boolean value indicating whether field rejectedCount is 1 or greater, like:

{
  "project": {
    "hasRejected": { "$gte": [ "$rejectedCount", 1 ] }
  }
}

Project stage produces following 3 documents:

[
  { "id": 1, "hasRejected": false },
  { "id": 2, "hasRejected": false },
  { "id": 3, "hasRejected": true }
]

Conditional operators

Conditional operators execute accepted expression when certain condition is satisfied.

Conditional operators:

Operator Accepts Returns
$cond three expressions first expression should resolve to a boolean value and if true, second expression is evaluated and returned, otherwise third expression is evaluated and returned
$ifNull two expressions first expression result when result is not null, otherwise returns second expression result

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "accepted": [ "Alice", "John" ], "rejected": [ "Ann" ] },
  { "id": 2, "accepted": [ "Bob", "Mark" ], "rejected": [] }
]

Adding field message with value Has rejected when rejected array contains one or more items, otherwise use value None rejected, like:

{
  "project": {
    "message": {
      "$cond": [
        {
          "$gte": [ { "$size": "$rejected" }, 1 ]
        },
        "Has rejected",
        "None rejected"
      ]
    }
  }
}

will produce following 2 documents:

[
  { "id": 1, "message": "Has rejected" },
  { "id": 2, "message": "None rejected" }
]

Literal operators

Literal operators supress evaluation of containing expressions.

Literal operators:

Operator Accepts Returns
$literal one expression expression as a value without evaluating it

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "accepted": [ "Alice", "John" ], "rejected": [ "Ann" ] },
  { "id": 2, "accepted": [ "Bob", "Mark" ], "rejected": [] }
]

Field path expression in acceptedEvaluated and acceptedLiteral where latter is wrapped with literal operator, like:

{
  "project": {
    "acceptedEvaluated": "$accepted",
    "acceptedLiteral": { "$literal": "$accepted" }
  }
}

will produce following 2 documents:

[
  { "id": 1, "acceptedEvaluated": [ "Alice", "John" ], "acceptedLiteral": "$accepted" },
  { "id": 2, "acceptedEvaluated": [ "Bob", "Mark" ], "acceptedLiteral": "$accepted" }
]

Object operators

Object operators perform transformation operations based on one or many objects.

Object operators:

Operator Accepts Returns
$mergeObjects any number of objects new object containing all keys/values from the given objects. Merging is done from left to right, meaning when two or more objects have the same key, values of an object which is last in argument list will be taken.

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "f1": { "a": 1 }, "f2": { "b": 2 } },
  { "id": 2, "f1": { "a": 2 }, "f2": { "b": 3 } }
]

Merge objects under fields f1 and f2 as new field r, like:

{
  "project": {
    "r": { "$mergeObjects": [ "$f1", "$f2" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "r": { "a": 1, "b": 2 } },
  { "id": 2, "r": { "a": 2, "b": 3 } }
]

Set operators

Set operations perform transformation operations based on one or many sets, i.e. arrays with distinct items.

Set operators:

Operator Accepts Returns
$allElementsTrue one array true when all values in the array are truthful, otherwise false
$anyElementTrue one array true when some value in the array is truthful, otherwise false
$setDifference two arrays new array with items from the first array which are not contained in the second
$setEquals any number of arrays true when arrays have same distinct items, otherwise false
$setIntersection any number of arrays new array with items from the first array which are contained in all other arrays
$setIsSubset two arrays true when first array items are contained in the second, otherwise false
$setUnion any number of arrays new array containing distinct items from all arrays

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "a": [ 1, 2, 3 ], "b": [ 4, 5 ] },
  { "id": 2, "a": [ 1, 2, 3 ], "b": [ 2, 3, 4 ] }
]

Creating a field all containing distinct items from fields a and b, like:

{
  "project": {
    "all": { "$setUnion": [ "$a", "$b" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "all": [ 1, 2, 3, 4, 5 ] },
  { "id": 2, "all": [ 1, 2, 3, 4 ] }
]

String operators

String operators perform transformation operations based on one or many values of type string.

String operators:

Operator Accepts Returns
$concat any number of strings new string by joining all given strings. If any argument resolves to null, operation returns null.

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "firstName": "John", "lastName": "Smith" },
  { "id": 2, "firstName": "John", "lastName": "Doe" }
]

Concatenating fields firstName and lastName to form field fullName, like:

{
  "project": {
    "fullName": { "$concat": [ "$firstName", " ", "$lastName" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "fullName": "John Smith" },
  { "id": 2, "fullName": "John Doe" }
]

Type operators

Type operators are able to accept any data type.

Operator Accepts Returns
$toObjectId one expression creates object ID of the given value. If value is null, operation returns null.
$toString one expression string representation of the given value. If value is null, operation returns null.

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "processed": 5, "total": 20 },
  { "id": 2, "processed": 3, "total": 10 }
]

String operator $concat accepts only string values. In order to create string value based on integer fields, operator $toString can be used:

{
  "project": {
    "message": {
      "$concat": [
        "Processed ",
        { "$toString": "$processed" },
        " out of ",
        { "$toString": "$total" } ]
    }
  }
}

Project stage will produce following 2 documents:

[
  { "id": 1, "message": "Processed 5 out of 20" },
  { "id": 2, "message": "Processed 3 out of 10" }
]