Aggregate¶
An aggregate is a document of type core.aggregate containing attributes:
- input-type - string containing name of an input type, i.e. persisted collection type
- output-type - string containing name of an output type, i.e. created aggregated collection type, AGT
- parameters - (optional) object containing default parameters used in aggregation pipeline
- pipeline - array of a pipeline stages forming aggregation pipeline which takes input type documents and produces output type documents
An aggregate introduces virtual (non-persisted) collection into the system based on a series of transformations applied to a persisted collection.
Stages¶
First stage in the pipeline receives input-type documents as its input documents. A stage performs transformation using specified operators and produces output documents. Second and every following stage receives output documents of a previous stage as its input documents. Last stage output documents are used as documents for output-type collection.
Pipeline stages have the following syntax: { "<STAGE_NAME>": <STAGE_EXPRESSION> }
.
Supported stages:
Stage | Description |
---|---|
group | combines documents by the given expression and produces accumulated fields |
filter | takes documents which pass the given condition and drops the rest |
sample | takes specified number of documents randomly and drops the rest |
sort | arranges documents by the given fields |
skip | drops leading specified number of documents and takes the rest |
limit | takes leading specified number of documents and drops the rest |
unwind | takes the given array field and for each item in the array creates one document |
project | creates fields using expressions and existing document fields |
output | defines document attributes and relationships |
Group¶
Syntax: { "group": { "id": <EXPRESSION>, <FIELD_NAME_1>: { <ACCUMULATOR> : <EXPRESSION> } ... } }
.
Optional property id defines criteria by which groups are formed. When not defined operator performs accumulation against one group containing all documents.
Other properties are treated as accumulator fields.
Accumulation operators:
Operator | Description |
---|---|
$avg | calculates average of numeric values in a group |
$first | takes first document value in a group |
$last | takes last document value in a group |
$max | takes highest value in a group |
$min | takes lowest value in a group |
$push | creates array of values in a group |
$addToSet | creates array of distinct values in a group |
$stdDevPop | calculates population standard deviation of values in a group |
$stdDevSamp | calculates sample standard deviation of values in a group |
$sum | calculates addition of values in a group |
$mergeObjects | takes object document values and merges them into one new object |
Accumulator expression should return value needed by the operator. In most cases field path expression are used for accumulator expressions. Valid field path is a value of type string starting with $ followed by property names joined by . (dot), e.g. $order.customer.name.
Example:
Previous stage returns product review documents where each contains an attribute score and a relationship product. To create statistics per product containing number of reviews, average, minimum and maximum score, following expression can be used:
{
"group": {
"id": "$product",
"count": { "$sum": 1 },
"average": { "$avg": "$score" },
"minimum": { "$min": "$score" },
"maximum": { "$max": "$score" }
}
}
Filter¶
Syntax: { "filter": { <COMPARISON_EXPRESSION> } }
.
Filter stage supports boolean operators and comparison operators. When expression result is true, document is kept in the pipeline, otherwise is omitted.
Example:
Previous stage returns documents containing attribute score. Taking only documents which have value of attribute score greater than 300 can be achieved using expression:
{
"filter": {
"score": { "$gt": 300 }
}
}
Sample¶
Syntax: { "sample": { "size": <SIZE> } }
.
Property size must be a pozitive integer which defines number of documents to take.
Sample stage will always take randomly documents when pipeline is executed. However, due to internal caching mechanisms, when previous stage returns documents that have not changed, sample stage can also return the same documents and not randomly take new ones.
Example:
Previous stage returns 1000 documents. To take at random 50 documents, following expression can be used:
{
"sample": {
"size": 50
}
}
Sort¶
Syntax: { "sort": <ATTRIBUTE_NAMES> }
, where ATTRIBUTE_NAMES is a comma-separated list of attributes to sort by. Default sort order is ascending. Prefix attribute name with - (minus) to sort in descending order.
Example:
Previous stage returns documents with attributes name and rank. To sort by attribute name in ascending order and then by attribute rank in descending order, following expression can be used:
{
"sort": "name,-rank"
}
Skip¶
Syntax: { "skip": <SKIP> }
.
Skip stage expects pozitive integer value which specifies maximum number of leading documents to drop.
Example:
Previous stage returns sorted documents where first 3 are not needed. To drop them, following expression can be used:
{
"skip": 3
}
Limit¶
Syntax: { "limit": <LIMIT> }
.
Limit stage expects pozitive integer value which specifies maximum number of leading documents to take.
Example:
Previous stage returns sorted documents where only first 100 are needed. To drop the rest, following expression can be used:
{
"limit": 100
}
Unwind¶
Shorthand syntax: { "unwind": <FIELD_PATH_EXPRESSION> }
.
Unwind stage expects field path expression pointing to an array field. For each item in the array, stage will produce one document containing all fields including the ID. All fields will have the same value except array field which will no longer be array, but just one item from the array.
Full syntax:
{
"unwind": {
"path": <FIELD_PATH_EXPRESSION>,
"includeArrayIndex": <FIELD_NAME>,
"preserveNullAndEmptyArrays": <BOOLEAN>
}
}
Parameters:
- path - field path expression pointing to an array to unwind
- (optional) includeArrayIndex - adds an index of an item in the unwinded array to the document under specified field name
- (optional) preserveNullAndEmptyArrays - by default unwinding null or an empty array would not produce a single document. Setting this parameter to true whould make unwind produce a single document for null or empty array
Note
Stage can create documents with the same identifiers. Pipeline has to ensure each document has an unique ID in the output collection. Therefore, unwind stage should not be the last transformation stage.
Example using shorthand syntax¶
A previous stage returns following 2 documents:
[
{ "id": 1, "items": [ "a", "b" ] },
{ "id": 2, "items": [ 100, 200, 300 ] }
]
Applying unwind operation to a field items, like:
{
"unwind": "$items"
}
will produce following 5 documents:
[
{ "id": 1, "items": "a" },
{ "id": 1, "items": "b" },
{ "id": 2, "items": 100 },
{ "id": 2, "items": 200 },
{ "id": 2, "items": 300 }
]
Example using full syntax¶
A previous stage returns following 5 documents:
[
{ "id": 1, "items": [ "a", "b" ] },
{ "id": 2, "items": [ "x" ] }
{ "id": 3, "items": [] },
{ "id": 4, "items": null },
{ "id": 5 }
]
Applying unwind operation to a field items including unwind index and preserving null and empty arrays, like:
{
"unwind": {
"path": "$items",
"includeArrayIndex": "unwindIndex",
"preserveNullAndEmptyArrays": true
}
}
will produce following 6 documents:
[
{ "id": 1, "items": "a", "unwindIndex": 0 },
{ "id": 1, "items": "b", "unwindIndex": 1 },
{ "id": 2, "items": "x", "unwindIndex": 0 }
{ "id": 3, "unwindIndex": null },
{ "id": 4, "items": null, "unwindIndex": null },
{ "id": 5, "unwindIndex": null }
]
Project¶
Syntax: { "project": { <FIELD_NAME_1>: <EXPRESSION> ... } }
.
Project stage tranforms a set of document fields into a new set of fields by keeping or removing existing fields and calculating new fields from the existing ones. Stage expects object where property names are field names, while property values are expressions resolving field value.
Expression can be a simple field path expression, e.g. $product.category.name, expression from expressions or a complex accumulator expression.
Accumulator expression in project stage comes in two flavors:
Syntax: { <ACCUMULATOR>: <EXPRESSION> }
, where expression should resolve to an array of numeric values. Accumulator will take all numeric values from array and perform the operation.
Syntax: { <ACCUMULATOR>: [ <EXPRESSION_1>, <EXPRESSION_2> ... ] }
, where expressions should resolve to a numeric value. Accumulator will process given array ignoring non-numeric values and perform the operation.
Accumulation operators:
Operator | Description |
---|---|
$avg | calculates average of numeric values in an array |
$max | takes highest value in an array |
$min | takes lowest value in an array |
$stdDevPop | calculates population standard deviation of values in an array |
$stdDevSamp | calculates sample standard deviation of values in an array |
$sum | calculates addition of values in an array |
Example:
A previous stage returns following 5 documents:
[
{ "id": 1, "prices": [ 100, 200 ] },
{ "id": 2, "prices": [ 100 ] }
{ "id": 3, "prices": [ 500, 1000 ] },
{ "id": 4, "prices": [] },
{ "id": 5 }
]
Applying various accumulator operators to a prices field like:
{
"project": {
"average": { "$avg": "$prices" },
"minimum": { "$min": "$prices" },
"maximum": { "$max": "$prices" },
"total": { "$sum": "$prices" }
}
}
produces following 5 documents:
[
{ "id": 1, "average": 100.0, "minimum": 100, "maximum": 200, "total": 300 },
{ "id": 2, "average": 100.0, "minimum": 100, "maximum": 100, "total": 100 },
{ "id": 3, "average": 750.0, "minimum": 500, "maximum": 1000, "total": 1500 },
{ "id": 4, "average": null, "minimum": null, "maximum": null, "total": 0 },
{ "id": 5, "average": null, "minimum": null, "maximum": null, "total": 0 }
]
Output¶
Syntax: { "output": { "attributes": <ATTRIBUTES>, "relationships": <RELATIONSHIPS> } }
.
Output stage expects object containing properties attributes and/or relationships. Both should have a value of type array containg field names. Output stage must be the last stage in the pipeline. It defines how document should be presented on the API endpoint. Pipelines without output stage will use all document fields as attributes.
Example:
Previous stage returns documents with numeric fields count, average, minimum and maximum which should be attributes in the API response. Documents also contains field product which is of type object containing properties type and id which should be a relationship in the API response. Following output stage defines attributes and relationships:
{
"output": {
"attributes": [ "count", "average", "minimum", "maximum" ],
"relationships": [ "product" ]
}
}
Expressions¶
Expression operators can accept one or many arguments.
Syntax for single argument operator: { <OPERATOR>: <EXPRESSION> }
.
Syntax for multiple argument operator: { <OPERATOR>: [ <EXPRESSION_1>, <EXPRESSION_2> ... ] }
.
Arithmetic operators¶
Depending on an arithemtic operation, arithmetic operator can accept one or many arguments.
Arithmetic operators:
Operator | Accepts | Returns |
---|---|---|
$abs | one numeric value | absolute value of a number |
$add | any number of numeric values | addition of a given numeric values. Operator can accept single date value, treating other numeric values as miliseconds to add to a date producing new date as a result. |
$ceil | one numeric value | least integer greater than or equal to a given number |
$divide | two numeric values | division of a first by second number |
$exp | one numeric value | natural exponent for the given value |
$floor | one numeric value | greatest integer less than or equal to a given number |
$ln | one non-negative numeric value | natural logarithm for the given value |
$log | two numeric values | logarithm of a first using second as a base |
$log10 | one non-negative numeric value | logarithm of the given value using value 10 as a base |
$mod | two numeric values | remainder of a division of a first by second number |
$multiply | any number of numeric values | multiplication of a given numeric values |
$pow | two numeric values | exponentiation using first as a base and second as a exponent |
$sqrt | one non-negative numeric value | square root of the given numeric value |
$subtract | two numeric values/dates or date and a numeric value | subtraction of second from the first. When both values are numbers it returns numeric difference. When both values are dates it returns difference in miliseconds. When first is date and second is a number it returns date using second to substract first in miliseconds. |
$trunc | one numeric value | truncation of the given numeric value to an integer value |
Example:
A previous stage returns following 2 documents:
[
{ "id": 1, "price": 100.0, "vat": 1.17, "discount": 0.8 },
{ "id": 2, "price": 120.0, "vat": 1.17, "discount": 0.95 }
]
Calculating total price value multiplying base price by VAT and discount like:
{
"project": {
"total": { "$multiply": [ "$price", "$vat", "$discount" ] }
}
}
will produce following 2 documents:
[
{ "id": 1, "total": 93.6 },
{ "id": 2, "total": 133.38 }
]
Array operators¶
Array operators perform transformation operations based on one or many arrays.
Array operators:
Operator | Accepts | Returns |
---|---|---|
$arrayElemAt | one array and one integer | item from the given array using second argument as an item index in the array. When index is positive counting is performed from the start of an array and when negative from the end of an array |
$concatArrays | any number of arrays | one new array containing items from all given arrays |
$in | one value and one array | boolean value indicating whether first argument value is contained inside the given array |
$range | two or three integers | sequence of integer values using first argument as initial value up to but not including second argument value where third argument is used as incrementing value which defaults to 1 when not specified |
$reverseArray | one array | new array with the given array items in the reversed order |
$size | one array | total item count |
$slice | one array and one or two integers | subset of the given array taking second argument as starting item index (defaults to 0) and third as number of items to take. When second argument is non-negative counting goes from start of the array and when negative from end of the array. |
Example:
A previous stage returns following 2 documents:
[
{ "id": 1, "accepted": [ "Alice", "John" ], "rejected": [ "Ann" ] },
{ "id": 2, "accepted": [ "Bob", "Mark" ], "rejected": [] }
]
Merging accepted and rejected fields into one field responded can be done using $concatArrays operator like:
{
"project": {
"responded": { "$concatArrays": [ "$accepted", "$rejected" ] }
}
}
will produce following 2 documents:
[
{ "id": 1, "responded": [ "Alice", "John", "Ann" ] },
{ "id": 2, "responded": [ "Bob", "Mark" ] }
]
Boolean operators¶
Boolean operators combine one or many logical expressions.
Boolean operators:
Operator | Accepts | Returns |
---|---|---|
$and | any number of expressions | true if all arguments are truthful, otherwise false |
$not | one expression | true if argument is not truthful, otherwise false |
$or | any number of expressions | true if any argument is truthful, otherwise false |
Example:
A previous stage returns following 5 documents:
[
{ "id": 1, "person": "Alice", "responded": [ "Alice", "John" ] },
{ "id": 1, "person": "John", "responded": [ "Alice", "John" ] },
{ "id": 1, "person": "Ann", "responded": [ "Alice", "John" ] },
{ "id": 2, "person": "Bob", "responded": [ "Mark" ] },
{ "id": 2, "person": "Mark", "responded": [ "Mark" ] },
]
Previous stage unwraped invited people into a person field while keeping people who responded in a responded field. To keep a person field and introduce waitingForResponse field, following project stage can be used:
{
"project": {
"person": "$person",
"waitingForResponse": { "$not": { "$in": [ "$person", "$responded" ] } }
}
}
Stage will produce following 5 documents:
[
{ "id": 1, "person": "Alice", "waitingForResponse": false },
{ "id": 1, "person": "John", "waitingForResponse": false },
{ "id": 1, "person": "Ann", "waitingForResponse": true },
{ "id": 2, "person": "Bob", "waitingForResponse": true },
{ "id": 2, "person": "Mark", "waitingForResponse": false },
]
Comparison operators¶
Comparison operators are mainly used in filter stage to produce logical expressions. However, in project stage can be also used to store expression value in a field.
Filter stage expects document field to be specified against comparison operator using syntax: { <FIELD_NAME>: { <OPERATOR>: <EXPRESSION> }
.
Comparison operators available in filter stage:
Operator | Accepts | Returns |
---|---|---|
$eq | one expression | true when field value is equivalent to expression result, otherwise false |
$gt | one expression | true when field value is greater than expression result, otherwise false |
$gte | one expression | true when field value is greater than or equivalent to expression result, otherwise false |
$in | any number of values | true if field values is equivalent to any of the given value, otherwise false |
$lt | one expression | true when field value is less than expression result, otherwise false |
$lte | one expression | true when field value is less than or equivalent to expression result, otherwise false |
$ne | one expression | false when field value is equivalent to expression result, otherwise true |
$nin | any number of values | false if field values is equivalent to any of the given value, otherwise true |
Example:
A previous stage returns following 3 documents:
[
{ "id": 1, "rejectedCount": 0 },
{ "id": 2, "rejectedCount": 0 },
{ "id": 3, "rejectedCount": 1 }
]
Take documents where rejectedCount is 1 or greater, like:
{
"filter": {
"rejectedCount": { "$gte": 1 }
}
}
Filter stage produces following document:
[
{ "id": 3, "rejectedCount": 1 }
]
Comparison operators available in project stage:
Operator | Accepts | Returns |
---|---|---|
$cmp | two expressions | 1 when first expression result is greater than second, -1 when first expression result is less than second, otherwise 0 (they are equivalent) |
$eq | two expressions | true when expression results are equivalent, otherwise false |
$gt | two expressions | true when first expression result is greater than second, otherwise false |
$gte | two expressions | true when first expression result is greater than or equivalent to second, otherwise false |
$lt | two expressions | true when first expression result is less than second, otherwise false |
$lte | two expressions | true when first expression result is less than or equivalent to second, otherwise false |
$ne | two expressions | false when expression results are equivalent, otherwise true |
Example:
A previous stage returns following 3 documents:
[
{ "id": 1, "rejectedCount": 0 },
{ "id": 2, "rejectedCount": 0 },
{ "id": 3, "rejectedCount": 1 }
]
Introduce field hasRejected containing boolean value indicating whether field rejectedCount is 1 or greater, like:
{
"project": {
"hasRejected": { "$gte": [ "$rejectedCount", 1 ] }
}
}
Project stage produces following 3 documents:
[
{ "id": 1, "hasRejected": false },
{ "id": 2, "hasRejected": false },
{ "id": 3, "hasRejected": true }
]
Conditional operators¶
Conditional operators execute accepted expression when certain condition is satisfied.
Conditional operators:
Operator | Accepts | Returns |
---|---|---|
$cond | three expressions | first expression should resolve to a boolean value and if true, second expression is evaluated and returned, otherwise third expression is evaluated and returned |
$ifNull | two expressions | first expression result when result is not null, otherwise returns second expression result |
Example:
A previous stage returns following 2 documents:
[
{ "id": 1, "accepted": [ "Alice", "John" ], "rejected": [ "Ann" ] },
{ "id": 2, "accepted": [ "Bob", "Mark" ], "rejected": [] }
]
Adding field message with value Has rejected when rejected array contains one or more items, otherwise use value None rejected, like:
{
"project": {
"message": {
"$cond": [
{
"$gte": [ { "$size": "$rejected" }, 1 ]
},
"Has rejected",
"None rejected"
]
}
}
}
will produce following 2 documents:
[
{ "id": 1, "message": "Has rejected" },
{ "id": 2, "message": "None rejected" }
]
Literal operators¶
Literal operators supress evaluation of containing expressions.
Literal operators:
Operator | Accepts | Returns |
---|---|---|
$literal | one expression | expression as a value without evaluating it |
Example:
A previous stage returns following 2 documents:
[
{ "id": 1, "accepted": [ "Alice", "John" ], "rejected": [ "Ann" ] },
{ "id": 2, "accepted": [ "Bob", "Mark" ], "rejected": [] }
]
Field path expression in acceptedEvaluated and acceptedLiteral where latter is wrapped with literal operator, like:
{
"project": {
"acceptedEvaluated": "$accepted",
"acceptedLiteral": { "$literal": "$accepted" }
}
}
will produce following 2 documents:
[
{ "id": 1, "acceptedEvaluated": [ "Alice", "John" ], "acceptedLiteral": "$accepted" },
{ "id": 2, "acceptedEvaluated": [ "Bob", "Mark" ], "acceptedLiteral": "$accepted" }
]
Object operators¶
Object operators perform transformation operations based on one or many objects.
Object operators:
Operator | Accepts | Returns |
---|---|---|
$mergeObjects | any number of objects | new object containing all keys/values from the given objects. Merging is done from left to right, meaning when two or more objects have the same key, values of an object which is last in argument list will be taken. |
Example:
A previous stage returns following 2 documents:
[
{ "id": 1, "f1": { "a": 1 }, "f2": { "b": 2 } },
{ "id": 2, "f1": { "a": 2 }, "f2": { "b": 3 } }
]
Merge objects under fields f1 and f2 as new field r, like:
{
"project": {
"r": { "$mergeObjects": [ "$f1", "$f2" ] }
}
}
will produce following 2 documents:
[
{ "id": 1, "r": { "a": 1, "b": 2 } },
{ "id": 2, "r": { "a": 2, "b": 3 } }
]
Set operators¶
Set operations perform transformation operations based on one or many sets, i.e. arrays with distinct items.
Set operators:
Operator | Accepts | Returns |
---|---|---|
$allElementsTrue | one array | true when all values in the array are truthful, otherwise false |
$anyElementTrue | one array | true when some value in the array is truthful, otherwise false |
$setDifference | two arrays | new array with items from the first array which are not contained in the second |
$setEquals | any number of arrays | true when arrays have same distinct items, otherwise false |
$setIntersection | any number of arrays | new array with items from the first array which are contained in all other arrays |
$setIsSubset | two arrays | true when first array items are contained in the second, otherwise false |
$setUnion | any number of arrays | new array containing distinct items from all arrays |
Example:
A previous stage returns following 2 documents:
[
{ "id": 1, "a": [ 1, 2, 3 ], "b": [ 4, 5 ] },
{ "id": 2, "a": [ 1, 2, 3 ], "b": [ 2, 3, 4 ] }
]
Creating a field all containing distinct items from fields a and b, like:
{
"project": {
"all": { "$setUnion": [ "$a", "$b" ] }
}
}
will produce following 2 documents:
[
{ "id": 1, "all": [ 1, 2, 3, 4, 5 ] },
{ "id": 2, "all": [ 1, 2, 3, 4 ] }
]
String operators¶
String operators perform transformation operations based on one or many values of type string.
String operators:
Operator | Accepts | Returns |
---|---|---|
$concat | any number of strings | new string by joining all given strings. If any argument resolves to null, operation returns null. |
Example:
A previous stage returns following 2 documents:
[
{ "id": 1, "firstName": "John", "lastName": "Smith" },
{ "id": 2, "firstName": "John", "lastName": "Doe" }
]
Concatenating fields firstName and lastName to form field fullName, like:
{
"project": {
"fullName": { "$concat": [ "$firstName", " ", "$lastName" ] }
}
}
will produce following 2 documents:
[
{ "id": 1, "fullName": "John Smith" },
{ "id": 2, "fullName": "John Doe" }
]
Type operators¶
Type operators are able to accept any data type.
Operator | Accepts | Returns |
---|---|---|
$toObjectId | one expression | creates object ID of the given value. If value is null, operation returns null. |
$toString | one expression | string representation of the given value. If value is null, operation returns null. |
Example:
A previous stage returns following 2 documents:
[
{ "id": 1, "processed": 5, "total": 20 },
{ "id": 2, "processed": 3, "total": 10 }
]
String operator $concat accepts only string values. In order to create string value based on integer fields, operator $toString can be used:
{
"project": {
"message": {
"$concat": [
"Processed ",
{ "$toString": "$processed" },
" out of ",
{ "$toString": "$total" } ]
}
}
}
Project stage will produce following 2 documents:
[
{ "id": 1, "message": "Processed 5 out of 20" },
{ "id": 2, "message": "Processed 3 out of 10" }
]