Aggregate¶

An aggregate is a document of type core.aggregate containing attributes:

input-type - string containing name of an input type, i.e. persisted collection type
output-type - string containing name of an output type, i.e. created aggregated collection type, AGT
parameters - (optional) object containing default parameters used in aggregation pipeline
pipeline - array of a pipeline stages forming aggregation pipeline which takes input type documents and produces output type documents

An aggregate introduces virtual (non-persisted) collection into the system based on a series of transformations applied to a persisted collection.

Stages¶

First stage in the pipeline receives input-type documents as its input documents. A stage performs transformation using specified operators and produces output documents. Second and every following stage receives output documents of a previous stage as its input documents. Last stage output documents are used as documents for output-type collection.

Pipeline stages have the following syntax: { "<STAGE_NAME>": <STAGE_EXPRESSION> }.

Supported stages:

Stage	Description
group	combines documents by the given expression and produces accumulated fields
filter	takes documents which pass the given condition and drops the rest
sample	takes specified number of documents randomly and drops the rest
sort	arranges documents by the given fields
skip	drops leading specified number of documents and takes the rest
limit	takes leading specified number of documents and drops the rest
unwind	takes the given array field and for each item in the array creates one document
project	creates fields using expressions and existing document fields
output	defines document attributes and relationships

Group¶

Syntax: { "group": { "id": <EXPRESSION>, <FIELD_NAME_1>: { <ACCUMULATOR> : <EXPRESSION> } ... } }.

Optional property id defines criteria by which groups are formed. When not defined operator performs accumulation against one group containing all documents.

Other properties are treated as accumulator fields.

Accumulation operators:

Operator	Description
$avg	calculates average of numeric values in a group
$first	takes first document value in a group
$last	takes last document value in a group
$max	takes highest value in a group
$min	takes lowest value in a group
$push	creates array of values in a group
$addToSet	creates array of distinct values in a group
$stdDevPop	calculates population standard deviation of values in a group
$stdDevSamp	calculates sample standard deviation of values in a group
$sum	calculates addition of values in a group
$mergeObjects	takes object document values and merges them into one new object

Accumulator expression should return value needed by the operator. In most cases field path expression are used for accumulator expressions. Valid field path is a value of type string starting with $ followed by property names joined by . (dot), e.g. $order.customer.name.

Example:

Previous stage returns product review documents where each contains an attribute score and a relationship product. To create statistics per product containing number of reviews, average, minimum and maximum score, following expression can be used:

{
  "group": {
    "id": "$product",
    "count": { "$sum": 1 },
    "average": { "$avg": "$score" },
    "minimum": { "$min": "$score" },
    "maximum": { "$max": "$score" }
  }
}

Filter¶

Syntax: { "filter": { <COMPARISON_EXPRESSION> } }.

Filter stage supports boolean operators and comparison operators. When expression result is true, document is kept in the pipeline, otherwise is omitted.

Example:

Previous stage returns documents containing attribute score. Taking only documents which have value of attribute score greater than 300 can be achieved using expression:

{
  "filter": {
    "score": { "$gt": 300 }
  }
}

Sample¶

Syntax: { "sample": { "size": <SIZE> } }.

Property size must be a pozitive integer which defines number of documents to take.

Sample stage will always take randomly documents when pipeline is executed. However, due to internal caching mechanisms, when previous stage returns documents that have not changed, sample stage can also return the same documents and not randomly take new ones.

Example:

Previous stage returns 1000 documents. To take at random 50 documents, following expression can be used:

{
  "sample": {
    "size": 50
  }
}

Sort¶

Syntax: { "sort": <ATTRIBUTE_NAMES> }, where ATTRIBUTE_NAMES is a comma-separated list of attributes to sort by. Default sort order is ascending. Prefix attribute name with - (minus) to sort in descending order.

Example:

Previous stage returns documents with attributes name and rank. To sort by attribute name in ascending order and then by attribute rank in descending order, following expression can be used:

{
  "sort": "name,-rank"
}

Skip¶

Syntax: { "skip": <SKIP> }.

Skip stage expects pozitive integer value which specifies maximum number of leading documents to drop.

Example:

Previous stage returns sorted documents where first 3 are not needed. To drop them, following expression can be used:

{
  "skip": 3
}

Limit¶

Syntax: { "limit": <LIMIT> }.

Limit stage expects pozitive integer value which specifies maximum number of leading documents to take.

Example:

Previous stage returns sorted documents where only first 100 are needed. To drop the rest, following expression can be used:

{
  "limit": 100
}

Unwind¶

Shorthand syntax: { "unwind": <FIELD_PATH_EXPRESSION> }.

Unwind stage expects field path expression pointing to an array field. For each item in the array, stage will produce one document containing all fields including the ID. All fields will have the same value except array field which will no longer be array, but just one item from the array.

Full syntax:

{
  "unwind": {
    "path": <FIELD_PATH_EXPRESSION>,
    "includeArrayIndex": <FIELD_NAME>,
    "preserveNullAndEmptyArrays": <BOOLEAN>
  }
}

Parameters:

path - field path expression pointing to an array to unwind
(optional) includeArrayIndex - adds an index of an item in the unwinded array to the document under specified field name
(optional) preserveNullAndEmptyArrays - by default unwinding null or an empty array would not produce a single document. Setting this parameter to true whould make unwind produce a single document for null or empty array

Note

Stage can create documents with the same identifiers. Pipeline has to ensure each document has an unique ID in the output collection. Therefore, unwind stage should not be the last transformation stage.

Example using shorthand syntax¶

A previous stage returns following 2 documents:

[
  { "id": 1, "items": [ "a", "b" ] },
  { "id": 2, "items": [ 100, 200, 300 ] }
]

Applying unwind operation to a field items, like:

{
  "unwind": "$items"
}

will produce following 5 documents:

[
  { "id": 1, "items": "a" },
  { "id": 1, "items": "b" },
  { "id": 2, "items": 100 },
  { "id": 2, "items": 200 },
  { "id": 2, "items": 300 }
]

Example using full syntax¶

A previous stage returns following 5 documents:

[
  { "id": 1, "items": [ "a", "b" ] },
  { "id": 2, "items": [ "x" ] }
  { "id": 3, "items": [] },
  { "id": 4, "items": null },
  { "id": 5 }
]

Applying unwind operation to a field items including unwind index and preserving null and empty arrays, like:

{
  "unwind": {
    "path": "$items",
    "includeArrayIndex": "unwindIndex",
    "preserveNullAndEmptyArrays": true
  }
}

will produce following 6 documents:

[
  { "id": 1, "items": "a", "unwindIndex": 0 },
  { "id": 1, "items": "b", "unwindIndex": 1 },
  { "id": 2, "items": "x", "unwindIndex": 0 }
  { "id": 3, "unwindIndex": null },
  { "id": 4, "items": null, "unwindIndex": null },
  { "id": 5, "unwindIndex": null }
]

Project¶

Syntax: { "project": { <FIELD_NAME_1>: <EXPRESSION> ... } }.

Project stage tranforms a set of document fields into a new set of fields by keeping or removing existing fields and calculating new fields from the existing ones. Stage expects object where property names are field names, while property values are expressions resolving field value.

Expression can be a simple field path expression, e.g. $product.category.name, expression from expressions or a complex accumulator expression.

Accumulator expression in project stage comes in two flavors:

Syntax: { <ACCUMULATOR>: <EXPRESSION> }, where expression should resolve to an array of numeric values. Accumulator will take all numeric values from array and perform the operation.

Syntax: { <ACCUMULATOR>: [ <EXPRESSION_1>, <EXPRESSION_2> ... ] }, where expressions should resolve to a numeric value. Accumulator will process given array ignoring non-numeric values and perform the operation.

Accumulation operators:

Operator	Description
$avg	calculates average of numeric values in an array
$max	takes highest value in an array
$min	takes lowest value in an array
$stdDevPop	calculates population standard deviation of values in an array
$stdDevSamp	calculates sample standard deviation of values in an array
$sum	calculates addition of values in an array

Example:

A previous stage returns following 5 documents:

[
  { "id": 1, "prices": [ 100, 200 ] },
  { "id": 2, "prices": [ 100 ] }
  { "id": 3, "prices": [ 500, 1000 ] },
  { "id": 4, "prices": [] },
  { "id": 5 }
]

Applying various accumulator operators to a prices field like:

{
  "project": {
    "average": { "$avg": "$prices" },
    "minimum": { "$min": "$prices" },
    "maximum": { "$max": "$prices" },
    "total": { "$sum": "$prices" }
  }
}

produces following 5 documents:

[
  { "id": 1, "average": 100.0, "minimum": 100, "maximum": 200, "total": 300 },
  { "id": 2, "average": 100.0, "minimum": 100, "maximum": 100, "total": 100 },
  { "id": 3, "average": 750.0, "minimum": 500, "maximum": 1000, "total": 1500 },
  { "id": 4, "average": null, "minimum": null, "maximum": null, "total": 0 },
  { "id": 5, "average": null, "minimum": null, "maximum": null, "total": 0 }
]

Output¶

Syntax: { "output": { "attributes": <ATTRIBUTES>, "relationships": <RELATIONSHIPS> } }.

Output stage expects object containing properties attributes and/or relationships. Both should have a value of type array containg field names. Output stage must be the last stage in the pipeline. It defines how document should be presented on the API endpoint. Pipelines without output stage will use all document fields as attributes.

Example:

Previous stage returns documents with numeric fields count, average, minimum and maximum which should be attributes in the API response. Documents also contains field product which is of type object containing properties type and id which should be a relationship in the API response. Following output stage defines attributes and relationships:

{
  "output": {
    "attributes": [ "count", "average", "minimum", "maximum" ],
    "relationships": [ "product" ]
  }
}

Expressions¶

Expression operators can accept one or many arguments.

Syntax for single argument operator: { <OPERATOR>: <EXPRESSION> }.

Syntax for multiple argument operator: { <OPERATOR>: [ <EXPRESSION_1>, <EXPRESSION_2> ... ] }.

Arithmetic operators¶

Depending on an arithemtic operation, arithmetic operator can accept one or many arguments.

Arithmetic operators:

Operator	Accepts	Returns
$abs	one numeric value	absolute value of a number
$add	any number of numeric values	addition of a given numeric values. Operator can accept single date value, treating other numeric values as miliseconds to add to a date producing new date as a result.
$ceil	one numeric value	least integer greater than or equal to a given number
$divide	two numeric values	division of a first by second number
$exp	one numeric value	natural exponent for the given value
$floor	one numeric value	greatest integer less than or equal to a given number
$ln	one non-negative numeric value	natural logarithm for the given value
$log	two numeric values	logarithm of a first using second as a base
$log10	one non-negative numeric value	logarithm of the given value using value 10 as a base
$mod	two numeric values	remainder of a division of a first by second number
$multiply	any number of numeric values	multiplication of a given numeric values
$pow	two numeric values	exponentiation using first as a base and second as a exponent
$sqrt	one non-negative numeric value	square root of the given numeric value
$subtract	two numeric values/dates or date and a numeric value	subtraction of second from the first. When both values are numbers it returns numeric difference. When both values are dates it returns difference in miliseconds. When first is date and second is a number it returns date using second to substract first in miliseconds.
$trunc	one numeric value	truncation of the given numeric value to an integer value

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "price": 100.0, "vat": 1.17, "discount": 0.8 },
  { "id": 2, "price": 120.0, "vat": 1.17, "discount": 0.95 }
]

Calculating total price value multiplying base price by VAT and discount like:

{
  "project": {
    "total": { "$multiply": [ "$price", "$vat", "$discount" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "total": 93.6 },
  { "id": 2, "total": 133.38 }
]

Array operators¶

Array operators perform transformation operations based on one or many arrays.

Array operators:

Operator	Accepts	Returns
$arrayElemAt	one array and one integer	item from the given array using second argument as an item index in the array. When index is positive counting is performed from the start of an array and when negative from the end of an array
$concatArrays	any number of arrays	one new array containing items from all given arrays
$in	one value and one array	boolean value indicating whether first argument value is contained inside the given array
$range	two or three integers	sequence of integer values using first argument as initial value up to but not including second argument value where third argument is used as incrementing value which defaults to 1 when not specified
$reverseArray	one array	new array with the given array items in the reversed order
$size	one array	total item count
$slice	one array and one or two integers	subset of the given array taking second argument as starting item index (defaults to 0) and third as number of items to take. When second argument is non-negative counting goes from start of the array and when negative from end of the array.

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "accepted": [ "Alice", "John" ], "rejected": [ "Ann" ] },
  { "id": 2, "accepted": [ "Bob", "Mark" ], "rejected": [] }
]

Merging accepted and rejected fields into one field responded can be done using $concatArrays operator like:

{
  "project": {
    "responded": { "$concatArrays": [ "$accepted", "$rejected" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "responded": [ "Alice", "John", "Ann" ] },
  { "id": 2, "responded": [ "Bob", "Mark" ] }
]

Boolean operators¶

Boolean operators combine one or many logical expressions.

Boolean operators:

Operator	Accepts	Returns
$and	any number of expressions	true if all arguments are truthful, otherwise false
$not	one expression	true if argument is not truthful, otherwise false
$or	any number of expressions	true if any argument is truthful, otherwise false

Example:

A previous stage returns following 5 documents:

[
  { "id": 1, "person": "Alice", "responded": [ "Alice", "John" ] },
  { "id": 1, "person": "John", "responded": [ "Alice", "John" ] },
  { "id": 1, "person": "Ann", "responded": [ "Alice", "John" ] },
  { "id": 2, "person": "Bob", "responded": [ "Mark" ] },
  { "id": 2, "person": "Mark", "responded": [ "Mark" ] },
]

Previous stage unwraped invited people into a person field while keeping people who responded in a responded field. To keep a person field and introduce waitingForResponse field, following project stage can be used:

{
  "project": {
    "person": "$person",
    "waitingForResponse": { "$not": { "$in": [ "$person", "$responded" ] } }
  }
}

Stage will produce following 5 documents:

[
  { "id": 1, "person": "Alice", "waitingForResponse": false },
  { "id": 1, "person": "John", "waitingForResponse": false },
  { "id": 1, "person": "Ann", "waitingForResponse": true },
  { "id": 2, "person": "Bob", "waitingForResponse": true },
  { "id": 2, "person": "Mark", "waitingForResponse": false },
]

Comparison operators¶

Comparison operators are mainly used in filter stage to produce logical expressions. However, in project stage can be also used to store expression value in a field.

Filter stage expects document field to be specified against comparison operator using syntax: { <FIELD_NAME>: { <OPERATOR>: <EXPRESSION> }.

Comparison operators available in filter stage:

Operator	Accepts	Returns
$eq	one expression	true when field value is equivalent to expression result, otherwise false
$gt	one expression	true when field value is greater than expression result, otherwise false
$gte	one expression	true when field value is greater than or equivalent to expression result, otherwise false
$in	any number of values	true if field values is equivalent to any of the given value, otherwise false
$lt	one expression	true when field value is less than expression result, otherwise false
$lte	one expression	true when field value is less than or equivalent to expression result, otherwise false
$ne	one expression	false when field value is equivalent to expression result, otherwise true
$nin	any number of values	false if field values is equivalent to any of the given value, otherwise true

Example:

A previous stage returns following 3 documents:

[
  { "id": 1, "rejectedCount": 0 },
  { "id": 2, "rejectedCount": 0 },
  { "id": 3, "rejectedCount": 1 }
]

Take documents where rejectedCount is 1 or greater, like:

{
  "filter": {
    "rejectedCount": { "$gte": 1 }
  }
}

Filter stage produces following document:

[
  { "id": 3, "rejectedCount": 1 }
]

Comparison operators available in project stage:

Operator	Accepts	Returns
$cmp	two expressions	1 when first expression result is greater than second, -1 when first expression result is less than second, otherwise 0 (they are equivalent)
$eq	two expressions	true when expression results are equivalent, otherwise false
$gt	two expressions	true when first expression result is greater than second, otherwise false
$gte	two expressions	true when first expression result is greater than or equivalent to second, otherwise false
$lt	two expressions	true when first expression result is less than second, otherwise false
$lte	two expressions	true when first expression result is less than or equivalent to second, otherwise false
$ne	two expressions	false when expression results are equivalent, otherwise true

Example:

A previous stage returns following 3 documents:

[
  { "id": 1, "rejectedCount": 0 },
  { "id": 2, "rejectedCount": 0 },
  { "id": 3, "rejectedCount": 1 }
]

Introduce field hasRejected containing boolean value indicating whether field rejectedCount is 1 or greater, like:

{
  "project": {
    "hasRejected": { "$gte": [ "$rejectedCount", 1 ] }
  }
}

Project stage produces following 3 documents:

[
  { "id": 1, "hasRejected": false },
  { "id": 2, "hasRejected": false },
  { "id": 3, "hasRejected": true }
]

Conditional operators¶

Conditional operators execute accepted expression when certain condition is satisfied.

Conditional operators:

Operator	Accepts	Returns
$cond	three expressions	first expression should resolve to a boolean value and if true, second expression is evaluated and returned, otherwise third expression is evaluated and returned
$ifNull	two expressions	first expression result when result is not null, otherwise returns second expression result

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "accepted": [ "Alice", "John" ], "rejected": [ "Ann" ] },
  { "id": 2, "accepted": [ "Bob", "Mark" ], "rejected": [] }
]

Adding field message with value Has rejected when rejected array contains one or more items, otherwise use value None rejected, like:

{
  "project": {
    "message": {
      "$cond": [
        {
          "$gte": [ { "$size": "$rejected" }, 1 ]
        },
        "Has rejected",
        "None rejected"
      ]
    }
  }
}

will produce following 2 documents:

[
  { "id": 1, "message": "Has rejected" },
  { "id": 2, "message": "None rejected" }
]

Literal operators¶

Literal operators supress evaluation of containing expressions.

Literal operators:

Operator	Accepts	Returns
$literal	one expression	expression as a value without evaluating it

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "accepted": [ "Alice", "John" ], "rejected": [ "Ann" ] },
  { "id": 2, "accepted": [ "Bob", "Mark" ], "rejected": [] }
]

Field path expression in acceptedEvaluated and acceptedLiteral where latter is wrapped with literal operator, like:

{
  "project": {
    "acceptedEvaluated": "$accepted",
    "acceptedLiteral": { "$literal": "$accepted" }
  }
}

will produce following 2 documents:

[
  { "id": 1, "acceptedEvaluated": [ "Alice", "John" ], "acceptedLiteral": "$accepted" },
  { "id": 2, "acceptedEvaluated": [ "Bob", "Mark" ], "acceptedLiteral": "$accepted" }
]

Object operators¶

Object operators perform transformation operations based on one or many objects.

Object operators:

Operator	Accepts	Returns
$mergeObjects	any number of objects	new object containing all keys/values from the given objects. Merging is done from left to right, meaning when two or more objects have the same key, values of an object which is last in argument list will be taken.

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "f1": { "a": 1 }, "f2": { "b": 2 } },
  { "id": 2, "f1": { "a": 2 }, "f2": { "b": 3 } }
]

Merge objects under fields f1 and f2 as new field r, like:

{
  "project": {
    "r": { "$mergeObjects": [ "$f1", "$f2" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "r": { "a": 1, "b": 2 } },
  { "id": 2, "r": { "a": 2, "b": 3 } }
]

Set operators¶

Set operations perform transformation operations based on one or many sets, i.e. arrays with distinct items.

Set operators:

Operator	Accepts	Returns
$allElementsTrue	one array	true when all values in the array are truthful, otherwise false
$anyElementTrue	one array	true when some value in the array is truthful, otherwise false
$setDifference	two arrays	new array with items from the first array which are not contained in the second
$setEquals	any number of arrays	true when arrays have same distinct items, otherwise false
$setIntersection	any number of arrays	new array with items from the first array which are contained in all other arrays
$setIsSubset	two arrays	true when first array items are contained in the second, otherwise false
$setUnion	any number of arrays	new array containing distinct items from all arrays

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "a": [ 1, 2, 3 ], "b": [ 4, 5 ] },
  { "id": 2, "a": [ 1, 2, 3 ], "b": [ 2, 3, 4 ] }
]

Creating a field all containing distinct items from fields a and b, like:

{
  "project": {
    "all": { "$setUnion": [ "$a", "$b" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "all": [ 1, 2, 3, 4, 5 ] },
  { "id": 2, "all": [ 1, 2, 3, 4 ] }
]

String operators¶

String operators perform transformation operations based on one or many values of type string.

String operators:

Operator	Accepts	Returns
$concat	any number of strings	new string by joining all given strings. If any argument resolves to null, operation returns null.

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "firstName": "John", "lastName": "Smith" },
  { "id": 2, "firstName": "John", "lastName": "Doe" }
]

Concatenating fields firstName and lastName to form field fullName, like:

{
  "project": {
    "fullName": { "$concat": [ "$firstName", " ", "$lastName" ] }
  }
}

will produce following 2 documents:

[
  { "id": 1, "fullName": "John Smith" },
  { "id": 2, "fullName": "John Doe" }
]

Type operators¶

Type operators are able to accept any data type.

Operator	Accepts	Returns
$toObjectId	one expression	creates object ID of the given value. If value is null, operation returns null.
$toString	one expression	string representation of the given value. If value is null, operation returns null.

Example:

A previous stage returns following 2 documents:

[
  { "id": 1, "processed": 5, "total": 20 },
  { "id": 2, "processed": 3, "total": 10 }
]

String operator $concat accepts only string values. In order to create string value based on integer fields, operator $toString can be used:

{
  "project": {
    "message": {
      "$concat": [
        "Processed ",
        { "$toString": "$processed" },
        " out of ",
        { "$toString": "$total" } ]
    }
  }
}

Project stage will produce following 2 documents:

[
  { "id": 1, "message": "Processed 5 out of 20" },
  { "id": 2, "message": "Processed 3 out of 10" }
]