Using Aggregate in MongoDB

A journey from workable to scalable code.

Introduction

find() selects documents in a collection or view and returns a cursor to the selected documents. It performs a query to find the documents, and iterating the cursor usually gives an array of all the results.

Aggregation operations process data records and return computed results. Aggregation operations group values from multiple documents together, and can perform a variety of operations on the grouped data to return a single result.

The two are not the same. find() is just a method that returns a cursor to the matching documents, so it offers no pipeline features such as staged transformation of data, whereas aggregate() runs an aggregation pipeline, a feature introduced in MongoDB 2.2.

Cool, but where should we use aggregate?

We can use aggregate almost anywhere, but whether we should depends on the speed and the output we need compared with find().

Let’s take a scenario:

We want an array of all the events in the Event collection, each with an extra field named like whose Boolean value says whether the requesting user has liked that event.

Each event has an array named liked, which contains the userid of every user who has liked the event.

Schema of Event is:

Here the liked array holds the userid (a unique id generated for each user) of every user who has liked the event. The incoming request contains the userid of the user making the request.
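
For illustration, a single Event document might look like the sketch below. The field names are the ones used throughout this post; every value, and the string format of the ids, is a made-up assumption.

```javascript
// Hypothetical Event document; all values are invented for illustration.
const sampleEvent = {
  _id: "event-1", // MongoDB would normally generate an ObjectId here
  Event_Name: "Tech Meetup",
  Event_Date: new Date("2020-01-15"),
  Event_Details: "An evening of talks.",
  No_of_likes: 3,
  liked: ["user-1", "user-2", "user-3"], // userids of users who liked the event
};

// Has the (hypothetical) user "user-2" liked this event?
console.log(sampleEvent.liked.includes("user-2")); // true
```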

Now, let’s use aggregate:

Along with aggregate we will use the stages $unwind, $project, and $group, together with the expression operators $cond and $eq. These will help us shape the result and get the desired output.

In the first stage we use $unwind, which deconstructs an array field from the input documents and outputs one document per element. Each output document is the input document with the array field replaced by that element. For example, if we have 2 documents, each with a liked array of 20 elements, {$unwind: '$liked'} creates one document for each element, giving 2 documents * 20 elements = 40 documents in total.

We now have the unwound documents, and our next stage is $project. Okay, what does $project do?

$project passes the documents along to the next stage in the pipeline with only the requested fields. These can be existing fields from the input documents or newly computed fields. In our case we use it both to keep the fields we need and to compute the value of the like field from a condition.

Here $project keeps the necessary fields, namely _id, Event_Name, Event_Date, No_of_likes, and Event_Details, and adds the conditional field like.
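
The arithmetic above can be checked with a plain JavaScript emulation of the stage. This is only a sketch of the behaviour for array fields, not how MongoDB implements $unwind; the function name unwindLiked and the sample data are assumptions.

```javascript
// Plain JavaScript emulation of {$unwind: '$liked'} for array fields.
// MongoDB runs the real stage inside the database.
function unwindLiked(docs) {
  // One output document per array element, with `liked` replaced by that element.
  return docs.flatMap((doc) =>
    (doc.liked || []).map((userid) => ({ ...doc, liked: userid }))
  );
}

// Two hypothetical events with 20 likes each.
const makeLikes = (prefix) =>
  Array.from({ length: 20 }, (_, i) => `${prefix}-user-${i}`);
const events = [
  { _id: 1, Event_Name: "A", liked: makeLikes("a") },
  { _id: 2, Event_Name: "B", liked: makeLikes("b") },
];

const unwound = unwindLiked(events);
console.log(unwound.length); // 40
console.log(unwound[0].liked); // a-user-0
```

By default, $unwind emits nothing for a document whose array is empty or missing, so an event with no likes drops out of the pipeline unless the preserveNullAndEmptyArrays option is set.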

$cond is an operator rather than a stage: it evaluates a Boolean expression, here one built with the $eq comparison operator, and returns one of two values. Because $unwind has left a single userid in liked, the if clause checks whether that userid equals the userid from the request. If the condition is true we assign true to the like field, otherwise false.
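
A minimal sketch of such a projection stage follows. The helper name buildLikeProjection is hypothetical, and the sketch assumes the userid from the request has the same type as the values stored in liked.

```javascript
// Hypothetical helper: builds the $project stage for the user making the request.
// Assumes `userid` has the same type as the values stored in `liked`.
function buildLikeProjection(userid) {
  return {
    $project: {
      _id: 1,
      Event_Name: 1,
      Event_Date: 1,
      No_of_likes: 1,
      Event_Details: 1,
      // After $unwind, `$liked` is a single userid, so $eq compares one value.
      like: {
        $cond: { if: { $eq: ["$liked", userid] }, then: true, else: false },
      },
    },
  };
}

console.log(JSON.stringify(buildLikeProjection("user-1"), null, 2));
```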

The next stage of our pipeline is $group. As the name suggests, its main function is to group the input documents by the specified _id expression and to output one document for each distinct group. In our case we also use accumulator operators, which compute one value per group; examples are $first, $avg, $last, $max, and $min.

Since we unwound the documents, we need to group them back into one document per event, so we use $first and $max. $first returns the value from the first document in each group, and $max returns the highest value in each group. For like, that is true if any unwound document of the event matched the user and false otherwise, because false orders before true.

Now we have reached the last stage, where we use $project again to shape the final result. This is necessary because we created a new field named like and assigned a value to it.

In the last projection we refer to the grouped value as $like, so that the like field carries the computed Boolean. A bare true or false in that position would instead be treated as a flag that includes or excludes the field.
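
A sketch of a $group stage along these lines, assuming the field names listed earlier and grouping on the event's own _id:

```javascript
// Regroups the unwound documents into one document per event.
const groupStage = {
  $group: {
    _id: "$_id",
    // These fields are identical across an event's unwound documents,
    // so taking the first value is enough.
    Event_Name: { $first: "$Event_Name" },
    Event_Date: { $first: "$Event_Date" },
    No_of_likes: { $first: "$No_of_likes" },
    Event_Details: { $first: "$Event_Details" },
    // false orders before true, so $max is true if any document matched.
    like: { $max: "$like" },
  },
};

console.log(Object.keys(groupStage.$group));
```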
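
One way such a final projection could be written is sketched below. Whether the original stage used exactly this form is an assumption.

```javascript
// Final shaping stage; _id is included by default.
const finalProjection = {
  $project: {
    Event_Name: 1,
    Event_Date: 1,
    No_of_likes: 1,
    Event_Details: 1,
    like: "$like", // field path: copies the Boolean computed by $group
  },
};

console.log(finalProjection.$project.like); // $like
```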

The complete code looks like this:
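
A minimal sketch of how the four stages described above could fit together. The helper buildLikePipeline and the constant EVENT_FIELDS are hypothetical names, and the sketch assumes the userid has the same type as the values stored in liked.

```javascript
// Hypothetical helper that assembles the four stages for one requesting user.
// Assumes `userid` has the same type as the values stored in `liked`.
const EVENT_FIELDS = ["Event_Name", "Event_Date", "No_of_likes", "Event_Details"];

function buildLikePipeline(userid) {
  // { Event_Name: 1, ... } for the $project stages
  const include = Object.fromEntries(EVENT_FIELDS.map((f) => [f, 1]));
  // { Event_Name: { $first: "$Event_Name" }, ... } for the $group stage
  const firstOf = Object.fromEntries(
    EVENT_FIELDS.map((f) => [f, { $first: `$${f}` }])
  );

  return [
    // 1. One document per (event, userid in liked) pair.
    { $unwind: "$liked" },
    // 2. Keep the event fields and flag the documents that match the user.
    {
      $project: {
        ...include,
        like: {
          $cond: { if: { $eq: ["$liked", userid] }, then: true, else: false },
        },
      },
    },
    // 3. Back to one document per event; like is true if any document matched.
    { $group: { _id: "$_id", ...firstOf, like: { $max: "$like" } } },
    // 4. Final shape of the result.
    { $project: { ...include, like: "$like" } },
  ];
}

const pipeline = buildLikePipeline("user-1");
console.log(pipeline.map((stage) => Object.keys(stage)[0]));
// [ '$unwind', '$project', '$group', '$project' ]
```

With the official Node.js driver, a pipeline like this would typically be run as await db.collection("events").aggregate(pipeline).toArray(), where the collection name "events" is an assumption.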

The final result is an array of documents, each carrying a like value for the requesting user, and it can be rendered on the front end. This gives users a visual indication of the events they have liked.

Looks good, but I can do it with find() and filter functions

Yes, we can, as long as the collection is small, around 1000 documents. Beyond that, filtering in application code needs more CPU, and the server may not be able to handle many requests. With concurrent users, the filter work also makes the server take longer to respond to each request, which makes it harder to process new ones.
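
For comparison, the application-side approach could be sketched as follows. The function name addLikeFlag is hypothetical, and the docs array stands in for whatever a find() call would return.

```javascript
// Application-side alternative: `docs` stands in for the array a find() call returns.
function addLikeFlag(docs, userid) {
  return docs.map(({ liked = [], ...rest }) => ({
    ...rest,
    // Linear scan of every liked array, done in the Node.js process.
    like: liked.includes(userid),
  }));
}

const docs = [
  { _id: 1, Event_Name: "A", liked: ["user-1", "user-2"] },
  { _id: 2, Event_Name: "B", liked: ["user-3"] },
];

console.log(addLikeFlag(docs, "user-1").map((d) => d.like)); // [ true, false ]
```

Here every document, including its full liked array, has to travel to the application before the scan starts, which is the cost the paragraph above describes.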

Achievements?

This code scales: it can process multiple requests and respond in a few milliseconds. We are also letting MongoDB perform the task, which leaves our server free to serve requests.

we did it!

Stay tuned!

The next post will be on consuming this result.

References:

https://docs.mongodb.com/manual/reference/operator/aggregation-pipeline

Full Stack | Ionic | Node.js
