Let's Refresh: MongoDB

In an earlier post on MongoDB, we used the Mongo Shell to build Aggregation pipeline. In this post, we will take a look at the new Aggregation Pipeline Builder feature in MongoDB Compass. Folks who have used Mongo Shell to build the Aggregation Pipeline will know that this task can be a tad tedious because of the syntax involved . In particular, when the pipeline is fairly lengthy, the developer may find it difficult to build and debug the pipeline. The Aggregation Pipeline Builder feature has been added in MongoDB Compass to help reduce the complexity in building the pipeline. For all the work in this post, we will be using the latest versions of MongoDB and MongoDB Compass: MongoDB 4.0 and MongoDB Compass 1.14.5

We will be using employees data that we have used in earlier posts. A few records of the data we will be working with are shown below:

Note that there is a new tab next to Documents called Aggregations. Click on that to navigate to the Aggregation Pipeline Builder that is shown below:

Let us quickly see the different sections and features of the Aggregation Pipeline Builder before we build a pipeline

The first section is the preview section. This sections shows a preview of 20 documents in the collection. Here it is showing the first 20 documents of the collection. You can see only 3 documents in the frame. To see the rest of the documents, one can use the scroll bar to scroll right. If the collection has less than 20 documents then all the documents are shown. The second section is the section where we can add stages. The stages can be either be added by selecting from the dropdown or typing in the stage name and then selecting the stage from the filtered selection shown in the dropdown based on what we typed. Once the stage is added and necessary details filled in, we can see a preview of the documents in the third section. If we wish to add a new stage, we can click on the ADD STAGE button below the second section. This will be bring up a new Stage Addition Section with a corresponding preview section on the right. Next, let us understand how we can enable certain features shown below:

If the COMMENT MODE is enabled, then, when a stage is added in the Stage Addition Section, comments with information about that stage are shown to help the developer complete that section. If SAMPLE MODE is enabled, then, input documents are limited to 100000 before certain stages as $group, $bucket, and $bucketAuto. If the AUTO PREVIEW is enabled, then, we can see a preview of resulting documents as a stage is added (third section in earlier schematic). We can now build a simple pipeline to understand the pipeline build process.

Let us add a match stage that will select only those employees who belong to DEPARTMENT_ID less than 40. Click on Select and enter m in the field in the Stage Addition Section. This brings up three options. Select $match from them:

Once this is selected, comments and syntax corresponding to $match are added below:

In the place of <<query>>, enter D. Two options are shown:

Select DEPARTMENT_ID and enter :{ $lt: 40 }

Note that as you enter $, you are prompted with options. As soon as this whole expression, :{ $lt: 40 }, is entered, you will see a preview of documents on the right that match this criteria in this stage:

Let us add another another stage that calculates the sum of salaries grouped by JOB_ID. Click on ADD STAGE button to add next stage:

Add $group as next stage:

For the four fields, enter below values:

<<expression>> = "$JOB_ID"
<<field1>> = SUM_SALARY
<<accumulator1>> = $sum
<<expression1>> = "$SALARY"

Note that except for <<field1>>, for all other placeholders, options are shown as you start entering values. After the last entry, you will see results on the right:

The results compare well with the first query on this post.

One can copy the pipeline to clipboard for any further analysis as shown below:

You can also save the pipeline for future reference by entering a name in the field that says "Enter a pipeline name..." and clicking on SAVE PIPELINE button as shown below:

Click on Toggle Saved Pipelines to see saved pipeline as shown below:

We can clone the existing pipeline as shown below:

We can now save this as another pipeline and continue our work while the original pipeline is preserved. After clicking on SAVE PIPELINE button, disable the $match stage by sliding the green enable button to left as shown below:

Note that the results of the next stage are altered. This is because now the documents are not filtered by the match stage. We can now again enable it by sliding the enable switch to the right. Knocking out a stage and enabling it back is as simple as that. If one wants to delete the stage from the pipeline, one can click on the delete button in that section. But, we cannot restore this stage back and we have to build it back to restore it.

Since the $match stage is not operational, we can drag an entire stage like the $group stage to be the first stage as shown below: