MongoDB Indexes

Ward Price
Feb 1, 2021

I’m currently in the process of taking what I learned from the Flatiron School’s Software Engineering program and building upon it. Right now, I’m focusing on learning how to make a backend API using Node.js and also giving NoSQL databases a go. Enter MongoDB.

The difference between using MongoDB and Postgres (what I used in past projects) is pretty clear. A MongoDB database just looks like JSON. When working directly in the shell, all queries are written in JavaScript too. It all feels quite comfortable. One feature I found interesting was the ability to create indexes.

An index is its own data structure: an ordered list of the values of a field, or of a combination of fields, where each entry points back to the document it came from. Because an index is sorted, looking up a particular value or a range of values is far more efficient than traversing the whole collection. Every collection has at least one index. That’s because MongoDB creates an index on the _id field by default for every new collection. With this default index, MongoDB makes sure that an _id is not used more than once, and allows for quick read times when we query for a document by its _id.
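To make the idea of a sorted index concrete, here is a minimal sketch in plain JavaScript. It is not how MongoDB implements indexes internally (the MongoDB docs describe a B-tree structure); a sorted array with a binary search simply stands in for it. The sample documents and the function names buildAgeIndex, firstIndexGreaterThan and findOlderThan are made up for illustration.

```javascript
// Toy "collection": an unsorted array of documents (made-up sample data).
const people = [
  { _id: 1, name: "Ana", age: 42 },
  { _id: 2, name: "Ben", age: 25 },
  { _id: 3, name: "Cy", age: 31 },
  { _id: 4, name: "Dee", age: 30 },
  { _id: 5, name: "Eli", age: 67 },
];

// Build a toy index: one { age, _id } entry per document, sorted by age.
function buildAgeIndex(docs) {
  return docs
    .map((doc) => ({ age: doc.age, _id: doc._id }))
    .sort((a, b) => a.age - b.age);
}

// Binary search for the position of the first entry with age > minAge.
function firstIndexGreaterThan(index, minAge) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].age <= minAge) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Because the index is sorted, every entry from that position on matches.
function findOlderThan(index, minAge) {
  return index.slice(firstIndexGreaterThan(index, minAge)).map((e) => e._id);
}

const ageIndex = buildAgeIndex(people);
console.log(findOlderThan(ageIndex, 30)); // [ 3, 1, 5 ]
```

The search only has to find where the matching range starts. Everything after that point is a match, so none of the non-matching entries need to be looked at one by one.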

Creating a simple index is very easy. I’m going to write examples as if we were working directly in the shell. If you haven’t installed the shell on your computer yet, check out the installation documentation MongoDB provides.

First, fire up MongoDB in your terminal by entering the command mongo. Let’s create a new database by calling the use command, along with the name of the database we will be working in. I’ll name this database lookup.

> use lookup

The MongoDB shell will return switched to db lookup, confirming that we are now working inside of it. (MongoDB doesn’t actually create the database until we first store some data in it.) Inside each database, you create collections, much like the tables in a SQL database. But unlike in a SQL database, we relate information either by embedding or referencing documents from our collections. You can read more about embedding and referencing data in Pat Whitrock’s excellent blog post MongoDB Relations.

Let’s now create the single collection we’ll use for this example. In your terminal type:

> db.createCollection("people")

Next, let’s add some data. To really see the performance change, we need a lot of documents to go through. I’ve created a script that will do this for us. It is written using Deno, which I highly recommend you check out. I’m also using two dependencies: the MongoClient for Deno and deno_faker.

To run the script, cd to the directory where you’ve placed it, and run the following command. Make sure to use the --allow-net flag to give Deno access to the network. Note: this is going to take some time to complete.

$ deno run --allow-net peopleSeed.ts

The script will create 1 million people documents inside the people collection. Every document in the people collection will have a name field and an age field. You can double check the count by using the handy count() method.
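If you don’t have the seed script at hand, the shape of the data is easy to reproduce. The following is a rough sketch in plain JavaScript, not the Deno script described above: the helper names (randomInt, makePerson, personBatches), the short name list, and the 1 to 100 age range are all assumptions made for illustration. It only builds the documents in batches; it does not connect to a database.

```javascript
// Hypothetical stand-in for the seed data: documents with the same two
// fields (name and age), produced in batches without touching a database.
const FIRST_NAMES = ["Ada", "Grace", "Linus", "Alan", "Margaret"];

// Random integer between min and max, inclusive on both ends.
function randomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}

function makePerson() {
  return {
    name: FIRST_NAMES[randomInt(0, FIRST_NAMES.length - 1)],
    age: randomInt(1, 100), // assumed range; the original script may differ
  };
}

// Yield arrays of up to `batchSize` documents until `total` have been made.
function* personBatches(total, batchSize) {
  for (let made = 0; made < total; made += batchSize) {
    const size = Math.min(batchSize, total - made);
    yield Array.from({ length: size }, makePerson);
  }
}

// Small demo run: 2500 documents in batches of 1000.
let count = 0;
for (const batch of personBatches(2500, 1000)) {
  count += batch.length;
}
console.log(count); // 2500
```

Each batch is a plain array of documents, so it could be handed to something like db.people.insertMany(batch). Batching keeps memory use flat when the total climbs to a million.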

> db.people.count()

Now, let’s do a quick query to find out how many of the documents have an age greater than 30.

> db.people.find({age: {$gt: 30}}).count()

If we run the same command again, but remove the count() method and insert the .explain() method with the parameter of 'executionStats', we can see how MongoDB found the result along with the time it took.

> db.people.explain("executionStats").find({age: {$gt: 30}})
Output without an index

We see that it used the COLLSCAN approach to find the values. This is seen in the “winningPlan” field. What this means is that it looked into every single document one by one to check whether the age was greater than 30. This is also confirmed by looking at the "totalDocsExamined" field which has the value 1000000.
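A toy model shows why the examined count equals the size of the collection. This is only a sketch in plain JavaScript with made-up sample data, and collectionScan is a hypothetical helper, not a MongoDB function. It borrows the totalDocsExamined and nReturned names from the explain output to keep the comparison easy to follow.

```javascript
// Toy model of a collection scan: every document is examined, whether or
// not it ends up matching the filter.
function collectionScan(docs, predicate) {
  let totalDocsExamined = 0;
  const results = [];
  for (const doc of docs) {
    totalDocsExamined += 1; // counted before we know if it matches
    if (predicate(doc)) results.push(doc);
  }
  return { totalDocsExamined, nReturned: results.length, results };
}

// Made-up sample data.
const sample = [
  { name: "Ana", age: 42 },
  { name: "Ben", age: 25 },
  { name: "Cy", age: 31 },
  { name: "Dee", age: 30 },
  { name: "Eli", age: 67 },
];

const stats = collectionScan(sample, (doc) => doc.age > 30);
console.log(stats.totalDocsExamined, stats.nReturned); // 5 3
```

Without any ordering to rely on, the only way to be sure no match was missed is to look at every document, so the work grows with the size of the collection rather than with the size of the result.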

So, let’s have a look at how indexes will make the query more efficient. First, let’s create an index.

> db.people.createIndex({age: 1})

This creates an index on the age field and the value of 1 sets it to ascending order. If we wanted descending order, we would put a -1.

Once again, we’ll look at the .explain("executionStats") output.

> db.people.explain("executionStats").find({age: {$gt: 30}})
output with an index on age

We can now see that some things have changed. In the “winningPlan” field, the top-level stage is now FETCH, and its input stage is IXSCAN (which stands for index scan). MongoDB used our index’s data structure to find all the documents with ages greater than 30, and then fetched only those documents. We can confirm this by comparing how many documents were examined, “totalDocsExamined” : 617408, with how many documents were returned, “nReturned” : 617408. It’s the exact same amount!
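If you want to pull these numbers out programmatically, a small helper can walk the explain result. The sketch below is plain JavaScript run against a hand-trimmed stand-in object, not real shell output: only the fields discussed above are kept, the two counts are the ones quoted in this post, and planStages and examinedPerReturned are hypothetical helper names. The exact shape of explain output varies between server versions, so treat the field paths as an assumption to check against your own output.

```javascript
// Hand-trimmed stand-in for an explain("executionStats") result. Only the
// fields discussed in the post are kept; a real result has many more.
const explainWithIndex = {
  queryPlanner: {
    winningPlan: {
      stage: "FETCH",
      inputStage: { stage: "IXSCAN", indexName: "age_1" },
    },
  },
  executionStats: { nReturned: 617408, totalDocsExamined: 617408 },
};

// Walk the winning plan from the outermost stage down through inputStage.
function planStages(explain) {
  const stages = [];
  let node = explain.queryPlanner.winningPlan;
  while (node) {
    stages.push(node.stage);
    node = node.inputStage;
  }
  return stages;
}

// Documents examined per document returned; 1 means no wasted reads.
function examinedPerReturned(explain) {
  const { nReturned, totalDocsExamined } = explain.executionStats;
  return nReturned === 0 ? null : totalDocsExamined / nReturned;
}

console.log(planStages(explainWithIndex).join(" -> ")); // FETCH -> IXSCAN
console.log(examinedPerReturned(explainWithIndex)); // 1
```

A ratio of 1 is the ideal case: every document MongoDB read was one it needed to return. In the collection scan earlier, the same ratio would be 1000000 divided by 617408, or roughly 1.6.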
