In part 1, we learned how hardware affects MongoDB performance. We also saw what indexes are and how are they stored on disk.
In this part, we will dive deep into types of indexes and their usage.
First in the list are:
Single field Indexes
These are simplest indexes in MongoDB.
Syntax: db.<collection>.createIndex({<field>: <direction>})
<collection> will be replaced with actual collection name in above query.
Similarly, <field> will be replaced by actual field name
Lets say we have a people collection in some database which contains following fields
first_name
last_name
ssn
job
address
email
...
Run the following query and examine the results:
db.people.find({ssn: "720-38-5636"}).explain("executionStats")
Results:
{
"queryPlanner" : {
...
"winningPlan" : {
"stage" : "COLLSCAN",
"filter" : {
"ssn" : {
"$eq" : "720-38-5636"
}
},
"direction" : "forward"
},
"rejectedPlans" : []
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 698,
"totalKeysExamined" : 0,
"totalDocsExamined" : 50474,
...
},
...
}
See the value of winningPlan
, the stage is COLLSCAN
, this means the data is fetched via a column scan which means that MongoDB had to go over each row and check for the specified column.
Also see, executionTimeMillis
is the time the query took and totalDocsExamined
is the number of documents the MongoDB server had to examine/check to find the result.
Now, lets see how adding an index affects our query.
Let's create an index on ssn
field by running:
db.people.createIndex({ssn: 1})
Now, again run the previous query and examine the output.
{
"queryPlanner" : {
...
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
...
"direction" : "forward",
"indexBounds" : {
"ssn" : [
"[\"720-38-5636\", \"720-38-5636\"]"
]
}
}
},
"rejectedPlans" : []
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 5,
"totalKeysExamined" : 1,
"totalDocsExamined" : 1,
...
}
},
...
}
Again, See the value of winningPlan
, the stage is IXSCAN
which means that data is fetched by scanning index.
Also see the value of executionTimeMillis
, it says 5 milliseconds, which is way much smaller than 698.
totalDocsExamined
is 1 which means MongoDB had to go through only 1 document to fetch the results.
See, how performant are indexes. We just saved a lot of time for getting the same result.
MongoDB also allows to create index and query on a sub-document and fields of a sub-document. But we should never create indexes on entire sub-document because then we have to query on the entire sub-document. Instead we should index a field of a sub-document.
Indexes also work on ranges,
$in
queries and if we query on two fields where only one field is indexed.
Run:
db.people.find({ssn: {$in:["001-29-9184", "177-45-0950", "265-67-9973"]}, last_name: {$gte: "H"}}).explain("executionStats")
and examine the results
Sorting
In memory - Data is copied from disk to RAM and sort is applied, hence slow and expensive. 32 MB is limit and when its reached, sorting is aborted.
Using indexes - Since fields are already ordered by the field specified during index creation, there is no need to sort. Hence its much faster and efficient. E.g. if we have an ascending index on
ssn
, the records will be automatically sorted.Try the following:
db.people.find({}).sort({ssn: 1}).explain()
db.people.find({}).sort({first_name: 1}).explain()
And examine the results
When sorting by an indexed field, asc and dsc sort will have the same effect.
We just learned about Simple indexes. Now lets learn about more powerful index which is
Compound Index
An index on two or more fields is called compound index. For example, index created on both last_name
and first_name
will be called compound index.
db.people.createIndex({last_name: 1, first_name: 1})
Compound indexes are also stored in one dimension.
Analogy for above example can be seen in phone books
Now before we move forward, here is a task for you.
Create a
comments
collection.Insert some data(can be copied from here)
Run a find query
db.comments.find({name: "alias odio sit", email: "Lew@alysha.tv"}).explain("executionStats")
Examine the results
create a simple index on
name
, run the same query and check resultscreate a compound index on
name
andemail
, run the same query and check results.
Okay, so, this is too much to digest in one go. Take your time to understand and run the queries.
There is still so much to learn. We'll continue with other types of indexes in upcoming part.
Stay tuned and happy learning... : ).