Understanding MongoDB performance - Part 1 Introduction


3 years ago

Topics: Mongodb, Technology, Databases, Performance, Freewriting, ...

Many of you probably use MongoDB to store data. But do you know how well your MongoDB deployment performs?

In this part (part 1), we'll look at the factors that MongoDB performance depends on and the things to consider while using MongoDB.

Hardware considerations

The first thing is hardware (if you want to set up your own machine or buy one). A few things to consider are:

Memory

More memory (RAM) generally means better performance. Memory is used for the following operations:

Aggregations
Index traversing
Write operations
Query engine (to retrieve query results)
Connections (each connection requires roughly 1 MB of memory)
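
To get a feel for the last point, here is a minimal sketch that estimates how much RAM a number of open connections might reserve, assuming the rough figure of 1 MB per connection mentioned above. The function name and the sample connection counts are made up for illustration; the real overhead depends on your MongoDB version and workload.

```python
def connection_memory_mb(connections: int, mb_per_connection: float = 1.0) -> float:
    """Rough estimate of the RAM (in MB) reserved by client connections.

    mb_per_connection defaults to the approximate 1 MB figure from the
    article; it is an assumption, not a measured value.
    """
    if connections < 0:
        raise ValueError("connections must be non-negative")
    return connections * mb_per_connection


if __name__ == "__main__":
    # Hypothetical connection counts, chosen only for illustration.
    for n in (100, 1_000, 10_000):
        print(f"{n} connections: about {connection_memory_mb(n):.0f} MB")
```

Even at this rough level, it shows why a few thousand idle connections can take a noticeable share of a small machine's RAM.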

CPU

Every process on our system needs CPU time. MongoDB tries to use all available CPU cores. More CPU cores mean more concurrent requests can be served, and hence better performance. Operations such as page compression, data calculations, aggregation framework operations, and map-reduce are CPU-intensive.

IO (HDDs and SSDs)

Data is stored on disk for persistence. Disks with higher IOPS (input/output operations per second) can lead to better performance.

Our disk architecture also affects performance. The most commonly used disk architecture is RAID (redundant array of independent disks). RAID level 10 is the most widely used and is recommended for MongoDB.

You can read more about RAID here.

Network

Network bandwidth also plays an important role in MongoDB performance: more bandwidth generally means better performance. Network switches, load balancers, and firewalls also affect performance.

Indexes

The next thing to consider is indexing your data.

Indexes are like the index at the back of a book, used to quickly look up some text. They make our searches/queries fast.

Just as, without an index in a book, we have to go through every page to find the desired content, without indexes MongoDB has to scan every document in the collection to satisfy a query.

Without indexes, the number of documents scanned grows linearly with the number of documents in the collection.

Index data is persisted on disk and cached in memory, and MongoDB performs best when the indexes in use fit in RAM. The value of the indexed field is stored as the key, and a reference to the corresponding document is stored as its value. If a document does not contain the indexed field, it is indexed under the key “null”.
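
The idea can be sketched in a few lines of Python. This is only a toy model under simplifying assumptions: a plain dictionary stands in for the index (MongoDB actually uses B-tree based structures), list positions stand in for document references, and the sample documents and function names are invented for illustration.

```python
from collections import defaultdict

# A toy "collection": list positions play the role of document references.
collection = [
    {"_id": 1, "name": "alice", "city": "Pune"},
    {"_id": 2, "name": "bob", "city": "Delhi"},
    {"_id": 3, "name": "carol"},  # no "city" field
    {"_id": 4, "name": "dave", "city": "Pune"},
]


def collection_scan(docs, field, value):
    """Check every document; return (matches, number of documents examined)."""
    examined = 0
    matches = []
    for doc in docs:
        examined += 1
        if doc.get(field) == value:
            matches.append(doc)
    return matches, examined


def build_index(docs, field):
    """Map each value of `field` to the positions of the documents holding it.

    Documents without the field are stored under the key None (null).
    """
    index = defaultdict(list)
    for position, doc in enumerate(docs):
        index[doc.get(field)].append(position)
    return index


def index_lookup(docs, index, value):
    """Fetch only the referenced documents; return (matches, documents examined)."""
    positions = index.get(value, [])
    return [docs[p] for p in positions], len(positions)


if __name__ == "__main__":
    _, scanned = collection_scan(collection, "city", "Pune")
    city_index = build_index(collection, "city")
    _, fetched = index_lookup(collection, city_index, "Pune")
    print("documents examined without index:", scanned)
    print("documents examined with index:", fetched)
    print("positions stored under null:", city_index[None])
```

Here the scan has to examine all four documents, while the lookup touches only the two documents that actually match.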

The “_id” field is automatically indexed in every collection. Indexes can slow down write operations, because the indexes may need to be updated on every write.

Now, let's see how data is stored on disk.

On disk, data is stored at the path specified as dbPath when starting the mongod server.

Run the MongoDB server with the following configuration and list the files at our dbPath after inserting something into the database.


storage:
  dbPath: /var/mongodb/db/mongo
systemLog:
  path: /var/mongodb/db/mongo/mongo.log
  destination: file
  logAppend: true
net:
  bindIp: 127.0.0.1,192.168.103.100
  port: 27000
security:
  authorization: enabled
processManagement:
  fork: true

List of files: note that all data and index-related files sit together directly under dbPath. This is how files are stored by default.
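
If you want to produce such a listing yourself, a small script like the one below can do it. It is a minimal sketch that simply walks the directory tree; the function name is made up, and the path in the example is the dbPath from the configuration above, so adjust it to your setup and make sure your user can read that directory.

```python
import os


def list_db_files(db_path):
    """Return the paths of all files under db_path, relative to it, sorted."""
    found = []
    for root, _dirs, files in os.walk(db_path):
        for name in files:
            full_path = os.path.join(root, name)
            found.append(os.path.relpath(full_path, db_path))
    return sorted(found)


if __name__ == "__main__":
    # dbPath taken from the configuration shown in this article.
    for path in list_db_files("/var/mongodb/db/mongo"):
        print(path)
```

Because the script prints paths relative to dbPath, it also makes the sub-directories visible when the directory layout is changed.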

Now, let's see how we can change the default structure.

Shut down the MongoDB server, delete and re-create the dbPath folder, start the server again with the following configuration, insert some data, and list the files.


storage:
  dbPath: /var/mongodb/db/mongo
  # each database gets its own separate directory under dbPath
  directoryPerDB: true
systemLog:
  path: /var/mongodb/db/mongo/mongo.log
  destination: file
  logAppend: true
net:
  bindIp: 127.0.0.1,192.168.103.100
  port: 27000
security:
  authorization: enabled
processManagement:
  fork: true

List of files:

As you can see, there is a separate directory for each database.

Now follow the above steps again and run the MongoDB server with the following configuration.


storage:
  dbPath: /var/mongodb/db/mongo
  directoryPerDB: true
  # this means there will be separate directories for collections and indexes
  wiredTiger:
    engineConfig:
      directoryForIndexes: true
systemLog:
  path: /var/mongodb/db/mongo/mongo.log
  destination: file
  logAppend: true
net:
  bindIp: 127.0.0.1,192.168.103.100
  port: 27000
security:
  authorization: enabled
processManagement:
  fork: true

List of files:

You can see that there are separate directories for collections and indexes.

Having separate directories for indexes and collections is beneficial when we have multiple disks: one disk can be used for indexes and another for collections. Symbolic links are used to place one of these directories on a different disk while it is still reached through dbPath. Because reads and writes are spread across the disks, more concurrent requests can be served.
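
As a rough sketch of the symbolic link idea, the helper below moves a directory to another location and leaves a link at the old path. It is only an illustration under stated assumptions: the function name is made up, the paths in the comment are hypothetical, and it assumes the mongod server is stopped while the directory is moved.

```python
import os
import shutil


def move_and_link(index_dir, target_dir):
    """Move index_dir to target_dir and leave a symbolic link at the old path.

    Assumes mongod is NOT running while the directory is being moved.
    """
    if os.path.islink(index_dir):
        raise ValueError(f"{index_dir} is already a symbolic link")
    if os.path.exists(target_dir):
        raise FileExistsError(f"{target_dir} already exists")
    target_dir = os.path.abspath(target_dir)
    shutil.move(index_dir, target_dir)
    # The old path now points at the new location on the other disk.
    os.symlink(target_dir, index_dir, target_is_directory=True)


# Example call with hypothetical paths (a database named "mydb" and a
# second disk mounted at /mnt/disk2):
# move_and_link("/var/mongodb/db/mongo/mydb/index", "/mnt/disk2/mydb_index")
```

After this, the server still finds the index files under dbPath, while the data physically lives on the second disk.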

So, that concludes our part 1. In part 2, we'll learn more about indexes, their types, and how to use them.

Stay tuned.... :)

And Happy Learning...