Wednesday, May 1, 2013

MongoDB quick thoughts

This time I'm writing about my experiences with MongoDB, along with some random thoughts on it. Just a quick overview, nothing too profound.

My MongoDB experiences

I've been part of the development team on two projects that used MongoDB as their database. In addition, earlier this year I attended and completed the "MongoDB for Java Developers" online course by 10gen, the company behind MongoDB.

Moving from relational databases to document databases isn't easy. I don't have any real experience with functional programming, but I'd imagine that moving from object-oriented programming to functional programming is a somewhat similar experience. Some of the rules are still the same, but there are a lot of differences.

Flexible schemas

MongoDB has flexible schemas, meaning that the data model in a collection can vary from document to document and can change at any time. I really like this, as it makes it possible to store exactly the data each document needs and nothing more.

A flexible schema means there's no need to store null values for fields a document doesn't have. In the example below we have two documents in a collection named "contacts".

{ _id: 1, name: "Joe", email: "joe@example.com", phone: "1234567" }
{ _id: 2, name: "Andy", phone: "7654321" }

We're storing a contact list in the database. Joe has an email address, which is stored under the key "email", while Andy has only a phone number. If we were storing the same information in a relational database, we'd have to store a null or an empty string as Andy's email address, because the schema has a column for email address.

Having a flexible schema still means that the schema and the data model of the application have to be thought through carefully. The flexibility comes at a cost: since the database doesn't enforce a structure, much of that logic has to live in the application.
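Here's a small sketch of what "logic in the application" can look like. It uses plain Python dictionaries shaped like the two contacts above instead of a real MongoDB driver. The helper name contact_summary and the placeholder email value are my own assumptions for illustration.

```python
# Plain dictionaries stand in for documents in the "contacts" collection.
contacts = [
    {"_id": 1, "name": "Joe", "email": "joe@example.com", "phone": "1234567"},
    {"_id": 2, "name": "Andy", "phone": "7654321"},  # no "email" key at all
]


def contact_summary(doc):
    """Build a display line; the application decides how to treat a missing key."""
    # dict.get returns None when the key is absent, so no KeyError is raised.
    email = doc.get("email")
    if email:
        return f"{doc['name']}: {doc['phone']}, {email}"
    return f"{doc['name']}: {doc['phone']} (no email on file)"


for doc in contacts:
    print(contact_summary(doc))
```

The point is that nothing in the database forces every contact to have an email, so every piece of code that reads a contact has to be prepared for the key to be missing.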

Indexes and searching

Another great thing in MongoDB is its indexes. They work in much the same way as indexes in relational databases.

Querying is basic functionality in MongoDB, and it can be done against any key-value pair in a document collection. Queries are generally much more efficient when they run against indexed values.

These two things separate document databases from key-value stores, where indexing is done only on the key and querying can be done only against the key. There are some separate solutions for indexing and searching key-value stores, but they're not part of the database itself.
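To make the difference between an indexed and an unindexed query a bit more concrete, here's a toy sketch in plain Python. It's not how MongoDB implements indexes (real indexes are ordered tree structures, and this is just a dictionary), and the function names find_by_scan and build_index are made up for the example.

```python
from collections import defaultdict

contacts = [
    {"_id": 1, "name": "Joe", "phone": "1234567"},
    {"_id": 2, "name": "Andy", "phone": "7654321"},
    {"_id": 3, "name": "Joe", "phone": "5550000"},
]


def find_by_scan(docs, key, value):
    """Unindexed query: every document has to be inspected."""
    return [d for d in docs if d.get(key) == value]


def build_index(docs, key):
    """Map each value of `key` to the documents holding it (a toy index)."""
    index = defaultdict(list)
    for d in docs:
        if key in d:  # documents without the key are simply not indexed here
            index[d[key]].append(d)
    return index


name_index = build_index(contacts, "name")

# Both approaches return the same documents; the index just avoids the scan.
assert find_by_scan(contacts, "name", "Joe") == name_index["Joe"]
```

The scan gets slower as the collection grows, while the indexed lookup goes straight to the matching documents. The price is that the index has to be built and kept up to date on writes.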


Aggregation is a more sophisticated way of searching. It makes it possible to reshape the result documents before the answer is returned, and it's also a nice tool for querying statistical data based on the documents.
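The idea of aggregation as a series of stages can be sketched in plain Python. This only imitates the concept of filtering first and grouping second (similar in spirit to the $match and $group stages of MongoDB's aggregation framework). The "orders" data and the helper names match and group_sum are hypothetical and not taken from any real project.

```python
from collections import Counter

# Hypothetical documents, invented for this example.
orders = [
    {"customer": "Joe", "status": "paid", "total": 20},
    {"customer": "Andy", "status": "paid", "total": 35},
    {"customer": "Joe", "status": "open", "total": 15},
    {"customer": "Joe", "status": "paid", "total": 5},
]


def match(docs, **criteria):
    """Stage 1: keep only documents whose fields equal the given criteria."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


def group_sum(docs, key, field):
    """Stage 2: sum `field` per distinct value of `key`."""
    totals = Counter()
    for d in docs:
        totals[d[key]] += d[field]
    return dict(totals)


# Feed the output of one stage into the next, like a small pipeline.
paid_per_customer = group_sum(match(orders, status="paid"), "customer", "total")
print(paid_per_customer)
```

The result is no longer a list of the original documents but a new, smaller structure computed from them, which is what makes aggregation handy for statistics.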

Replicating and sharding

Replication means the data is copied to multiple database instances, and sharding means the data is spread across multiple database instances. The two can also be combined, so that the data is sharded and the individual shards are replicated.

Replication is a good way to keep the data backed up and to provide fault tolerance, and it can also be used to spread reads across multiple databases. Replication also gives an opportunity to confirm a write on multiple replicas before the write is considered successful.
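Confirming a write on several replicas can be sketched like this. It's a deliberately simplified toy in plain Python. In a real replica set writes go through a primary and are then replicated, whereas here the code just writes to every member directly. The Replica class, replicated_write and the required_acks parameter are all names I made up for the illustration.

```python
class Replica:
    """Toy stand-in for one member of a replica set."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.data = {}

    def write(self, key, value):
        # An unavailable member cannot acknowledge the write.
        if not self.available:
            return False
        self.data[key] = value
        return True


def replicated_write(replicas, key, value, required_acks):
    """Report success only if enough replicas acknowledged the write."""
    acks = sum(1 for r in replicas if r.write(key, value))
    return acks >= required_acks


# Three members, one of which is down.
replicas = [Replica("a"), Replica("b"), Replica("c", available=False)]

ok_two = replicated_write(replicas, "contact:1", "Joe", required_acks=2)
ok_three = replicated_write(replicas, "contact:1", "Joe", required_acks=3)
```

With one member down, requiring two acknowledgements still succeeds, while requiring all three does not. That's the trade-off: the more replicas have to confirm, the safer the write, but the more sensitive it is to a member being unavailable.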

Sharding is a way to spread reads and writes across multiple databases.
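Here's a generic sketch of how a key can decide which shard a document lives on. It uses simple hash-based routing in plain Python and is not how MongoDB itself distributes data. The shard count, the choice of _id as the key, and the names shard_for and find_by_id are assumptions made for the example.

```python
import hashlib

NUM_SHARDS = 3  # hypothetical cluster size


def shard_for(shard_key):
    """Pick a shard from a stable hash of the shard key."""
    # hashlib gives the same result on every run, unlike Python's built-in
    # hash() for strings, which is randomized per process.
    digest = hashlib.sha256(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


# Each shard is just a list of documents in this toy model.
shards = {i: [] for i in range(NUM_SHARDS)}

documents = [
    {"_id": 1, "name": "Joe"},
    {"_id": 2, "name": "Andy"},
    {"_id": 3, "name": "Ann"},
]

# Writes are routed to the shard that owns the key.
for doc in documents:
    shards[shard_for(doc["_id"])].append(doc)


def find_by_id(doc_id):
    """A read by key only needs to visit the shard that owns that key."""
    return [d for d in shards[shard_for(doc_id)] if d["_id"] == doc_id]
```

Because the same key always maps to the same shard, both the write and a later read by that key touch only one database instance, and that's how the load gets spread out.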

Final thoughts

Since using MongoDB and learning more about it and about document databases in general, I've started to think differently about applications.

In the past, a relational database was the only database choice for me, but now MongoDB is one of the alternatives (more on the other solutions in a later post). This new knowledge has got me thinking about how some past solutions would probably have worked better if MongoDB had been chosen as the database instead of a relational database.