MongoDB quick thoughts

This time I'm writing about my experiences with MongoDB and some random thoughts on it. Just a quick overview and nothing too profound.

My MongoDB experiences


I've been part of a development team on two projects that used MongoDB as the database. In addition, earlier this year I completed the "MongoDB for Java Developers" online course by 10gen, the company behind MongoDB.

Moving from relational databases to document databases isn't easy. I don't have any real experience with functional programming, but I'd imagine that moving from object-oriented programming to functional programming is a somewhat similar experience. Some of the rules are still the same, but there are a lot of differences.

Flexible schemas


MongoDB has flexible schemas, meaning that the data model in a collection can vary from document to document and can change at any time. I really like this, as it gives the opportunity to store all of the data that is needed for each document, and only that data.

When the schema is flexible it means that there's no need to store null values for anything. In the example below we have two documents in a collection named "contacts".

{ _id: 1, name: "Joe", email: "joe@foo.com", phone: "1234567"}
{ _id: 2, name: "Andy", phone: "7654321" }

We're storing a contact list in the database. Joe has an email address, stored under the key "email", and Andy has only a phone number. If we were storing the same information in a single relational table, we'd have to store a null or an empty string as Andy's email address, because the schema has a column for the email address.

Having a flexible schema doesn't remove the need to think the schema and the data model of the application through carefully. The flexibility comes at a cost: much of the logic that a fixed schema would enforce has to be in the application.
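
As a rough illustration of that cost, here is a minimal sketch in plain Python (no MongoDB driver involved) where the application has to cope with the "email" key being absent. The dictionaries mirror the two documents above; the function name contact_line is made up for this example.

```python
# Plain-Python stand-ins for the two documents in the "contacts" collection.
contacts = [
    {"_id": 1, "name": "Joe", "email": "joe@foo.com", "phone": "1234567"},
    {"_id": 2, "name": "Andy", "phone": "7654321"},
]


def contact_line(doc):
    """Format one contact; the "email" key may be missing entirely."""
    email = doc.get("email")  # None when the document has no "email" key
    if email is None:
        return f"{doc['name']}: {doc['phone']}"
    return f"{doc['name']}: {doc['phone']}, {email}"


lines = [contact_line(doc) for doc in contacts]
for line in lines:
    print(line)
```

The check for the missing key is the kind of logic that moves from the schema into the application code.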

Indexes and searching


Another great thing in MongoDB is indexes. They work pretty much the same way as indexes in relational databases.

Querying is basic functionality in MongoDB, and it can be done against any key-value pair in the documents of a collection. Querying is more efficient when it's done against indexed values.

These two things separate document databases from key-value stores, where indexing is done only on the key and querying can be done only against the key. There are some separate solutions for key-value store indexing and searching, but they're not part of the database itself.
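
To show why indexed queries tend to be cheaper, here is a small plain-Python sketch of the idea. It is not how MongoDB implements indexes internally; the dictionary-based index and the function names are assumptions made only for this illustration.

```python
from collections import defaultdict

# Plain-Python stand-ins for documents in a collection.
contacts = [
    {"_id": 1, "name": "Joe", "email": "joe@foo.com", "phone": "1234567"},
    {"_id": 2, "name": "Andy", "phone": "7654321"},
    {"_id": 3, "name": "Ann", "phone": "5550000"},
]


def find_by_scan(docs, key, value):
    """Without an index: look at every document in the collection."""
    return [doc for doc in docs if doc.get(key) == value]


def build_index(docs, key):
    """Map each value of `key` to the documents that contain it."""
    index = defaultdict(list)
    for doc in docs:
        if key in doc:  # documents without the key are simply not indexed
            index[doc[key]].append(doc)
    return index


def find_by_index(index, value):
    """With an index: a single lookup instead of a full scan."""
    return index.get(value, [])


phone_index = build_index(contacts, "phone")
print(find_by_scan(contacts, "phone", "7654321"))
print(find_by_index(phone_index, "7654321"))
```

Both calls return the same document, but the scan touches every document while the index lookup goes straight to the match. The price is that the index has to be built and kept up to date.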

Aggregating


Aggregating is a more sophisticated way of searching. It gives nice opportunities to modify the result documents before the answer is returned, for example by filtering, grouping, and reshaping them. It's also a nice tool for querying statistical data based on the documents.
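
The idea can be sketched in plain Python as a filter step followed by a grouping step. This only imitates the concept and is not MongoDB's aggregation API; the "city" field and the function name count_by_city are hypothetical and were added just for this example.

```python
from collections import Counter

# Plain-Python stand-ins for documents; the "city" field is hypothetical.
contacts = [
    {"_id": 1, "name": "Joe", "city": "Helsinki", "phone": "1234567"},
    {"_id": 2, "name": "Andy", "city": "Tampere", "phone": "7654321"},
    {"_id": 3, "name": "Ann", "city": "Helsinki", "phone": "5550000"},
    {"_id": 4, "name": "Bob", "city": "Helsinki"},
]


def count_by_city(docs):
    """Keep contacts that have a phone number, then count them per city."""
    matched = (doc for doc in docs if "phone" in doc)  # filter step
    counts = Counter(doc["city"] for doc in matched)   # grouping step
    # Reshape the result into new documents, one per city.
    return [{"_id": city, "count": n} for city, n in sorted(counts.items())]


stats = count_by_city(contacts)
print(stats)
```

The result documents have a different shape than the stored ones, which is the point: the answer is modified before it is returned.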

Replicating and sharding


Replicating means the data is copied to multiple database instances, and sharding means the data is spread across multiple database instances. The two can also be combined, so that the data is sharded and the individual shards are replicated.

Replication is a good way to have the data backed up and to get fault tolerance, and it can be used to spread reads across multiple databases. Replication also gives an opportunity to confirm writes on multiple replicas before the write is considered successful.
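
Confirming a write on multiple replicas can be sketched like this in plain Python. The Replica class, the replicated_write function, and the required_acks parameter are all made up for this illustration; in MongoDB itself this behaviour is configured through the write concern rather than coded by hand.

```python
class Replica:
    """A toy replica that stores data in memory and may be unavailable."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy
        self.data = {}

    def write(self, key, value):
        """Store the value and acknowledge, unless the replica is down."""
        if not self.healthy:
            return False
        self.data[key] = value
        return True


def replicated_write(replicas, key, value, required_acks):
    """Send the write to every replica; succeed only with enough acks."""
    acks = sum(1 for replica in replicas if replica.write(key, value))
    return acks >= required_acks


replicas = [Replica("a"), Replica("b"), Replica("c", healthy=False)]
print(replicated_write(replicas, "contact:1", "Joe", required_acks=2))
print(replicated_write(replicas, "contact:1", "Joe", required_acks=3))
```

With one of three replicas down, a write that needs two acknowledgements succeeds and a write that needs all three does not.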

Sharding is a way to spread reads and writes across multiple databases.
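
One way to picture sharding is a function that decides which shard a document belongs to. The plain-Python sketch below uses a hash of the document's _id for that; this is just one possible strategy chosen for the illustration, and the function name shard_for and the shard count of three are assumptions.

```python
import hashlib

# Plain-Python stand-ins for documents in a collection.
contacts = [
    {"_id": 1, "name": "Joe", "phone": "1234567"},
    {"_id": 2, "name": "Andy", "phone": "7654321"},
    {"_id": 3, "name": "Ann", "phone": "5550000"},
]

SHARD_COUNT = 3  # hypothetical number of shards


def shard_for(shard_key, shard_count):
    """Pick a shard number from a stable hash of the shard key."""
    digest = hashlib.md5(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


# Each shard holds only its own part of the data.
shards = [[] for _ in range(SHARD_COUNT)]
for doc in contacts:
    shards[shard_for(doc["_id"], SHARD_COUNT)].append(doc)

for number, docs in enumerate(shards):
    print(number, [doc["name"] for doc in docs])
```

Because the same key always maps to the same shard, a read for a single document only needs to go to one of the databases.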

Final thoughts


Since I've used and learned more about MongoDB and document databases, I've started to think differently about applications.

In the past a relational database was the only choice of database for me, but now MongoDB is one of the alternatives. More on the other alternatives in a later post. This new knowledge has had me thinking about how some past solutions would probably have worked better if MongoDB had been chosen as the database instead of a relational database.
