Key-value stores from a Redis point of view

This post was supposed to be about graph databases and key-value stores, but it's going to be only about key-value stores because I got more interested in trying out Redis than Neo4j.

Redis

Redis is a key-value store that keeps its database in memory, but it also saves the database to disk once a predefined amount of time has passed and a minimum number of changes has been made. The default values are:

  • 900 seconds and at least 1 change
  • 300 seconds and at least 10 changes
  • 60 seconds and at least 10 000 changes
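
One way to read those defaults is "save as soon as any one of the pairs is satisfied". Below is a minimal Java sketch of that reading. The class and method names are made up for illustration, and this is not how Redis itself is implemented, only the rule expressed as code.

```java
// Sketch of the snapshot rule: a save is due when ANY (seconds, changes) pair is met.
// SnapshotRuleSketch and snapshotDue are hypothetical names, not part of Redis.
final class SnapshotRuleSketch {

    // Each row is {seconds since last save, minimum number of changes},
    // mirroring the three default pairs listed in the post.
    static final long[][] DEFAULT_RULES = {
        {900, 1},
        {300, 10},
        {60, 10_000},
    };

    // Returns true when at least one rule has both of its thresholds reached.
    static boolean snapshotDue(long secondsSinceLastSave, long changesSinceLastSave) {
        for (long[] rule : DEFAULT_RULES) {
            boolean enoughTime = secondsSinceLastSave >= rule[0];
            boolean enoughChanges = changesSinceLastSave >= rule[1];
            if (enoughTime && enoughChanges) {
                return true;
            }
        }
        return false;
    }
}
```

So a busy database gets saved after a minute, while a database with a single change still gets saved after fifteen minutes.
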
More on Redis can be found at their website http://redis.io/, and if you're interested in giving it a quick try I suggest their online tutorial at http://try.redis.io/.

An important detail about querying

In a key-value store the data can be searched only by the key. There are solutions that enable searching by the data, like Lucene or Solr, but those are separate search engines and not part of the key-value store itself.

It might appear strange or constraining, but it just means that key-value stores don't suit every use case and that the key must be chosen with care.

Keys and values

Keys and values sound simple and are probably familiar to software developers. A key-value store is basically a map, something like this in Java:
Map<String, String> myMap = new HashMap<String, String>();

In the value part one can store simple data like the name of a user "John Doe" or an email address "johndoe@foobar.com", but these small bits of information aren't necessarily enough. Another approach is to store JSON data that could be something like this:
{ "name":"John Doe", "email":"johndoe@foobar.com", "nick":"JD" }

With this kind of data structure it's possible to save all sorts of things, but for the data to be searchable the key has to be something meaningful. If the keys are just a sequence of numbers like [1,2,3,4,5,6...], searching for "John Doe" in a database with thousands of key-value pairs wouldn't be efficient, as the values would have to be fetched and parsed one by one until John is found.
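
To make the cost of meaningless keys concrete, here is a small Java sketch that uses a plain in-memory Map as a stand-in for the store. The names are hypothetical and the substring check is a deliberately naive replacement for real JSON parsing. The point is only that every value has to be inspected.

```java
import java.util.Map;

// Sketch: with numeric keys the only way to find a user is to walk through the values.
// ScanSketch and findKeyByName are hypothetical names used for illustration.
final class ScanSketch {

    // Returns the key of the first value containing the given name, or null if none does.
    // In the worst case this touches every entry in the store.
    static String findKeyByName(Map<String, String> store, String name) {
        String needle = "\"name\":\"" + name + "\"";
        for (Map.Entry<String, String> entry : store.entrySet()) {
            // Naive stand-in for parsing the JSON and comparing the name field.
            if (entry.getValue().contains(needle)) {
                return entry.getKey();
            }
        }
        return null;
    }
}
```
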
Let's pretend that the JSON data above is user data for an online service and users log in with their email address and a password. To choose something unique and searchable I would use the email address, and to make it even more specific I would use a key that looks something like this:
"user:email:johndoe@foobar.com"

Now all we need to know is the email address (which we get at login) and all the user's data can be fetched with it.
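
As a rough sketch of that idea in Java, the key can be built from the email address in one place so that saving and fetching always agree on its shape. The HashMap here is only an in-memory stand-in for Redis, and the class and method names are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: build the key from the login email and fetch the user with a single lookup.
// UserStoreSketch is a hypothetical name; the map stands in for the real key-value store.
final class UserStoreSketch {

    private static final String USER_KEY_PREFIX = "user:email:";

    private final Map<String, String> store = new HashMap<>();

    // Produces keys such as "user:email:johndoe@foobar.com".
    static String userKey(String email) {
        return USER_KEY_PREFIX + email;
    }

    // Stores the user's JSON document under the email-based key.
    void saveUser(String email, String json) {
        store.put(userKey(email), json);
    }

    // Returns the stored JSON document, or null when no such user exists.
    String findUser(String email) {
        return store.get(userKey(email));
    }
}
```

With a real Redis client the two methods would turn into a set and a get on the same key, with no scanning involved.
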


Values as hash maps

This is something I really like about Redis: the value can be a map of values. It sounds a bit bizarre but is actually pretty simple once you get the hang of it.

Let's say I've created a simple blog platform and the blog posts are stored in the following structure. The key consists of the prefix post (meaning this is a blog post), the email of the user and a random UUID, and the value is a JSON document:
"post:johndoe@foobar.com:dsada23132" "{"title":"first post", "date":"20130101","text":"lorem ipsum...."}"

As a new feature the platform gets a commenting option, and I want the comments to be under the same post key so that they can be fetched at the same time as the post, but I don't want to put them in the same JSON data. The new data structure would be something like this:
"post:johndoe@foobar.com:dsada23132" "post" "{"title":"first post", "date":"20130101","text":"lorem ipsum...."}"
"post:johndoe@foobar.com:dsada23132" "comments" "[{"name":"Jane Doe", "date":"20130102","text":"Nice one!"}, {"name":"Jack Doe", "date":"20130102","text":"Boring..."}]"

With the fields post and comments I separated the data from each other but kept it under the same key.
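
In Java terms a hash value is roughly a map inside a map. The sketch below models that with nested in-memory maps; the method names are hypothetical and only loosely mirror the Redis hash commands HSET, HGET and HGETALL.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch: one key holds several named fields, like "post" and "comments".
// HashValueSketch is a hypothetical in-memory stand-in, not a Redis client.
final class HashValueSketch {

    private final Map<String, Map<String, String>> store = new HashMap<>();

    // Sets one field under the key, creating the inner map on first use.
    void setField(String key, String field, String value) {
        store.computeIfAbsent(key, k -> new HashMap<>()).put(field, value);
    }

    // Returns a single field, or null when the key or the field is missing.
    String getField(String key, String field) {
        Map<String, String> fields = store.get(key);
        if (fields == null) {
            return null;
        }
        return fields.get(field);
    }

    // Returns a read-only view of every field under the key (empty when the key is missing).
    Map<String, String> getAllFields(String key) {
        Map<String, String> fields = store.get(key);
        if (fields == null) {
            return Collections.emptyMap();
        }
        return Collections.unmodifiableMap(fields);
    }
}
```

Adding comments later is then just one more setField call on the existing key, and the post field stays untouched.
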


Searching data

Data can be searched only by key, so if we know the key we can fetch with it directly, as in the simpler key-value example with the email address. In the blog example the search could be done with part of the key. If we wanted to get all of John's blog posts we would search with a pattern like this (the Redis KEYS command accepts such patterns):
"post:johndoe@foobar.com:*"

And after that I could get all the data of a specific entry with a get-all command (HGETALL in Redis):
"post:johndoe@foobar.com:dsada23132"


Or if I wanted to get just the post and not the comments, the query would include the field as well (HGET in Redis):
"post:johndoe@foobar.com:dsada23132" "post"

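The three lookups above can be sketched in Java with an in-memory stand-in for the store. The class and method names are hypothetical, and the pattern matching handles only a single trailing "*", which is much less than the glob patterns Redis supports. The post id "zzz" and the field values in the usage example are invented placeholders.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Sketch: find keys by pattern, then read all fields or one field of a key.
// KeySearchSketch is a hypothetical in-memory stand-in, not a Redis client.
final class KeySearchSketch {

    // TreeMap keeps keys sorted, so results come back in a predictable order.
    private final Map<String, Map<String, String>> store = new TreeMap<>();

    void setField(String key, String field, String value) {
        store.computeIfAbsent(key, k -> new TreeMap<>()).put(field, value);
    }

    // Supports an exact key or a prefix followed by one trailing '*'.
    // Note that this walks over every key in the store.
    List<String> keysMatching(String pattern) {
        boolean wildcard = pattern.endsWith("*");
        String prefix = wildcard ? pattern.substring(0, pattern.length() - 1) : pattern;
        List<String> matches = new ArrayList<>();
        for (String key : store.keySet()) {
            if (wildcard ? key.startsWith(prefix) : key.equals(prefix)) {
                matches.add(key);
            }
        }
        return matches;
    }

    // All fields of one entry (empty when the key is missing).
    Map<String, String> getAllFields(String key) {
        Map<String, String> fields = store.get(key);
        if (fields == null) {
            return Collections.emptyMap();
        }
        return Collections.unmodifiableMap(fields);
    }

    // One field of one entry, or null when the key or the field is missing.
    String getField(String key, String field) {
        return getAllFields(key).get(field);
    }
}
```

A short usage example, assuming the class above:

```java
KeySearchSketch blog = new KeySearchSketch();
blog.setField("post:johndoe@foobar.com:dsada23132", "post", "first post");
blog.setField("post:johndoe@foobar.com:dsada23132", "comments", "Nice one!");
blog.setField("post:johndoe@foobar.com:zzz", "post", "second post");

// All of John's post keys, then everything or just the post for one of them.
List<String> johnsPosts = blog.keysMatching("post:johndoe@foobar.com:*");
Map<String, String> entry = blog.getAllFields(johnsPosts.get(0));
String postOnly = blog.getField(johnsPosts.get(0), "post");
```
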
Summary

There's much more to key-value stores and Redis that I didn't mention here, and it can all be found on their website, but these are the important points of this post:
  • Searching only by the key
  • Choose the key with care
  • Data can be simple... or not
I've done some brief experimenting with Java and Redis, and some of the results can be found under my GitHub account: https://github.com/jorilytter/redis-test
