Simple code: Version control commits

Git is currently the most popular version control system, so this post is based on git and its features and capabilities.

Git is often seen as a way to enable distributed programming, i.e. multiple programmers can work on the same code repository quite easily without disturbing each other's work (much). Like other version control systems, it is also a log of work, but in my experience that part is unfortunately often neglected. This time I will focus on the log, because I think it deserves more attention.

Why create a meaningful log?

The git log should consist of small, meaningful changesets where each commit addresses a single problem. Dividing the work into small commits makes the way of working resilient: rollbacks, reviews, tagging, branching and so on become simple and fast.

Let's say that a developer is implementing a REST API. The API needs a web layer that receives the HTTP requests, it probably has some sort of logic layer that does data transformations, validations and maybe some calculations, and finally it has a data store where the data is persisted. There are several options for recording this work in the log. One option is to implement the whole API and record a single commit, or to squash the commits into one before pushing the changes to the remote. Another option is to record commits every now and then while developing and finally push those commits as they are to the remote repository. Yet another way is to carefully pick what is recorded in each commit in order to have a set of meaningful commits that each address a single issue.

An example of the first approach would be something like this:

* Add API for movie ratings

The second approach might look something like this:

* Add DAOs
* WIP
* Fix
* Add REST API
* Fix
* Refactor
* ...

The third approach could be like this:

* Add DAO implementation to list movie ratings
* Add REST API endpoint for listing movie ratings
* Add validation of REST API's parameters to movie ratings listing
* Add transformation logic from movie rating DAO to REST API JSON
* Change movie rating listing to sort ratings ascending by the review date
* ...

If something was wrong with the code, e.g. in the validation logic, the first and third examples make it easy to see which commit introduced the change, while the second one does not. With the first example we know the single commit contains the unwanted behaviour, but it also contains a lot of other changes, so we have to go through all of it to see what changed. With the third example it is obvious which commit introduced the validation logic, and it is easy to isolate what changed at that point.
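To make that concrete, here is a minimal sketch of how the validation commit could be located in a log written in the third style. The repository, the file names and their contents are made up for illustration; only the commit subjects come from the list above.

```shell
#!/bin/sh
# Hypothetical demo repository: file names and contents are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Helper: record one small commit that touches a single file.
# Usage: commit <file> <line to append> <commit subject>
commit() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$3"
}

commit dao.txt "list ratings" "Add DAO implementation to list movie ratings"
commit api.txt "GET /ratings" "Add REST API endpoint for listing movie ratings"
commit validation.txt "check params" "Add validation of REST API's parameters to movie ratings listing"
commit api.txt "to JSON" "Add transformation logic from movie rating DAO to REST API JSON"

# Which commits mention validation in their message?
by_message=$(git log --format=%s --grep=validation)
# Which commits touched the (hypothetical) validation file?
by_path=$(git log --format=%s -- validation.txt)

echo "$by_message"
echo "$by_path"
```

Both queries point at the same single commit here. With a log full of "WIP" and "Fix" subjects the message search would have nothing useful to match, and the path search would likely return several commits that each mix in unrelated changes.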

Another reason to keep a meaningful log is readability. It's much nicer to read a consistent log of small changes than a set of random commits or commits that each introduce a whole lot of changes.

What a meaningful log should say

Ideally the log tells you what has been done, at least at a high level, without you having to look at the actual code changes.

The commit should explicitly say whether something was added, removed, fixed, refactored, rewritten etc. It should also say what was changed, not per file but per feature or use case. Finally, the commit should say why the change was made unless it's obvious: adding an HTTP endpoint doesn't need a separate reason, but fixing a validation error would benefit from a short description.
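Part of this can be checked mechanically. The function below is only a rough sketch: the list of accepted verbs and the three-word minimum are my assumptions for illustration, not an established convention, and no script can judge whether the "why" is missing.

```shell
#!/bin/sh
# Sketch of a commit subject check. The verb list and the three-word
# minimum are assumptions for illustration only.
check_subject() {
    subject=$1
    # The first word must name the kind of change.
    case "${subject%% *}" in
        Add|Remove|Fix|Refactor|Rewrite|Change) ;;
        *)
            echo "subject should start with an action verb: $subject"
            return 1
            ;;
    esac
    # A bare verb such as "Fix" does not say what was changed.
    if [ $(printf '%s\n' "$subject" | wc -w) -lt 3 ]; then
        echo "subject should say what was changed: $subject"
        return 1
    fi
    return 0
}

check_subject "Add REST API endpoint for listing movie ratings" && echo "ok"
check_subject "WIP" || echo "rejected"
```

Git runs a `commit-msg` hook with the path of the message file as its first argument, so a check like this could be called from `.git/hooks/commit-msg` with `check_subject "$(head -n 1 "$1")"`.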

How to create a meaningful log

TL;DR: Piece by piece.

In perfect circumstances a programmer would write a few lines of code and commit the changes, but quite often it's hard to write code that way. Some practices that help with this are TDD (test-driven development) and TCR (test and commit or revert). In addition to these two, git itself provides a great set of tools to help split the work into smaller chunks. Some features that I use on a daily basis are amend, rebase, interactive rebase, interactive add and stashing. With these features and a healthy amount of self-control I can produce quite a good log that consists of small, descriptive commits that each address a single issue.
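As a rough sketch of how some of these features fit together, the script below splits two changes into separate commits, amends a forgotten detail into the previous commit, and parks unfinished work with stash. The repository and file contents are hypothetical. The interactive tools (`git add -p` and `git rebase -i`) are left out because they need a terminal, so per-file staging stands in for interactive add here.

```shell
#!/bin/sh
# Hypothetical demo repository: file names and contents are made up.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Two unrelated changes end up in the working tree at the same time.
echo "list ratings" > dao.txt
echo "GET /ratings" > api.txt

# Stage and commit them separately so each commit addresses one thing.
# (git add -p offers the same choice per hunk within a single file.)
git add dao.txt
git commit -q -m "Add DAO implementation to list movie ratings"
git add api.txt
git commit -q -m "Add REST API endpoint for listing movie ratings"

# A forgotten detail belongs to the previous commit:
# amend it instead of recording a separate "Fix" commit.
echo "GET /ratings?limit=10" >> api.txt
git add api.txt
git commit -q --amend --no-edit

# Unfinished work can be parked while something else gets committed.
echo "half-done sorting" >> dao.txt
git stash push -q
echo "check params" > validation.txt
git add validation.txt
git commit -q -m "Add validation of REST API's parameters to movie ratings listing"
git stash pop -q

commit_count=$(git rev-list --count HEAD)
pending=$(git status --porcelain)
echo "$commit_count commits, still pending:$pending"
```

At the end the log has three commits with one purpose each, and the half-done change to `dao.txt` is back in the working tree, ready to become its own commit once it is finished.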

Next part

In the next part I'll be writing about a really important topic: naming.
