A few months have passed after rolling out our new SDLC process based on Git+GitHub+Jenkins at Convey, I decided to write this post on code reviews etiquette based on the observations I made on what works, what doesn't and the kind of behaviors to stress on developers and reviewers and those behaviors to discourage.
As a bit of a background, before we moved to Git+GitHub, we had our own home-grown version control system and a third party tool we used to drive our process. Our continuous integration system was partially based on standard third party tools and a lot of custom code to stitch the pieces. Everything was beginning to feel dated and inefficient and not so hip anymore.
We needed something newer, better, agile and why not ... also cool.
This was a good opportunity to attack an issue that had bug me for a while since I began to notice that not all developers are created equal (and neither is their code). And, yes, this took me some time to realize. Ok, to put it on more blunt terms: I wanted to attack head on the issue of sub-standard code making it to our central repository, the "golden copy", without any kind of scrutiny whatsoever. Yes, this happened to us, quite a lot and quite often.
We did mitigate this partially within the last couple of years with after the fact code reviews. It improved the code quality, but was a half-measure in my mind. I would not recommend this to any development leader as the end of the road.
To solve this persistent issue, for once and for all, we decided to embrace the open source community model based on pull requests from personal forks into a master repo wholeheartedly. I have to tell that this was initially very controversial. Developers were concerned about the scrutiny itself, efficiency, productivity, developers morale, etc. People fear change. We test drove the model for about one year and half with a selected set of projects to gather reactions and tune the process, before we released it to all teams and projects.
If there's something foundational, critical and invaluable that I would not get rid of in our current SDLC, that's the pre-integration code review process built on top of GitHub Pull Requests. That's probably the single and most powerful change we introduced that created the most profound change of behavior within the teams and individual developers.
On this point, I think we have long ways to go yet in terms of developers' maturity to understand what's expected of the pull requests based code review process and how to fully reap the benefits of it for their personal growth as professionals and to improve the company's code base.
This leads me to write these guidelines to make the best out of it:
For developers posting Pull Requests
- Post small pull requests. Don't cram together a lot of unrelated changes. Reviewers will find your pull request painful, and two things may happen:
- It will get approved very quickly out of "lack of time" (not a good outcome).
- They will be ruthless out of being upset on the sheer size and mess of your pull request (not a good outcome either).
- If you plan to do a "re-formatting" pull request. Do it on its own, don't mix it up with a feature pull request. Format changes are shown in diffs as every other change, making more painful for reviewers to try to understand the nature of the intended change. Small and limited format changes are OK within the context of a feature pull request.
- Post the pull request once you are confident the code works. Hopefully you created automated tests. If automated tests were too expensive to write, at least make sure your code compiles and runs well. Believe, there's nothing more embarrassing that having a compilation syntax error pointed out by a human compiler doing a pull request code review.
- Write automated tests. Write automated tests. Write automated tests. Make them part of the pull request. I can't stress it more, and I won't write more about this here when there's plenty of material on the web that outlines in detail the benefits of it.
- Answer ALL comments reviewers make to your code. Do it promptly, don't wait days before you reply. Reviewers with merge ability will wait until all issues in the code are addressed. Sometimes addressing them is simply a recognition that you are going to fix the issue later, on another commit. If you don't answer the comments, the pull will remain there open for everyone to see that you don't care.
- Be grateful that there's someone out there willing to review your work and provide you feedback. Yes. This IS valuable. It proves that someone cares about what you are doing. Believe me, there's nothing more frustrating that making a pull request on an open source project only to see your work rotting out there with no response from the repo owner. I prefer a blatant rejection of my work than seeing pull requests completely ignored for months.
- Don't keep appending changes to an existing pull request that are unrelated to the intent of it. If you start working on a different ticket, create a new branch, commit and open a new pull request based on that branch. If you don't do that, you will be breaking recommendation #1 above.
- Before you click the "send" button when creating the pull request, use the opportunity to perform a final review of what you just did. GitHub provides a nice platform to read diffs. Use it.
- Finally, don't get upset if a reviewer points out really bad and ugly code you just posted. Don't take comments on code personally. Nobody is making a comment on you, but on the code itself you just posted.
For reviewers/repo owners
- If you are on charge of a repo, be prompt to review pull requests. If you are in "reviewing" state, but you don't find specific things to comment on the code, at least write a comment stating you are working on it. That way other reviewers will know there's someone taking care of a particular pull request.
- If you don't know exactly what's happening with some piece code, don't be shy and just leave it. Ask questions. I've seen cases where pull requests are approved because there's no developer who really knows what's going on other than the one who wrote the code to begin with. It's really bad when this happens. There's no scrutiny of the logic and the code, and this lends itself for high chances of buggy code getting integrated. At the same time, you can ask other reviewers to check it out and help you with the review. The number of people reviewing the code should be correlated to the complexity and extent of it. This is a case where more prying eyes are better.
- Be kind yet candid in your comments. Don't embellish your words if you are meaning to pin point a really bad issue. Just say it. Don't offend the developer but feel free to "offend the code" if necessary. Recognize we all make mistakes. We all have the right to know when we are doing a horrible job, but we want that experience to happen in the context of respect.
- If you see people is hesitant to respond to your comments and adopt the social aspect of these code reviews, try posting your comments in the form of questions, such as "I think this be better implemented by... what do you think?". That might inspire people to interact with you via the Pull Request comments feature.
- If you find recurring errors/issues in the code, comment each of the instances of the problem, or clearly specify the scope of the review (function, class, file). That will save you from the iterations caused by the developer only fixing the instance you pointed out.
- If you find a really awesome piece of code, something you would do an old fashioned paper print out and frame it, just say it on the open with a comment. As much as we tend to focus code reviews to point flaws or ask questions, it's great to praise someone else's clever, smart and creative work. Don't overdo this one. My rule of thumb is to reserve this one for truly remarkable pieces of code.
- Just like developers you should answer all comments back. If a developer write back with a comment on something you pin pointed, unless the issue is closed, answer back. This is basic respect for the other.
- Once all issues are cleared, and if there's no other impeding aspects, merge the code promptly. There's nothing better for a developer than see their code being accepted into the main repository.
- Finally, don't lower your guard!. I see reviewers getting caught on the trap of "feeling too busy for reviews", "deadlines are all over us", "I have no time for this", "It's only one line, I will merge it quick", "we haven't had issues for a while so this must be ok" and all this lame excuses not to conduct proper code reviews. When this happens, code is merged without scrutiny and the possibility of catching issues and maintaining the codebase in good shape is plundered. Reviewers have a responsibility to honor those who created the product that today is in production serving clients. The best way to honor this responsibility is by preventing crappy code of ruining existing working code.
Finally, to be consistent with the principle of what I'm praising in this blog post: the value of pre-integration code reviews, I want to thank Poyo Levin and Rupio Dagum for reviewing the draft of this post, correcting grammatical horrors, pin-pointing typos and suggesting additions that were gladly accepted and incorporated in the final version.