|
Code Reviews in Practice
Code reviews are one of the best things you can do to improve code quality within an organisation, but the process isn't a silver bullet, and it isn't guaranteed to succeed. These are my observations on what works and what doesn't work when it comes to implementing code reviews into a software development environment.
I've worked at 3 companies now where code reviews have been introduced. At the first two it was a failure, but for different reasons each time. Only the third company has fully integrated code reviews and unlocked the full benefit of them.
Company A
There are different ways of implementing code reviews and the implementation plays a huge part in whether the process will be successful in the long term. At company A code reviews were implemented as a pre-commit script that checked for the phrase "Reviewed by " in the checkin comment. If the phrase wasn't present then the commit would fail. At first this worked fairly well, but it wasn't long before the process broke down.
The coders started to buddy up for the reviews and the reviewing became cursory at best. Everyone had their review buddy and eventually the "review" became a matter of just informing your buddy what you were checking in. Eventually people stopped bothering at all, even sometimes putting their own name as the reviewer.
The problem here wasn't really the process itself, it was more a matter of management and culture. It was never made clear to the team why this process was being imposed on them or what the benefits would be. This led to resentment and resistance towards the review process.
It is vital to get the team to understand and commit to the process on an individual basis.
For example, it was thought that the reviews were largely meant to catch coding errors. While reviews can catch coding errors they are also meant to improve code quality. That might seem like the same thing, but it really isn't, although the two are related. Code can be completely technically correct, while being still being low quality in terms of readability, maintenance and error handling.
Improving code quality is meant to make the code more robust and maintainable so that bugs aren't introduced later. We do this by applying a standard. This helps to keep things impersonal. Everyone has to stick to the same standard.
Over time the company culture might have been changed so that the team accepted the review process, but since it wasnt enforced the process withered and died.
Company B
At this company reviews were implemented using actual review software. I don't recall the name of it, but you would submit your changes to the server and then wait for people to give feedback. This kind of process is better suited to cold reviews as it can take a long time.
The reviews often became rambling discussions of syntax as everyone gave their $0.02 on the code. These discussions are best kept out of the review process in order to keep the review as short as possible. This is also another reason why having a coding standard is important, because then the reviews can concentrate on more important matters as syntax/formatting issues are resolved quickly.
The culture here was more suited to code reviews, but ultimately the process died anyway. It was simply too time-consuming.
Company C
Within this company the review process is widely accepted and recognised as playing a vital role in maintaining the integrity of the codebase. In terms of implementation the process is almost identical to that of Company A. The differences are purely in terms of company culture and management involvement.
There is a culture of commitment to quality that ensures the coders buy in to the process, understanding why the reviews are important. Other factors that help the process are an enforced coding standard, smaller commit sizes and enforcement from management.
Coding standards are a good idea whether you have code reviews or not, but the two complement each other nicely. Without code reviews, the standard might not be applied as rigorously as we'd like.
Without the standard, the code reviews could turn into free-ranging discussions of unimportant syntax details. The details of the standard (edit: in terms of formatting) don't really matter, it just matters that you have one. Every developer should have the maturity to adapt their coding style to the standard.
The smaller commit sizes help to keep the reviews short, as well as ensuring that all the coders are keeping in synch and keeping merge conflicts to a minimum.
Conclusions
The two most important factors to the success of a code review process are:
-
Length of reviews
-
Having a coding standard, and enforcing it
Keep the reviews short and sweet. This means conducting the reviews face-to-face if possible, keeping commit sizes small and doing most reviews one on one.
The process requires involvement from management, in terms of coding standards, but also in terms of getting the coders to buy-in to the process and understand why it is important.
|
They're basically trying to sell their software; but, hey, it's a great read, and free!
@Andrew: Thanks for that, I've ordered a copy of the book. I had been intending to read a bit about the subject, but I wanted to braindump beforehand. :)
When doing a code review, just focus on the actual high-level reasoning behind the code, make sure it's globally understandable and makes sense in the context of the project, look out for any risks to the rest of the codebase, and please, don't sweat the underscores.
"look out for any risks to the rest of the codebase"
This is the most important thing, but I think it should include things like const-correctness, protecting member variables, and other best practices. I would imagine that to most coders it seems obvious to initialise all your members variables, but some coders might not want to for performance reasons. If it's in the standard then it shortcuts any arguments, and basically the standard takes the role of a moderator in the code review.
I would argue that this is not a positive thing. You just gave an example where not initializing a variable would make sense, and you can probably think of many where it wouldn't. Having something like "Always initialize all variables." in a standard wouldn't really be a positive thing then, since as you see it, you just can't argue with the Standard. Even more dangerous is the place where you basically say that you don't care what's in the standard, just that it exists.
It's a way to abdicate in the face of actual problems, and is in no way a solution. If there are heated arguments about which way to code something, then that's a hint that it's not the kind of argument best solved by a far-reaching standard. Except in the case of syntax, where basically nobody should really care anyway.
You should be able to argue against the standard, but you would have to have a good case for it.
"Even more dangerous is the place where you basically say that you don't care what's in the standard, just that it exists."
I was talking there about formatting/syntax issues, which I realise isn't clear. Any coder should be able to adapt their coding style to suit whatever conventions exist in the codebase he is working with. So in that sense it shouldn't matter what the formatting conventions are so much. As you said yourself "don't sweat the underscores". I think it is important though that everyone stick to same style when working in the same code base. It improves readability.When you can go anywhere in the code base and be assured that the conventions are the same then things just go that little more smoothly.
Louis, i seriously doubt that you have any experience at all coding on the sizable project. The kind of project which required more than a couple of coders banging out code in the basement somewhere. The project which would take years to complete.
At first coding standard would seem like an un-required rubbish, but think about it carefully.
In you own words :
"So what if the AI guys prefix their floats with f and the physics guys don't."
This is exactly the point of having the coding standard in-place. What if your Physics guys choose to opt their own special coding practices while the rest of the team is on the difference standard and the guy left in the middle of project? Now, no one on the team know about that guy's standard practices and have to dig deep into his code to be able to fix bugs or write some more functionality using his code base. It would take ages to wrap your mind-set around this guy's standard and be able to pin-point the bug in the code or understand how to use his code base. The time would be better spend elsewhere.
Further more consider this example :
//Java is used for reading simplicity here
String aString = "text not yet encoded";
String bString = Encoder.encode(aString);
// some more lines of code ... and you suddenly see this
bString = aString;
Didn't you spotted the wrong? This code would get no complaint from the compiler since its technically correct (a string can be assigned onto another string). But if your code expect a string which has already been encoding and do Decoder.decode(bString) would get rubbish.
What if the standard said you need to add "_es" suffix for every encoded string the code would look like :
String aString = "text not yet encoded";
String aString_es = Encoder.encode(aString);
and when you see
aString_es = aString;
This would certainly raise a suspicious about the code. It help coders on a different team spotted the run-time bug quickly.
Cheers
A while ago the company i worked for opt apply for ISO 9001/2008 for software design / delivery. There will be a section in the standard process doc which called "tailoring process" which means the team can create the exception to the rules if it required.
The point here is to see what is the best way to tackle a project. If the project required not initialize variable in favor of speed, then by all mean do it, but it need to be done "as a team". The whole team need to understand what the rules are and what are the exceptions for particular project or no body will bother following the rule and wreck havoc.
I agree that reviews should be relaxed. I completely disagree that they should be eliminated based on seniority. Noone should be arrogant enough to think that they never make mistakes and can never learn from others on their team. Code reviews also spread knowledge among the programming team. It might make sense to have a rule that says only a senior coder can review a junior coders work, but everyones work gets reviewed. I think to do otherwise undermines the whole process and would cause resentment.
If people see them as a gigantic pain then either the process needs streamlining or people don't understand the reasoning behind them.
"The whole team need to understand what the rules are and what are the exceptions for particular project or no body will bother following the rule and wreck havoc. "
You put it better than I possibly could, thanks!
@Sutthipong I think naming is actually one of the places where it is very hard to give a meaningful and workable policy that helps. You gave an example relating to handling of encoded and unencoded strings and suggested """What if the standard said you need to add "_es" suffix for every encoded string the code would look like :
String aString = "text not yet encoded";
String aString_es = Encoder.encode(aString);"""
I suspect that if you are standardising naming at that level, your standard will likely be very large, very quick. In your example it is trivial to convert the class structure to a form where the compiler can check for the error so that it is impossible to mishandle encoded strings.
String s = "text not yet encoded";
EncodedString es = Encoder.encode(s);
s = es; // compile time error
es = s; // compile time error
@Chris "Throwing out the standard because of a few rare exceptions will get you into far more." A good standard would hopefully have very few exceptions. I think standardising at the right level is important. You are arguing in favour of initialising all variables. For my work I would potentially disagree with that one. A more general suggestion is that all objects should exist in a well defined state. An object may contain unitialized variables if it's state is still well defined. For example, std::vector in C++ references memory that is both initialized and unitialized and yet always has a well defined state. Religiously zeroing all variables/members impedes your ability use valgrind and similar debugging tools to detect variables that are used before they are assigned a meaningful value.
Thanks again for the interesting read.