A formal code review involves taking a good look at other people’s code. For some, trying to grok strange code is agony. But I like it.
We’ve experienced enough reviews to have developed an approach to getting into an alien code base. In this post I’ll outline the key items to focus on.
We’re not talking here about a review of a pull request. I mean a longer process, which some call a “code audit”, taking a broader view of an entire code base. It typically takes a couple of weeks to complete.
In time order, here are the five key items.
1. Why are we reviewing the code?
There’s usually a lot of code to look at. To narrow the scope, and get some results out quickly, we ask why we’re reviewing the code.
A few examples of why code needs reviewing:
The team is new and wants guidance on best practices.
The company is purchasing the code, and want to know if what they have is of suitable quality.
The team has changed, and no-one knows how the code works anymore.
The team has tried different approaches (for example, to error handling) and wants guidance on what to take forward as a standard approach.
The team wants to take a stronger functional programming approach to the code base, and would like suggestions on areas where they are struggling.
There’s a wide range of reasons there, and many of them are not really about the code. They are about team aspirations.
A customer might also have specific issues. They appear when you ask: “What are you concerned about?”. The most popular answer to that questions is “maintainability”.
Having figured out why we are reviewing code, we can ask the second question.
2. What is the code supposed to do?
Here I’m looking for a description of the problem being solved by some module. That is, when you’re building something there will be a few ways to go at it. The choice you make could be specific to the concerns you have, or might be arbitrary, or historic, or inherited. Usually trade-offs are made. The source code and documentation rarely tell you what is intended and what is incidental.
This gap between the code and the intention is a problem. As a reviewer I have to reconstructing these problems, and then the solutions, and rate them. And it is not only a pain for review, but also for any developer having to pick up the code in the future.
We can ask the team about the objective reality of the code (to use Peter Hilton’s phrase), but there’s so much of it we often end up deducing as much as we need from sampling the execution path of the code.
3. Sample an execution path
Here we’re looking at the way the code flows, figuring out the layers of software. Where do the inputs come from? What are the outputs? What’s the path between look like?
For a web API, this would be working through endpoints. For a distributed system, it means chasing messages. It could also be the code paths executed by tests.
This is where most of the work is done. We’re seeing the style of the code, the idioms in use, the common patterns. It’s where we spot opportunities for alternatives and improvements. Most recommendations from the review report will be generated in this section.
4. It’s all about the people
That third step generates questions about the approach taken, the context of decisions made, and ultimately ties back to the “why” of the review.
What’s the context for this solution? What requirements have the team been working to? How have issues been communicated to the team? How does the team work? What’s gone on in pull requests? Are things running as smoothly as they should be?
This social side of software is as important as the code. So having the team, and the code history, available helps greatly in unpicking what’s going on.
It’s also important to acknowledge that review is not about bashing the code or the developers. In the heat of day-to-day coding we produce all sorts of solutions to ship the code. Shipping working product is the game, and I have every respect for the process developers are going through. They are getting to grips with the problem (and sometimes the domain) for the first time, finding a solution, and getting into the hands of users. It’s easy to come to a code base after the fact and say “I wouldn’t have done it like that”. Or to ask: “where are the ADRs?”. That ignores all the learning that’s gone on creating that code.
The review is about understanding the context, concerns, and needs of the project and helping it move in the right direction.
5. Automated analysis
Automated tools have their uses. WartRemover, Scapegoat, turning on a good set of compiler flags, running code coverage… they give you an additional sanity check on the code. But I’ve put them at the end of this list, because they are of lowest value compared to the context you get from the why, what, and social aspects of a project.
When reviewing a whole code base:
- Ask why are we reviewing the code. What is important to the business or team?
- Figure out what is the code supposed to do. What problem is a module solving, and what approach have we taken?
- Follow execution paths to get an understanding of how the code structure and flow
- Talk to the people involved, get a feel for how the work is carried out, and identify gaps.
- Run automated analysis as an additional sanity check.
What techniques do you use when exploring new code? Use the comments below to share your tips for reviewing a code base.