Review metrics provide software teams insight into what’s happening in the code review process.
Note: These metrics are only available in Flow Enterprise Server. Learn more about other pull request and collaboration metrics.
Review metrics overview
Review metrics include:
- Reaction time is the average number of hours it takes an individual to respond to a set of PRs as a reviewer.
- Involvement is the percentage of pull requests that a reviewer participated in.
- Influence is the ratio of follow-on commits made after a reviewer commented.
- Review coverage is the percentage of hunks commented on by a reviewer.
These metrics are designed to promote healthy collaboration and provide prescriptive guidance to improve the productivity of a team’s code review process as a whole.
As with any data point, these metrics should be used in context. What’s right and what’s normal will vary depending on your team’s culture.
Reaction time
Note: The calculation of Reaction time in Flow Cloud differs slightly from the calculation in Flow Enterprise Server.
Reaction time helps you understand how long it takes for PRs to get reviewed.
Reaction time is the average number of hours it takes an individual or team to respond to a set of PRs as a reviewer.
The goal is to drive Reaction time down. A low reaction time generally means people are responding to each other in a timely manner, working together to find solutions to problems, and getting the product to production.
Ideally, you want to respond to people addressing you directly within about an hour. More than eight hours is usually counterproductive.
However, Reaction time is context dependent. There are times when it may be inappropriate to stop working, like when an engineer is in the zone, in a meeting, or handling an outage.
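To make the definition concrete, here is a minimal sketch of how an average Reaction time could be computed from PR timestamps. The record shape and field names (review_requested_at, first_response_at) are assumptions for illustration, not Flow's internal model.

```python
from datetime import datetime

# Hypothetical illustration of the Reaction time idea (not Flow's exact algorithm):
# for each PR, measure the hours from when the reviewer was brought in to their
# first response, then average across the set of PRs.
def reaction_time_hours(prs):
    hours = [
        (pr["first_response_at"] - pr["review_requested_at"]).total_seconds() / 3600
        for pr in prs
        if pr.get("first_response_at")  # skip PRs with no response yet
    ]
    return sum(hours) / len(hours) if hours else None

prs = [
    {"review_requested_at": datetime(2024, 3, 1, 9), "first_response_at": datetime(2024, 3, 1, 11)},
    {"review_requested_at": datetime(2024, 3, 2, 14), "first_response_at": datetime(2024, 3, 2, 15)},
]
print(reaction_time_hours(prs))  # 1.5
```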
Involvement
Involvement shows how involved each person is in reviews and gives insight into whether some people are more involved than others in the review process.
This number will change according to your view. If you’re looking at someone's home team, they may show 75% involvement, indicating they reviewed three out of four PRs. But if you zoom out to view the whole organization, that same individual’s involvement rate will be much lower.
Involvement is context dependent. Not everyone can review everyone else’s code. For example, you wouldn't make an HTMLer review a complex query optimization. Architects and team leads are usually expected to have more Involvement to ensure consistency.
However, you should find a Goldilocks zone for each individual and the team they’re on and manage significant or sustained changes to their Involvement.
Review collaboration shows Involvement for groups of individuals, organizations, and teams. The calculations for these are worth a special mention.
For a team, Involvement helps you understand how often the team reviews their teammates’ PRs as opposed to outsourcing reviews to another team. Having extra people, like architects, review a PR is great, but you don’t want it to replace team reviews.
This metric is helpful if you have teams set up by role, for example, separate teams for Software, DBA, QA, Front end, and Back end, while your project teams are cross-functional. You likely want someone from the organizational team to review a PR for standards and best practices and someone from the project team to make sure the work meets business objectives.
Setting up an organizational team structure and a project team structure and making sure that both have a high Involvement is a good way to accomplish this.
How team Involvement is calculated:
[# PRs reviewed by a team member]/[# reviewed PRs]
For your entire organization, Involvement measures how often someone in the organization reviewed the PRs. Once your teams are set up properly and everyone in your organization is on the team, Involvement becomes:
[# reviewed PRs]/[# total PRs]
This equation is the complement of the Unreviewed PRs metric.
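The two formulas above can be sketched in a few lines of code. The PR records and the reviewers field below are hypothetical stand-ins; Flow derives these values from your integrated repositories.

```python
# Sketch of the team-level and organization-level Involvement formulas above,
# using assumed PR records of the form {"reviewers": [...]}.
def team_involvement(prs, team_members):
    # [# PRs reviewed by a team member] / [# reviewed PRs]
    reviewed = [pr for pr in prs if pr["reviewers"]]
    if not reviewed:
        return 0.0
    by_team = [pr for pr in reviewed if set(pr["reviewers"]) & set(team_members)]
    return len(by_team) / len(reviewed)

def org_involvement(prs):
    # [# reviewed PRs] / [# total PRs]
    if not prs:
        return 0.0
    return sum(1 for pr in prs if pr["reviewers"]) / len(prs)

prs = [
    {"reviewers": ["alice"]},
    {"reviewers": ["carol"]},
    {"reviewers": []},               # unreviewed PR
    {"reviewers": ["alice", "dave"]},
]
print(team_involvement(prs, ["alice", "bob"]))  # 2/3 ≈ 0.67
print(org_involvement(prs))                     # 3/4 = 0.75
```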
Influence
Influence shows you how often people update their code in response to reviewer comments.
Influence is the ratio of follow-on commits made after a reviewer posts a comment. It’s the sibling of Receptiveness: the Influence metric looks at whether your comments elicited a follow-on commit.
Influence doesn’t try to assign specific credit. No single individual gets credit for being influential; the discussion itself deserves the credit, so every participant in the discussion prior to the follow-on commit has it counted toward their Influence metric.
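As a rough illustration of that credit-sharing rule, the sketch below gives every participant who commented before a follow-on commit credit for that discussion. The data shape and field names are assumptions; Flow’s actual calculation may differ in detail.

```python
from collections import defaultdict

# Hypothetical sketch: a person's Influence is the share of discussions they
# joined (before any follow-on commit) that were in fact followed by a commit.
def influence_by_person(discussions):
    led, total = defaultdict(int), defaultdict(int)
    for d in discussions:
        for person in d["participants_before_commit"]:
            total[person] += 1
            if d["followed_by_commit"]:
                led[person] += 1
    return {person: led[person] / total[person] for person in total}

discussions = [
    {"participants_before_commit": ["alice", "bob"], "followed_by_commit": True},
    {"participants_before_commit": ["alice"], "followed_by_commit": False},
]
print(influence_by_person(discussions))  # {'alice': 0.5, 'bob': 1.0}
```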
In practice, there’s a middle ground with this metric: Low influence may be a signal that an individual isn’t making substantive comments. High Influence may be a signal that an individual is acting as a gatekeeper or a crutch.
Architects and team leads should have higher Influence metrics. Once you find the right level for each individual and team, manage significant or sustained changes as they could indicate a shift in the team dynamic that warrants a manager’s attention.
Review coverage
Review coverage shows how much of each PR has been reviewed.
Review coverage is the number of hunks in a PR that received comments, as a percentage of the total hunks in the pull request. A typical PR contains multiple files and multiple edits, or hunks, across those files.
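As a minimal sketch of that calculation, assuming a hypothetical list of hunks flagged with whether a reviewer commented on them:

```python
# Review coverage per the definition above: hunks that received at least one
# reviewer comment, divided by total hunks, expressed as a percentage.
def review_coverage(pr_hunks):
    if not pr_hunks:
        return 0.0
    commented = sum(1 for hunk in pr_hunks if hunk["has_reviewer_comment"])
    return commented / len(pr_hunks) * 100

pr_hunks = [
    {"has_reviewer_comment": True},
    {"has_reviewer_comment": True},
    {"has_reviewer_comment": False},
    {"has_reviewer_comment": False},
]
print(review_coverage(pr_hunks))  # 50.0
```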
Like a teacher who puts tick marks on every page of a term paper to show they read it, a good reviewer will comment on the majority of the edits in a PR, even if it’s a simple “LGTM.” In practice, 100% Review coverage is overkill.
As a manager, keep an eye on Review coverage. Provide guidance when coverage rises or falls at the individual or team level. Encouraging team members to take the time to review each change in the code, instead of skimming the PR as a whole, will drive this number up. Small changes in average Review coverage make a big difference.