The Team Collaboration Metrics are designed to give managers a bird’s-eye view of how a team is behaving in the code review process. These metrics highlight centers of knowledge concentration, code that may require your attention, and bottlenecks to a pending release.
There are six metrics that comprise the Team Collaboration Metrics:
- Avg. Time to Resolve: The average time it takes to close a pull request.
Appears in: PR Resolution
- Avg. Time to First Comment: The average time between when a pull request is opened and the time the first reviewer comments.
Appears in: PR Resolution
- Avg. Number of Follow-on Commits: The average number of code revisions once a pull request is opened for review.
Appears in: PR Resolution
- Recent PR Activity Level: A measure of how active a PR is on a scale of Low, Modest, Normal, Elevated, High, as calculated by the comment count, word length, and recency of the comment.
Appears in: PR Resolution
- Recent Ticket Activity Level: A measure of how active a ticket is on a scale of Low, Modest, Normal, Elevated, High, as calculated by the comment count, word length, and recency of the comment.
Appears in: Work Log
- Knowledge Sharing Index: Measures how broadly information is being shared amongst a team by looking at who is reviewing whose PRs.
Appears in: Knowledge Sharing
These metrics are designed to promote healthy collaboration and provide prescriptive guidance to improve the productivity of the team’s code review process as a whole.
As with any data point, these metrics should be used in context. “What’s right” and “what’s normal” will vary depending on your team’s situation.
Average Time to Resolve
How long, on average, does it take to resolve an open PR?
Average Time to Resolve is the average time it takes to close a group of pull requests, where a resolution can be an approval, a rejection, or a withdrawal.
As a manager, you generally want to drive this number down. Pull requests are a period of review: they should open, there should be discussion and perhaps some follow-on commits, and then they should close.
Long-running PRs are something you should manage relatively aggressively. If you notice long-running PRs, you can use the PR Resolution report to identify the root cause and take action accordingly.
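As a rough illustration of the definition above, the following sketch averages the open-to-close duration over a group of resolved pull requests. The input shape (pairs of opened and closed timestamps) and the function name are assumptions made for this example; this is not how Flow itself computes the metric.

```python
from datetime import datetime, timedelta

def average_time_to_resolve(pull_requests):
    """Mean open-to-close duration over resolved PRs.

    `pull_requests` is a list of (opened_at, closed_at) pairs; this shape
    is an assumption for the example. PRs that are still open
    (closed_at is None) are skipped.
    """
    durations = [
        closed_at - opened_at
        for opened_at, closed_at in pull_requests
        if closed_at is not None
    ]
    if not durations:
        return None  # nothing has been resolved yet
    return sum(durations, timedelta()) / len(durations)

# Hypothetical sample data.
prs = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 17, 0)),  # 8 hours
    (datetime(2024, 1, 2, 9, 0), datetime(2024, 1, 3, 9, 0)),   # 24 hours
    (datetime(2024, 1, 3, 9, 0), None),                         # still open
]
avg_resolve = average_time_to_resolve(prs)
print(avg_resolve)
```

Skipping open PRs is a design choice in this sketch. A long-running open PR does not move the average until it closes, which is one reason to look at individual PRs as well as the average.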
Average Time to First Comment
On average, how long does it take for a reviewer to comment on a PR?
Average Time to First Comment is the average time between when a pull request is opened and the time the first reviewer comments.
This is another metric that, as a manager, you want to drive down. When someone opens a PR, you want to see people getting feedback in early. It speeds up the entire process that follows: responding to comments, incorporating feedback, getting those changes reviewed, and so on.
Keeping this number low is a team effort. You can encourage the team by assuring them that it doesn’t matter who gets assigned to the PR; what matters is that the team acknowledges the open PR and gets to it as soon as possible.
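To make the definition concrete, here is a minimal sketch that measures the delay between a PR being opened and the first comment from someone other than its author, then averages that delay across PRs. The data layout, the function names, and the choice to ignore the author’s own comments are assumptions for illustration only.

```python
from datetime import datetime, timedelta

def time_to_first_comment(opened_at, author, comments):
    """Delay until the earliest comment by someone other than the author.

    `comments` is a list of (commenter, timestamp) pairs (assumed shape).
    Returns None when no reviewer has commented yet.
    """
    reviewer_times = [ts for who, ts in comments if who != author]
    if not reviewer_times:
        return None
    return min(reviewer_times) - opened_at

def average_time_to_first_comment(prs):
    """Average the delay over PRs that have at least one reviewer comment."""
    delays = [time_to_first_comment(*pr) for pr in prs]
    delays = [d for d in delays if d is not None]
    if not delays:
        return None
    return sum(delays, timedelta()) / len(delays)

# Hypothetical sample data: (opened_at, author, comments).
prs = [
    (datetime(2024, 1, 1, 9, 0), "ana",
     [("ana", datetime(2024, 1, 1, 9, 30)),    # author's own note, ignored
      ("ben", datetime(2024, 1, 1, 11, 0))]),  # first reviewer: 2 hours
    (datetime(2024, 1, 2, 9, 0), "ben",
     [("cy", datetime(2024, 1, 2, 13, 0))]),   # first reviewer: 4 hours
    (datetime(2024, 1, 3, 9, 0), "cy", []),    # no comments yet
]
avg_first_comment = average_time_to_first_comment(prs)
print(avg_first_comment)
```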
Average Number of Follow-on Commits
On average, how many follow-on commits are added to an open PR?
Average Number of Follow-on Commits is the average number of code revisions once a pull request has been opened for review.
This is an excellent gauge of the strength of your code review process. If nothing ever changes because of the review process, then why are we doing it? If every PR yields a lot of follow-on commits, then perhaps more planning and testing is in order.
Generally speaking, this is a Goldilocks number: there’s a happy middle ground based on your team’s culture. You want to manage this number as it changes, rather than simply driving it up or down. Large or sustained changes in this average suggest a change in the team’s dynamic that should be investigated.
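The counting behind this metric can be sketched as follows: for each PR, count the commits that landed after the PR was opened, then average across PRs. The input shape and function names are hypothetical, and treating "follow-on" as "commit timestamp later than the PR open time" is an assumption of this example.

```python
from datetime import datetime

def follow_on_commits(opened_at, commit_times):
    """Count commits made after the PR was opened for review."""
    return sum(1 for ts in commit_times if ts > opened_at)

def average_follow_on_commits(prs):
    """Average follow-on commit count over (opened_at, commit_times) pairs."""
    if not prs:
        return None
    return sum(follow_on_commits(o, c) for o, c in prs) / len(prs)

# Hypothetical sample data.
prs = [
    (datetime(2024, 1, 1, 12, 0),
     [datetime(2024, 1, 1, 10, 0),   # before the PR opened, not counted
      datetime(2024, 1, 1, 15, 0),
      datetime(2024, 1, 2, 9, 0)]),  # 2 follow-on commits
    (datetime(2024, 1, 2, 12, 0),
     [datetime(2024, 1, 2, 11, 0)]), # 0 follow-on commits
    (datetime(2024, 1, 3, 12, 0),
     [datetime(2024, 1, 3, 13, 0)]), # 1 follow-on commit
]
avg_follow_on = average_follow_on_commits(prs)
print(avg_follow_on)
```

Because this is a Goldilocks number, the value is most useful when compared against the team’s own recent history rather than against a fixed target.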
Recent PR Activity
How many comments and follow-on commits are happening on a given PR?
Recent PR Activity Level is a measure of how active a pull request is on a scale of: Low, Modest, Normal, Elevated, High. The measure considers the volume of commenting and follow-on commits in a pull request.
Recent PR Activity has a built-in decay function, so older comments are scored lower than recent comments.
This metric gives managers a way to gauge how much chatter is happening around a PR without having to read the actual comments of the PR. This is particularly useful coupled with Time to Close. A long-running PR coupled with Elevated activity suggests disagreement or uncertainty and often warrants a manager’s attention.
Generally speaking, you’re looking for outliers. When you find long-running, low-activity PRs, you want to nudge folks to approve or kill the PR. PRs that have a lot of activity should be reviewed carefully for both tone (is the discussion degenerating into an argument?) and content (is the discussion driving toward resolution or simply going in circles?).
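The exact scoring formula is not given here, so the following is only a sketch of how a decay-weighted activity score could work. Everything specific in it is an assumption: the exponential half-life of three days, the word-count weighting, the cutoff values, and the function names. It is meant to show the idea that recent, lengthy comments count for more than old, short ones.

```python
from bisect import bisect_right

LEVELS = ["Low", "Modest", "Normal", "Elevated", "High"]
# Hypothetical score boundaries between adjacent levels.
CUTOFFS = [1.0, 2.0, 4.0, 8.0]

def activity_score(events, half_life_days=3.0):
    """Sum of event weights, each halved for every `half_life_days` of age.

    `events` is a list of (age_in_days, word_count) pairs (assumed shape).
    A follow-on commit can be modeled as an event with word_count 0.
    """
    score = 0.0
    for age_days, word_count in events:
        weight = 1.0 + word_count / 100.0          # longer comments weigh more
        decay = 0.5 ** (age_days / half_life_days)  # older events weigh less
        score += weight * decay
    return score

def activity_level(score):
    """Map a score onto the five-step scale using the assumed cutoffs."""
    return LEVELS[bisect_right(CUTOFFS, score)]

# Hypothetical sample: a 100-word comment today and a commit 3 days ago.
events = [(0, 100), (3, 0)]
score = activity_score(events)
print(score, activity_level(score))
```

With these assumed settings, the same two events would score half as much three days later, which is how a once-busy PR drifts back toward Low when discussion stops.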
Recent Ticket Activity
How many recent comments are on a ticket and how lengthy are those comments?
Recent Ticket Activity is a measure of how active a ticket is, as calculated by the comment count, word length, and recency of the comment.
Recent Ticket Activity has a built-in decay function, so older comments are scored lower than recent comments.
As concepts, Ticket Activity and PR Activity are very similar: both look at where people’s attention is going in the review process.
This metric gives managers a way to gauge how much chatter is happening around a ticket without having to read the actual comments of the ticket. This is particularly useful coupled with Time to Close. A long-running ticket that also shows “Elevated” activity suggests disagreement or uncertainty and often warrants a manager’s attention.
Generally speaking, you’re looking for outliers. When you find a ticket with a lot of activity late after an engineer began implementation, you should review it carefully for potential problems that could impact timely delivery.
Notably, you want to watch for tone (is the discussion degenerating into an argument?), content (is the discussion driving toward resolution or simply going in circles?), and, most importantly, material scope creep.
Tickets usually represent the source of truth for what is to be built. You should expect lots of comments prior to implementation when a feature is being designed or a requirement is being documented. However, after implementation begins, you should expect very few comments with only minor clarifications in requirements, normal status changes, etc. When you see Ticket Activity creep up, it’s often a sign that a manager can add value by intervening to make sure the ticket stays on track.
Knowledge Sharing Index
How many people are reviewing everyone else’s work?
How is the responsibility of reviewing shared by the team?
How much information is being passed around by the team?
Knowledge sharing happens when engineers are familiar with multiple areas of the code base. There are two main ways to increase your team’s knowledge sharing. Engineers can work on multiple areas of the code base, or engineers can review each other’s code.
Since it’s not always practical for an engineer to work on multiple areas of the code base, reviewing code is a valuable opportunity to engage with different areas of the code base. This allows engineers to know more parts of the code and ensures no one person is a knowledge silo for a particular part of the code.
The Knowledge Sharing Index measures how thoroughly information is being shared amongst a team via code review. Use the Knowledge Sharing Index to view which engineers are reviewing others’ pull requests as well as who owns those pull requests. This helps you understand if you are increasing the number of people participating in code review or if only a few individuals are reviewing code.
The Knowledge Sharing Index measures knowledge sharing on a scale of 0 to 1.
- 0 represents the least possible knowledge shared (one person is doing all the reviews).
- 1 represents an even distribution of the work.
We suggest aiming to be at or above 0.6. If you are below 0.3, you should make knowledge sharing a focus for the team.
Flow uses a variation of the Gini Coefficient to calculate the sharing index. The Gini Coefficient comes from the field of economics and is commonly used to measure the wealth distribution in a society (when you hear TV pundits say things like “the top 1% owns 40% of the wealth of a country,” they are drawing on this field of research).
In our case, think of a PR comment as a unit of wealth. In a perfect world, everyone performs an equal number of reviews, so the wealth is evenly distributed. Alternatively, if one person performs all the reviews, that person has hoarded all the knowledge available via reviews.
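Flow’s exact variation of the Gini Coefficient is not spelled out here, so the sketch below uses one plausible reading: compute the standard Gini Coefficient over per-person review counts, rescale it so that "one person does everything" maps to exactly 1, and subtract the result from 1. The function name and the rescaling step are assumptions for illustration, chosen so the output matches the 0 and 1 endpoints described above.

```python
def knowledge_sharing_index(review_counts):
    """1 minus a normalized Gini Coefficient over per-person review counts.

    Returns 1.0 for a perfectly even split and 0.0 when a single person
    performs every review. Returns None when the index is undefined
    (fewer than two people, or no reviews at all).
    """
    n = len(review_counts)
    total = sum(review_counts)
    if n < 2 or total == 0:
        return None
    ordered = sorted(review_counts)
    # Standard Gini formula for values sorted in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(ordered, start=1))
    gini = (2 * weighted) / (n * total) - (n + 1) / n
    # For n people the raw Gini tops out at (n - 1) / n, so rescale to [0, 1].
    normalized = gini * n / (n - 1)
    return 1 - normalized

# Hypothetical review counts per engineer.
even = knowledge_sharing_index([5, 5, 5, 5])     # everyone reviews equally
hoarded = knowledge_sharing_index([0, 0, 0, 10])  # one person does it all
mixed = knowledge_sharing_index([1, 2, 3, 4])
print(even, hoarded, mixed)
```

Under these assumptions, the mixed team in the example lands at roughly 0.67, just above the suggested 0.6 target.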
In practice, PR reviews are not the only way to share knowledge and, as in life, there are other forms of wealth that aren’t considered in a Gini Coefficient. However, we have found that the Knowledge Sharing Index makes for a good management signal. You can use it to help identify engineers who are not part of the everyday flow of reviews, as well as to find silos and other anti-patterns that often persist simply because they are hard to spot.
Look for outliers and stranded engineers, then take action to help get them involved. Use the Knowledge Sharing Index to manage your team’s broader trend toward or away from that imagined ideal of shared knowledge, keeping in mind that perfect distribution is rarely achievable, nor is it necessarily desirable.
The Knowledge Sharing Index coupled with Involvement (a Review Metric) will give you a very clear picture of how your team is using the review process to share knowledge. Involvement is a measurement of the percentage of PRs that an individual participated in, and Knowledge Sharing is the analogous measurement for the team as a whole.
If you need help, please email firstname.lastname@example.org for 24/7 assistance.