We all like metrics as a way to measure performance and measure trends. When it comes to checking the performance of your QA team, or individuals within the team, how do you go about measuring that? I don't have the answer at hand but in this post I want to discuss why Bugs / Issues / Defects are not an effective way to measure performance of QA.
We don't compare coding performance on the volume of lines written, so why do it for the quantity of bugs entered?
If you or your team are using bug volumes, quotas or bug scoring, then you are not promoting quality; you're unduly focusing on process and admin. Here is why you should stop monitoring a QA team's bug output during the development phase and stop using it as a metric to justify the team's contribution.
Good developers can hide QA ability
Defect finding is a team effort that reflects the combined talents of the developer, the team's pipeline and the skill set of the QA'er. With it being a combined effort, how would you begin to measure a QA'er's performance by bug volume? We don't compare coding performance on the volume of lines written, so why do this for the quantity of bugs entered? Take, for example, a talented developer paired with a talented QA'er. There won't be a lot of defects found, but the same would be true if that developer were paired with an inexperienced QA'er, because the developer's solid code "shields" them. Conversely, an inexperienced developer paired with an inexperienced QA'er might not surface many issues during the testing phase, only for them to be realized when the product goes live. In these situations, drawing performance parallels from bug counts is not useful.
Defect quotas just drive up admin
The focus on defect volume can, shockingly, lead to an expectation to log x bugs per week. A sillier aspect is that targets rarely adjust downward towards launch, when the build should be getting tighter. All this does is stress out your team: testers snipe each other for legitimate bugs, dilute catch-all bugs into individual expressions or symptoms of the same underlying issue, and stop working collaboratively since they can't share the credit.
By using defects as a reflection of the quality of the game, rather than the performance of the individual, you move one step closer to a higher quality product
QA leadership can weigh in and stop the team "gaming" the system, but this creates animosity and adds extra admin, all of it time taken away from testing. "But," I hear you shout, "how can I tell that the testers are pulling their weight?" Well, how does any other manager rate colleagues' output and contribution? It's usually a multifaceted approach, so if you're relying on one single (dubious) number, you need to rethink what you actually value from your QA team members.
Defect scoring pitfalls
Also, quantity is a non-starter because some bugs have more value than others, right? Wrong. For those who haven't experienced bug scoring, it's where different types of bugs are given points based on what the team holds dear. For example, text bugs might score low while a crash scores highly; the scores are then tallied to see who has been finding the more "valuable" defects. The problem is that the defects reported are not indicative of reporter performance. If there is a crash on the golden path, it's whoever has the fastest fingers to get it into the database who gets the credit, and that requires no special talent. In contrast, the team member who finally cracks the 100% repro steps on that weird crash bug is a hero, yet in such a system they are rewarded the same as the person who points out the obvious crash. Again, this scoring involves admin, refereeing and an honour system, all of which takes time away from actual testing.
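To make the flaw concrete, here is a minimal sketch of such a scheme. The bug types and point values are hypothetical, not from any real team, but the structure is typical: the score only sees the defect's category, never the effort or skill behind the report.

```python
# Hypothetical bug-scoring weights: points per defect type.
SCORES = {"text": 1, "art": 2, "functional": 5, "crash": 10}

def reporter_score(reported_bug_types):
    """Tally a reporter's points from the types of bugs they filed."""
    return sum(SCORES[bug_type] for bug_type in reported_bug_types)

# Fastest fingers on an obvious golden-path crash...
obvious_crash = reporter_score(["crash"])

# ...scores exactly the same as a week spent cracking 100% repro
# steps on a rare, hard-to-pin-down crash.
hard_won_crash = reporter_score(["crash"])

assert obvious_crash == hard_won_crash  # the system can't tell them apart
```

The scheme rewards both reports identically, which is precisely the problem described above.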
Let's keep defects to what they are
Defects in the code are just identified issues that need to be fixed. They shouldn't come to developers with an attitude attached or pass judgement. Equally, the volume of assigned or cleared issues isn't the prime factor in determining how good or bad a developer is. Just as on the reporting side, there are many reasons why developers have issues assigned to them, and various reasons for their closure and fix rates.
The Positive Case
Throughout this post, I have bemoaned using defect counts as a primary indicator of performance. Ironically, if you don't place performance emphasis on defect volumes, the data set becomes quite rich, because you haven't poisoned the pool with people attempting to game the system. From there, you can conduct valuable analysis: defect type trends, heatmaps of where issues lie within a product, and comparison of issues missed into Live versus those detected in testing. By using defects as a reflection of the quality of the game, rather than the performance of the individual, you move one step closer to a higher quality product and not a highly strung QA team.
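As a sketch of what that analysis can look like once the data is clean, the snippet below counts defects per component (the raw material for a heatmap) and compares issues caught in testing against those that escaped to Live. The record fields (`component`, `found_in`) and sample data are illustrative assumptions, not a real tracker schema.

```python
from collections import Counter

# Illustrative defect records; real data would come from your bug tracker.
defects = [
    {"component": "UI",    "found_in": "test"},
    {"component": "UI",    "found_in": "live"},
    {"component": "Audio", "found_in": "test"},
    {"component": "UI",    "found_in": "test"},
]

# Heatmap-style view: which components attract the most defects?
by_component = Counter(d["component"] for d in defects)

# Escape analysis: detected during testing vs missed until Live.
by_phase = Counter(d["found_in"] for d in defects)

print(by_component.most_common())  # "UI" is the hotspot in this sample
print(by_phase)                    # three caught in test, one escaped to live
```

With volume pressure removed, these counts describe the product rather than rank the people who filed them.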