Testing and Bug Tracking
By Jamie Fristrom
July 14, 2003
My thesis is simple, common-sense, and has been said many times before: finish one part of your project before moving on to the next part. Finishing a part -- be it a feature, character, mission, level, whatever -- means getting it ready to go into the box. It doesn't mean you can never touch or polish that part again, it just means that if you had to ship it the way it was, you'd be willing to. Which means you have to fix bugs now, rather than put them off until later. Somehow we keep forgetting this. For instance, just a week ago at our leads meeting, we nearly forgot it again when we said, "Let's get each mission to alpha and move on to the next mission."
One reason we keep forgetting this is that phase of the project we call "alpha" or "feature complete." The existence of this phase implies that we're not supposed to fix our bugs now; we're supposed to get the features in, and then fix all the bugs later, in the alpha phase. The problem gets worse with each project, because when we're done we say, "The game wasn't as polished as we would have liked last time around. We need to devote more time to alpha." Lengthening alpha shrinks the full-production portion of the development cycle, and one side effect is additional pressure to put off bug fixing until later.
We need to resist this pressure. We need to fix our bugs before moving on. Why?
Some people find this surprising. They say things like, "What? We can't fix the bugs now, we've got a milestone to hit!" or "People will put up with crash bugs for a fun game but will not put up with a game that isn't fun, no matter how solid it is," or "If we start fixing bugs now, we'll never finish." That's our disease talking. For the most part we'd rather stick pins in our eyes than track down bugs, but we must.
There are a few reasons why it's important to fix bugs now rather than putting them off until after alpha:
If you try to keep the number of defects down you speed up your development cycle, and end up with more time to make more game. Even with a throwaway prototype you should fix bugs, for the same reason: you'll end up with a better prototype. It only takes a few days of ignoring bugs to turn your development process into a living hell. In short, an attitude of "fix the bugs now" is one way you can increase Bang for developer Buck.
Sometimes I think of making a videogame as an endless process of fixing thousands of bugs. Some of the bugs are stop-shipment bugs, but most of them are suggestions for how the game can be improved. We prioritize as best we can, fix as many as we can, and at some point we run out of time, make sure all the stop-shipment bugs are fixed (hopefully), mark the last of our bugs "Will Not Fix", and ship it.
Given a choice between scheduling software and bug-tracking software, I'd take the bug-tracking software. According to Peopleware by DeMarco and Lister, some (admittedly sketchy) studies have shown that programmers work more productively when they aren't scheduled. This may explain the feeling you get when you're in alpha and dozens of bugs get fixed every day. Everyone's trying to fix each bug as quickly as possible, instead of trying to do the best job they can on a feature within the time allowed. And I think you can benefit from that sort of "ASAP" mentality, even at the beginning of a project, by getting the bug list up and running, and putting the small tasks and high-priority suggestions ("orcs are way too easy to kill") on there.
Techniques For Fixing Bugs Early
Now that I've gone over why it's a good idea to fix bugs first, let me go over some techniques for making that happen.
Commit to tracking bugs. The very first thing Greg John, my producer, and I did when we started our last four projects was get the bug database up and running. (To quote David Cook, the lead coder on Kelly Slater's Pro Surfer: "The first bug is - there is no game.")
Use an automated bug tracking system. If your team is larger than half a dozen people, you'll find that unless you have an automated way for people to submit bugs, you will be spending a ton of someone's time recording e-mails about bugs into the bug database.
We use an Access database with a web-based front end that we created with Seven Simple Steps a few years ago and have been modifying steadily ever since. If you don't want to take the trouble to write your own, FogBugz from Fog Creek Software is a good product. It didn't quite meet our needs three years ago but has since matured a great deal, and they provide source so you can tailor it to your own needs.
Don't rely on your publisher to do all of your testing. Do testing in-house. Either have a testing department or, if you don't have the budget for one, have people on the team do the testing. That's what I mean by production testing: the producers do it. Make a disciplined, exhaustive effort to find bugs and get them in the database instead of letting your bug list just be a dump for people to throw their occasional gripes in.
Generally, any asset or feature is going to need a lot of testing. The person who creates the asset or feature should test it before submitting it. Their lead should test it before signing off on it. In-house testing should test it before each milestone is submitted. The publisher should test it before it is sent to the console manufacturer. The console manufacturer tests it before releasing it.
Gameplay testing should be done as well; don't just answer the question "does it work" but also "does the player like it?" (Find out if you've done the right things, along with finding out if you've done things right.)
It's quite possible you work at a place where upper management isn't going to spring for in-house testing. All this means is that you have to do more testing yourself. Because we had no testing department handy, I, the lead programmer, spent about a third of my time every day playing Tony Hawk when we ported it to the Dreamcast. Rough job, eh? But if I hadn't done that, the game would have taken longer to ship. And even if we had in-house testing back then, I still would have had to spend some time testing, to make sure that features were being implemented correctly.
Autobuild. Rapid Development stresses the importance of a daily build and smoke test to make sure, each day, nobody's checked in anything so horrible that it breaks the game completely. On our current project, Toby Lael, one of our programmers, has put together something even more effective: a program that watches our source control, and as soon as a change is checked in, gets that change and does an incremental build of the data and/or source that's changed, making sure that it still works. If it doesn't work, an e-mail is sent out to the whole team, and whoever checked in the bad stuff will quickly act to make things right. This way, for most of the day, people know that they can safely get the latest stuff out of source control, and when it's broken, everyone knows it.
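A minimal sketch of the kind of watcher described above, not Toby's actual program. The `Change` record and the three callables (`fetch_changes`, `build_and_smoke_test`, `notify_team`) are hypothetical stand-ins for your own source-control query, incremental build, and team e-mail.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Change:
    """One check-in, as reported by a (hypothetical) source-control query."""
    revision: int
    author: str


def watch_once(
    last_built: int,
    fetch_changes: Callable[[int], Iterable[Change]],
    build_and_smoke_test: Callable[[Change], bool],
    notify_team: Callable[[str], None],
) -> int:
    """Build every change newer than last_built, oldest first.

    Sends one message to the team for each change that fails, and
    returns the newest revision that was examined.
    """
    for change in sorted(fetch_changes(last_built), key=lambda c: c.revision):
        if not build_and_smoke_test(change):
            notify_team(
                f"Build broken by revision {change.revision} ({change.author})"
            )
        last_built = change.revision
    return last_built
```

In practice you would call `watch_once` in a loop with a short sleep between passes, feeding each return value back in as `last_built`. Building change by change, rather than once a day, is what lets the e-mail name the check-in that broke things.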
Zero-Defect Pushes. Unlike Steve Maguire, author of Writing Solid Code, I don't believe it's possible to always keep the master sources free of bugs. I have yet to meet anyone who succeeds at this noble task. One tenet of zero-defect development is that everyone stops working on their own stuff when there's a bug in someone else's stuff; they all pitch in to help, or at the very least they don't break anything more while that one person is muddling through. I think that on a large team someone is always wrestling with a bug. If everyone pitched in to help, it would kill our ability to work on multiple fronts in parallel, slowing our development to a crawl.
What I do believe is that it's possible to make a push for zero defects a few times in the life of the project. We don't have to wait until alpha.
Sometimes I think that if it weren't for E3, most of the projects I've worked on would have failed. Although people complain about the time wasted on E3 making demo gameplay that's going to be scrapped anyhow, one thing E3 does is force us to get our bugs fixed; we make a zero-defect push.
When we're in this mode, we do a number of things:
1) We lock access to source control. People aren't allowed to check in at will.
2) We have a check-in queue. We go through the queue one by one: let a person check in, test their work for several minutes, and go on to the next person. Although this means people are frequently idle while they wait for their turn to check in, the amount of time individuals spend idle tends to be about the same as or less than the amount of time the whole team spends idle when the build gets broken.
3) When a new bug gets discovered, we halt the queue completely, not allowing anyone to check in until the bug has been found and fixed.
4) We don't allow low-priority fixes, new features, or cosmetic improvements. We try to only fix what truly embarrasses us, although there is some give and take between risk and importance. A texture change, or a game balance tweak (thug's kick is too effective) might get in, simply because it is deemed safe.
Note that what makes this process work is really the last step. On Chris Busse's team at Treyarch, steps one through three are a way of life, no matter what phase the product is in. My suggestion: try living with it for a while. I think that although your team will be champing at the bit to check in their work, they will have new confidence that the build is in a good state.
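The four steps can be sketched as a small queue processor. This is an illustration under assumptions, not the tooling we use: the `CheckIn` record, its `high_priority` flag, and the `passes_test` callable (standing in for several minutes of hands-on testing) are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List


@dataclass
class CheckIn:
    developer: str
    description: str
    high_priority: bool  # step 4: only fixes for what truly embarrasses us


def process_queue(
    queue: Deque[CheckIn],
    passes_test: Callable[[CheckIn], bool],
) -> List[CheckIn]:
    """Work through the check-in queue one entry at a time (steps 1 and 2).

    Low-priority entries are turned away (step 4). If testing a check-in
    turns up a new bug, processing halts and that entry stays at the front
    of the queue until it is fixed (step 3). Returns the accepted check-ins.
    """
    accepted: List[CheckIn] = []
    while queue:
        item = queue[0]
        if not item.high_priority:
            queue.popleft()  # turned away; it can wait until after the push
            continue
        if not passes_test(item):
            break  # halt: nobody else checks in until this is fixed
        accepted.append(queue.popleft())
    return accepted
```

Calling `process_queue` again after the offending entry has been fixed resumes from where the queue halted.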
So why wait for E3 to do a zero-defect push? On our last project we did pushes a few times, when we needed to show the game to upper management. We never actually did hit zero defects with these pushes, but the efforts made the game that much more stable each time.
Should We Even Have An Alpha Phase?
Some have suggested we completely discard alpha phase, on the grounds that alpha encourages sloppiness earlier in the project. Why can't we slowly and steadily implement high-quality features all the way up to ship? I think that's a bad call, for a number of reasons:
So, given that we still have an alpha phase, how do we manage it? That brings me to the following rule: After feature freeze, use your bug-find and bug-fix rates to estimate your ship date.
Once in alpha, you may want to have some idea if you're going to be at "zero bugs" on time. (As I hinted before, "zero bugs" is a nebulous concept, as your criteria for what you consider a stop-shipment bug are going to get gradually more stringent as you get closer and closer to your ship date. It's almost as if, to quote Jim McCarthy's Software For Your Head, bug count is a constant.)
You might think you could take the entire bug list, ask everyone how long it's going to take to fix their bugs, and be done. You shouldn't do this, because:
Rather than attempting to schedule bug-fixing, Greg John uses the following process on our projects. Each day you count how many bugs you have in your database, and chart the result. It should, ideally, be a curve that shoots up rapidly after the product goes into testing (and by testing I mean publisher-side testing), hits a peak, and then trails off towards an asymptote of zero.
While the curve is still shooting up, the way to estimate your ship date is to make a guess as to how many bugs you are going to have, total. You can do this by looking at your previous projects and extrapolating. If you don't have previous projects, now's a good time to start gathering this kind of data, but until you have the data, use Greg's rule of thumb: take the number of people-months that have gone into the project and multiply by ten. (In other words, every one of us introduces an uncaught bug every three days.) At your shop, your number will quite likely be different, depending on your process and your testing team. It could vary from under a thousand (LucasArts), to three thousand (Lionhead Studios), to eighteen thousand (us).
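Greg's rule of thumb is simple enough to write down directly. The sketch below is only that arithmetic; the function and parameter names are made up for illustration, and the default of ten bugs per person-month is the rule-of-thumb figure, which you should replace with your own shop's history as soon as you have it.

```python
def estimate_total_bugs(person_months: float,
                        bugs_per_person_month: float = 10.0) -> float:
    """Rule-of-thumb guess at how many bugs the project will log in total."""
    return person_months * bugs_per_person_month


def bugs_still_to_find(person_months: float,
                       bugs_found_so_far: int,
                       bugs_per_person_month: float = 10.0) -> float:
    """How far the bug database is from the rule-of-thumb total.

    Never negative: if you have already blown through the estimate,
    the estimate was wrong and this simply reports zero.
    """
    total = estimate_total_bugs(person_months, bugs_per_person_month)
    return max(0.0, total - bugs_found_so_far)
```

For a hypothetical team of 30 people eighteen months into a project, `estimate_total_bugs(30 * 18)` comes to 5400.0. If the database holds 3,200 bugs so far, `bugs_still_to_find(30 * 18, 3200)` suggests roughly 2,200 are still out there.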
Early Alpha: Bug Count Is Rising
At first, you're in what Chris Busse, producer of NHL 2K3, calls the "Bugs are like fruit on the ground" stage. In this stage, you can't play the game for more than a couple of minutes without hitting a stop-shipment bug. When the game is in this state, the testers aren't going to try to do tricky things to break the game, like force their avatars into tight crevices where they might drop out of the world, or find some way to throw the thug who has the key to the waterfall onto the other side of the waterfall. (This exact bug was revealed in our last project, after we shipped, by the guys doing the localization for Japanese. They have some good testers. They sent us a videotape. Thanks guys. Why don't you just give us paper cuts and rub lemon juice in them?) When you're in this stage, you are still at least a month from being done, and probably two months. A lot of developers start blaming the testers for doing a poor job at this point. I have been guilty of this sin. "Why aren't you guys finding the tough bugs? Why didn't you find this bug sooner?" The answer is because they were so busy writing down things like, "Game crashes when you try to punch thug," they didn't exactly have time.
During this phase you are ascending the bug-count curve: testing is finding bugs faster than you're fixing them. In this stage your resources -- the developer resources -- are the bottleneck. Some overtime should probably be mandatory during this period, as it's one of the only ways to bring the project in sooner. You may even scrounge up people from other teams at your company. And you can mark as many bugs "as designed" or "will not fix" as possible. If you're lucky enough to have the kind of guys on your team that care so much about the project that they implement their own features when nobody's looking, it's definitely time to stop that if you haven't already.
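Which side of the curve you are on comes down to comparing two rates over the same stretch of days. A minimal sketch, assuming you can pull daily found and fixed counts out of your bug database (the function name and return labels are hypothetical):

```python
from typing import Sequence


def bottleneck(found_per_day: Sequence[int],
               fixed_per_day: Sequence[int]) -> str:
    """Compare bugs found against bugs fixed over the same window of days.

    Returns "developers" while bugs are found faster than they are fixed,
    "testers" once fixing outpaces finding, and "balanced" if they match.
    """
    if not found_per_day or len(found_per_day) != len(fixed_per_day):
        raise ValueError("need the same non-empty window for both series")
    net_change = sum(found_per_day) - sum(fixed_per_day)
    if net_change > 0:
        return "developers"
    if net_change < 0:
        return "testers"
    return "balanced"
```

A window of a week or so smooths out the day-to-day noise; a single day's numbers can easily point the wrong way.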
Late Alpha: Bug Count Is Falling
Once you're over the hump, and fixing bugs faster than you find them, it means two things. First, you need more testing, as now testers are the bottleneck. This is the time (okay, one of the many times) you yell and scream at your publisher, because you're doing all you can to bring the project in on time, and they are the ones holding you back. (Evil publishers may even have a completion-on-time bonus they don't want to give you if they don't have to, and will therefore give you just the right amount of testing to ensure that you complete just a week or two late. Or they can use the "this lame bug must be fixed" trick.)
Second, you can get an idea of when you're going to hit zero bugs by looking at the trajectory of the graph. You can see how closely this number relates to your previous estimate of how many bugs there were going to be. It's also a good idea to devote your own people in-house to testing, although it may take some work to train your idle artists and coders how to be good testers.
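One rough way to read the trajectory is to fit a straight line through the most recent daily open-bug counts and see where it crosses zero. This is a simplification, since the real curve flattens out as it approaches zero, so treat the result as optimistic. The function name is hypothetical and the fit is an ordinary least-squares line.

```python
from typing import Optional, Sequence


def days_until_zero(open_counts: Sequence[float]) -> Optional[float]:
    """Extrapolate daily open-bug counts to zero with a least-squares line.

    open_counts holds one sample per day, oldest first. Returns the number
    of days after the last sample at which the fitted line reaches zero,
    or None if the count is not falling.
    """
    n = len(open_counts)
    if n < 2:
        raise ValueError("need at least two daily samples")
    mean_x = (n - 1) / 2
    mean_y = sum(open_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(open_counts))
    slope = sxy / sxx
    if slope >= 0:
        return None  # still climbing or flat: no zero crossing ahead
    intercept = mean_y - slope * mean_x
    zero_day = -intercept / slope
    return max(0.0, zero_day - (n - 1))
```

With five days of counts falling steadily from 100 to 60, `days_until_zero([100, 90, 80, 70, 60])` puts zero six days out. Feed it only samples from after the peak; including the climb will drag the slope toward zero and push the estimate out.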
This is the "finding the hard bugs" stage. You are officially within striking distance of being done. You can start sending presubmissions to the console manufacturers. (And fix the slew of bugs that they report.) You are a few weeks from being done.
Some Call It Beta: Zero Bugs
Finally, you hit zero stop-shipment bugs. You're not done; you've only hit zero for the first time. This is the point where some publishers change their criteria as to what they consider a "stop-shipment bug". From here on, the publisher becomes much more stringent about what qualifies, letting cosmetic and gameplay bugs slide, because fixing a cosmetic bug always runs the risk of introducing a stop-shipment bug.
This brings up a story I like to tell about the first game I ever worked on, Magic Candle 2. Right before we shipped it, we discovered that the player could walk on a kind of foothill terrain they weren't supposed to be able to walk on. When fixing the bug, we introduced a real bug: the player could walk on water but not sail on it, and the game shipped that way.
Greg John calls this the "firemen" stage. Each morning you come to work and the testing team has found half a dozen new bugs overnight. Most of these you will not fix ("WNF"), the rest you get fixed by mid-afternoon, and then you sit around and browse web sites and pray. Maybe you're getting two sets of bug reports a day. It's a good idea to give people time off, with the understanding that they are on call in case a bug crops up that only they can deal with. At this point, you are almost done, and as soon as you've gone some number of days without a bug report, you fire off submissions to console manufacturers. The number of days is up to you. It could be one -- which amounts to having the console manufacturer do your last round of testing for you -- or it could be as long as the console manufacturer is going to spend testing your project, in which case you can be pretty sure (but never one hundred percent positive) they won't find any bugs you haven't dealt with.
If push comes to shove, you can actually ship to the console manufacturer at any time during the beta stage. We did this on our last project so we could meet our commitment to retail, and we got lucky. I don't recommend it.
The Cold Hard Reality
I'm making it sound like once you're armed with these tools you never have to feel stress during those last few months again. Unfortunately, you can always be surprised. On our last project, we blew through our rule-of-thumb estimates for how many bugs there were going to be. After we thought we had gotten over the first hump, we asked for more testing, and we got it. The bug find rates climbed right back up again. Our open-bug graph ended up looking like the stock market. Times like these make you feel like you're a first-year game developer.
When Shouldn't You Fix The Bugs First?
I like to get on my high horse and spout general principles and rules of thumb and then frequently find myself violating those rules in the heat of battle. The "fix your bugs first" rule has its exceptions.
Be careful with these exceptions. They can quickly become excuses which allow you to let the product get in a sorry state. When you can, refactor legacy systems instead of replacing them. Cancel those risky features and fix bugs instead. And, sometimes, let people be idle; it's better to have a few idle employees than to compromise the project.
Where Do We Go From Here?
I'm feeling a little shame as I write this article because my brother has just been playing the last game I worked on and has already found three bugs that I would have considered stop-shipment. So you might be tempted to disregard this article. But imagine how buggy our product would have been if we hadn't taken these measures.
Still, obviously, these measures are not enough. Use them to fill in gaps in your own testing regimen; use them as a jumping off point. But they are not a cure-all.
The introduction of Sun Tzu's Art of War goes like this: "According to an old story, a lord of ancient China once asked his physician, a member of a family of healers, which of them was the most skilled in the art. The physician, whose reputation was such that his name became synonymous with medical science in China, replied, 'My eldest brother sees the spirit of sickness and removes it before it takes shape, so his name does not get out of the house. My elder brother cures sickness when it is still extremely minute, so his name does not get out of the neighborhood. As for me, I puncture veins, prescribe potions, and massage skin, so from time to time my name gets out and is heard among the lords.'"
These days, what people really care about is the healthy patient, not whose name is heard where, so for us, there's something even better than fixing bugs before moving on to new features: create your game in such a way that bugs are not introduced in the first place. That's where my advice comes to an end, because while I am good at dealing with the problems I create, I do not yet seem to have the knack or knowledge to prevent these problems in the first place.
Still, even if you are the sort of game developer who removes the spirit of sickness before it takes shape, the techniques I have listed here can still be a valid part of your toolkit: a safety net for when the other measures fail.
Debugging The Development Process and Writing Solid Code by Steve Maguire - discuss the importance of getting bugs fixed first.
Rapid Development by Steve McConnell - discusses the daily build and smoke test.
Peopleware by DeMarco and Lister - discusses scheduling and flow.
The Goal by Eliyahu Goldratt and Slack by Tom DeMarco - discuss why an idle employee is not the sin that upper management may think it is.
Critical Chain by Eliyahu Goldratt and Waltzing With Bears by DeMarco and Lister - discuss the safety buffer, the project management equivalent of what we call alpha.
Game Development and Production by Erik Bethke - discusses a unique way to estimate project status during alpha that goes beyond what I've described here.
Copyright © 2003 CMP Media Inc. All rights reserved.