There's an old joke that goes something like this:
Patient: "Doctor, it hurts when I do this."
Doctor: "Then stop doing it."
Funny, but are these also wise words when applied to the right situation? Consider the load of pain I found myself in when working on a conversion of a 3D third person shooter from the PC to the original PlayStation.
Now, the PS1 has no support for floating point numbers, so we were doing the conversion by basically recompiling the PC code and overloading all floats with fixed point. That actually worked fairly well, but where it fell apart was in collision detection.
The level geometry that was supplied to us worked reasonably well in the PC version of the game, but when converted to fixed point, all kinds of seams, T-junctions and other problems were nudged into existence by the microscopic differences between the fixed point and floating point values. The problem manifested itself as the main character (called "Damp") simply falling through those tiny holes, into the abyss below the level.
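To make the "microscopic differences" concrete, here is a minimal sketch of an overloaded fixed point type. The 16.16 format, the `Fixed` name, and its helpers are assumptions for illustration only; the article does not say which format or class the game actually used.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 16.16 fixed point type (16 integer bits, 16 fractional bits).
// The real game's format is not stated in the article; this is an assumption.
struct Fixed {
    int32_t raw;  // stored value is (real value * 65536)

    // Convert from float, truncating whatever does not fit in 16 fractional bits.
    static Fixed fromFloat(float f) {
        return Fixed{static_cast<int32_t>(f * 65536.0f)};
    }

    float toFloat() const {
        return static_cast<float>(raw) / 65536.0f;
    }
};

// Overloaded operators let existing float-style expressions compile unchanged.
inline Fixed operator+(Fixed a, Fixed b) { return Fixed{a.raw + b.raw}; }
inline Fixed operator-(Fixed a, Fixed b) { return Fixed{a.raw - b.raw}; }

inline Fixed operator*(Fixed a, Fixed b) {
    // Widen to 64 bits so the intermediate product does not overflow,
    // then shift back down to 16 fractional bits.
    const int64_t wide = static_cast<int64_t>(a.raw) * b.raw;
    return Fixed{static_cast<int32_t>(wide >> 16)};
}
```

With these definitions, `Fixed::fromFloat(0.1f) * Fixed::fromFloat(10.0f)` has a raw value of 65530, six units short of the 65536 that represents 1.0. Two polygon edges that meet exactly in floating point can end up that far apart after conversion, which is the kind of seam a character can slip through.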
We patched the holes we found, tweaking the geometry until Damp no longer fell through. But then the game went into test at the publisher, and suddenly a flood of "falling off the world" bugs were delivered to us. Every day a fresh batch of locations were found with these little holes that Damp could slip through. We would fix that bit of geometry, then the next day they would send us ten more. This went on for several days. The publisher's test department employed one guy whose only job was to jump around the world for ten hours a day, looking for places he could fall through.
The problem here was that the geometry was bad. It was not tight and seamless geometry. It worked on the PC, but not on the PS1, where the fixed point math vastly magnified the problems. The ideal solution was to fix the geometry to make it seamless.
However, this was a vast task, impossible to do in the time available with our limited resources, so we were relying on the test department to find the problem areas for us.
The problem with that approach was that they never stopped finding them. Every day brought more pain. Every day new variants of the same bug. It seemed like it would never end.
Eventually the penny dropped. The real problem was not that the geometry had small holes in it. The problem was that Damp fell through those holes. With that in mind, I was able to code a very quick and simple fix that looked something like:
IF (Damp will fall through a hole()) THEN
Don't do it
The actual code was not really much more complex than that (see Listing 2).
Listing 2: Meet My Dog, Patches
damp_old = damp_loc;
damp_loc = damp_old;
In one swoop a thousand bugs were fixed. Now instead of falling off the level, Damp would just shudder a bit when he walked over the holes. We found what was hurting us, and we stopped doing it. The publisher laid off their "jump around" tester, and the game shipped.
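The save-and-restore idea behind that fix can be sketched as follows. This is only an illustration of the pattern, not the shipped code: `Vec3`, `floorExistsUnder`, and `moveWithRollback` are hypothetical names, and the "hole" is a made-up seam at x == 5.

```cpp
#include <cassert>

struct Vec3 {
    int x, y, z;
};

// Hypothetical collision query. A real game would test against level
// geometry; here a single seam at x == 5 stands in for a hole in the floor.
inline bool floorExistsUnder(const Vec3& p) {
    return p.x != 5;
}

// Apply a move, then undo it if it would leave the character over a hole.
// Returns true if the move was kept, false if it was rolled back.
inline bool moveWithRollback(Vec3& loc, const Vec3& delta) {
    const Vec3 old = loc;  // remember the last known-good position

    loc.x += delta.x;
    loc.y += delta.y;
    loc.z += delta.z;

    if (!floorExistsUnder(loc)) {
        loc = old;         // "don't do it": restore the saved position
        return false;
    }
    return true;
}
```

A character at x == 4 that tries to step onto the seam stays at x == 4, while a longer step that clears the seam is accepted. Being pushed back every frame while the player keeps walking into the seam is what would produce the shudder described above.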
Well, it shipped eventually. Spurred on by the success of "if A==bad then NOT A", I used this tool to patch several more bugs -- nearly all of which had to do with the collision code. Near the end of development the bugs became more and more specific, and the fixes became more and more "Don't do this precise and exact thing" (see Listing 3, actual shipped code).
Listing 3: Meet My Dog, Patches
if (damp_aliencoll != old_aliencoll &&
    StartArena == 6 && damp_loc.y < 13370)
{
    // don't let damp ever touch the door.. (move away in the x and y)
    damp_loc.y = damp_old.y;
    damp_loc.x = damp_old.x;
    damp_aliencoll = NULL; // and say thusly!!!
}
What does that code do? Well, basically there was a problem with Damp touching a particular type of door in a particular level in a particular location. Rather than fix the root cause of the problem, I simply made it so that if Damp ever touched the door, I'd move him away and pretend it never happened. Problem solved.
Looking back I find this code quite horrifying. It was patching bugs and not fixing them. Unfortunately the real fix would have been to go and rework the entire game's geometry and collision system specifically with the PS1 fixed point limitations in mind. The schedule was aggressive from the start, and we always seemed close to finishing, so the quick patch always won over the comprehensive, expensive fix.
But it did not go well. Hundreds of patches were needed, and then the patches themselves started causing problems, so more patches were added to turn off the patches in hyper-specific circumstances. The bugs kept coming, and I kept beating them back with patches. Eventually I won, but at the cost of shipping several months behind schedule, and working 14-hour days for all of those months.
That experience soured me against "the patch." Now I always try to dig right down to the root cause of a bug, even if a simple, and seemingly safe, patch is available. I want my code to be healthy. If you go to the doctor and tell him "it hurts when I do this," then you expect him to find out why it hurts, and to fix that. Your pain and your code's bugs might be symptoms of something far more serious. The moral: Treat your code like you would want a doctor to treat you.
- Mick West