This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.
[In this Intel-sponsored feature, part of the Gamasutra Visual Computing microsite, Goo! developer Tommy Refenes of PillowFort matches wits with multi-threading in four gripping acts -- and emerges victorious]
I'll admit it. I'm a speed freak. (I tried to get help, but it turns out those clinics by the train station are for something totally different than my addiction.) Simply put, I like what I write to run fast, getting a certain amount of personal joy when something I've written runs at over 100 FPS.
I've always strived to make my code as efficient as possible, but I didn't realize how little I knew until my previous job. Back then I was an Xbox 360 engine programmer at a studio working crazy hours to get a game up on XBLA. I was thrown into the world of memory management, cache optimizations, rendering optimizations, and briefly touched on the land of threading.
In May 2006 I left that company to start working on Goo!, a game in which you control big globs of liquid and use those globs to envelop enemy goos. Players have total control over their goo, which means a ton of collision calculations and even more physics calculations.
My new job, once again, was an engine programmer, and I knew that to have real-time interactive liquid rendering on the screen at a speed that wouldn't give me FPS withdrawal, I would need to thread the collision and physics to the extreme.
(A small disclaimer: Goo! is my first attempt at multi-threaded programming. I've learned everything I know about threading reading MSDN help, a few random Intel documents, and various other resources on threads. I didn't have the fancy pants books or money to buy said books. So in this article I may miss a few official terms and concepts and I may be totally wrong about some things, but like all programming, it's all about results.)
Goo! began life on a Intel Pentium 4 processor 3.0 GHz with Hyper-Threading Technology. Goos are rendered by combining thousands of small blob particles on the screen to make up a height map, which is then passed through filters to make the goos look like liquid or mercury. That's how they are rendered now, but in the beginning they were hundreds of smiley faces.
Back then, Goo! rendered and calculated around 256 blobs on the screen. Blobs' collision calculations were culled based on velocity and position, but it was all pushed through the Intel Pentium 4 processor and needless to say the results weren't great.
With the game rendering at around 20 FPS I knew it was time to thread the collision detection. I broke down and purchased an Intel Pentium D 965 Extreme Edition processor. The dual 3.73 GHz cores would be more than enough to start building a new multi-threaded collision detection system. Like a five-year old back in 1986 with a brand new Optimus Prime, I was excited and anxious to get started on the new collision engine.
So began my first attempt at threading. I adopted a philosophy inspired by Ron Popeil (of Ronco fame): "Set it and forget it." Logically data that isn't needed immediately is a perfect candidate for a threaded operation. Collision detection fell into this thinking very nicely because most of the time collision results will be the same for several frames before changing. Therefore, I reasoned, collision could run on a separate thread and be checked every frame to see if the routine had finished, copying over the results when finished, and using them in the current frame's physics calculations.
At the time, this seemed like a great idea. Not waiting for collision detections to finish every frame allowed the engine to move on and continue crunching physics and gameplay calculations with old data, while still maintaining a pretty high frame rate. At this point, Goo! was pushing through around 625 blobs rendering at about 60 FPS.
The threading model was simple: A collision thread was created when the game loaded, sleeping until it had something to work on. The main thread copied position and size data for every blob into a static array the thread could access and then continued on to physics calculations and rendering. The collision thread woke up, saw it had work to do, and proceeded with a very brute force method of figuring out who's touching who. It then saved this data into an array of indices that corresponded to every blob in the game.
Once finished, it posted a flag to the main thread. When the main thread saw that the collision thread had completed its calculations, it copied the newly calculated collision data over the old data, copied new position and size data, and once again sent it to the collision thread to calculate. At this time, each collision calculation took around 20 ms, which meant that collision data was updated just about every 1.5 frames.
With 625 blobs on the screen rendering at around 60 FPS, gameplay could now start development. This is where the first problems began to hit. The goos themselves didn't really feel like goos. When 625 blobs represent one goo it looks fine, but Goo! is a game where you battle against other goos so having more than one on the screen made the goos look a little chunky.
Chunky goos (unless you are talking about GooGoo Clusters) are no good. The solution: more blobs! I expanded the threading model by adding another collision thread and splitting up the work over the two threads. The main thread now broke the collision work up into two pieces, telling collision thread 1 to calculate the first half of the data, and collision thread 2 to calculate the second half. Once the two threads finished, they sent two flags and waited for the main thread to see that data, copy it over, and send more work. This method yielded some better results, allowing around 900 blobs to be rendered to the screen at 60 FPS.
With 900 blobs on the screen goos started looking like goos, and it was time to put the collision to rest for a while and focus on rendering. But as gameplay development progressed, various little unexplained bugs started to pop up: Memory would appear as if it were being accessed after being freed, blobs wouldn't change ownership properly, and the simulation would occasionally explode. Although these were gamebreaking bugs, they were so infrequent that I didn't bother with them until about seven months later after some core gameplay was flushed out.