1. Garbage Strike.
In retrospect, I feel I waited too long to get Weapon of Choice to run on the Xbox 360 (mid-June for a November release). I went into it simply not knowing exactly what I was getting into and overestimated how much the Xbox would enhance the game's performance.
With XNA on the PC, a generational garbage collection system is used. A generational collector tracks short-lived objects separately from long-lived ones, so it can cheaply reclaim recent allocations without scanning the whole heap.
As it turns out, the system on the Xbox 360 is non-generational, which means every collection has to walk the entire heap, so even small, frequent allocations can trigger expensive pauses that slow down the game. When I first got Weapon of Choice running on the Xbox 360, the game's smooth PC framerate devolved into a slideshow.
It took months to deal with all the garbage collection issues and get the framerate up to a playable speed. Documentation on the problem existed, but I didn't do my homework on the target platform, and it hurt badly.
Figure 10. Wrap Mouth in Object Editor in "Stage 2" pose. Initially, loading animations like this took over two minutes of wasted time per level due to garbage collection.
If I were to do it all over again, I would have the game's prototype running on the 360 and use the XNA Framework Remote Performance Monitor from the start, to ensure garbage collection was not an issue.
2. My Pipeline Doesn't Look Like Your Pipeline.
As stated before, I developed custom animation and modeling software for creating the game art. I also wrote my own file managing system, even though the XNA Content Pipeline was created to manage assets. (Maybe I have some control issues?)
This worked fine on the PC, as I was able to subvert the Content Pipeline and load everything with my own system. However, my custom asset pipeline was rendered useless when I finally converted the game to the Xbox 360, as XNA on the console only allows assets to be loaded through the Content Pipeline.
I ended up keeping my custom loading code and simply wrapped it with special code to pass it through the Content Pipeline.
As it turned out, this special code incurred many garbage collection hits. In .NET, strings are immutable, so every split, substring, or concatenation allocated a brand new string, and all of those temporary strings became garbage. I replaced it with a special character parsing function that loads the data without using strings and without any new memory allocations.
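To make the idea concrete, here is a minimal sketch of a character-level number parser that reads directly from a buffer and creates no string objects. It is written in Java as a stand-in for .NET (strings are immutable in both), and it is not the game's actual parser: the class name CharFloatParser, the method signature, and the supported syntax (optional sign, digits, optional period and fraction) are all assumptions made for illustration.

```java
// Hypothetical illustration, not the parser that shipped with the game.
final class CharFloatParser {
    private CharFloatParser() {}

    /**
     * Parses a decimal number such as "-12.5" from buf[start..end)
     * without creating any String objects. The decimal separator is
     * always '.', regardless of the machine's locale.
     */
    static float parse(char[] buf, int start, int end) {
        int i = start;
        boolean negative = false;
        if (i < end && (buf[i] == '-' || buf[i] == '+')) {
            negative = (buf[i] == '-');
            i++;
        }
        double digits = 0.0;   // all digits read so far, as one whole number
        double divisor = 1.0;  // 10^(number of fractional digits)
        // Integer part.
        while (i < end && buf[i] >= '0' && buf[i] <= '9') {
            digits = digits * 10.0 + (buf[i] - '0');
            i++;
        }
        // Optional fractional part.
        if (i < end && buf[i] == '.') {
            i++;
            while (i < end && buf[i] >= '0' && buf[i] <= '9') {
                digits = digits * 10.0 + (buf[i] - '0');
                divisor *= 10.0;
                i++;
            }
        }
        double value = digits / divisor;
        return (float) (negative ? -value : value);
    }
}
```

Because the function only reads from a caller-supplied buffer and works with primitive locals, the same buffer can be reused for every file, and the loader leaves nothing behind for the collector to clean up.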
This resulted in other issues that required a patch after launch: my parser failed to handle the number formatting conventions of some languages, which use a comma rather than a period as the decimal separator in floating point numbers.
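The decimal separator problem is easy to reproduce. The sketch below, again in Java as a stand-in for .NET, shows the same value formatted under a German locale and under a locale-neutral one. The class and method names are hypothetical; in .NET the equivalent tool for pinning the format is CultureInfo.InvariantCulture.

```java
import java.util.Locale;

// Hypothetical illustration of locale-dependent number text.
final class InvariantNumberText {
    private InvariantNumberText() {}

    // Formats with whatever conventions the given locale uses.
    static String formatForLocale(float value, Locale locale) {
        return String.format(locale, "%.2f", value);
    }

    // Formats with a fixed, locale-neutral convention ('.' separator),
    // so a period-only parser can always read the result back.
    static String formatInvariant(float value) {
        return String.format(Locale.ROOT, "%.2f", value);
    }
}
```

With Locale.GERMANY, formatForLocale(1.5f, ...) yields "1,50", which a parser that only understands periods will misread. Writing and reading numeric data with one fixed convention avoids the mismatch no matter which language the console is set to.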
3. On-the-Job Training.
Before this project, there were many things, such as rigging models and making sound effects, which I'd never tried before. At the top of that list was programming a multi-threaded game update loop.
I changed the game loop to use threading too late in development, which created hundreds of crash bugs stemming from code in which two threads would access the same static data.
I mainly operated from a "Use Threading Because It Will Make Things Faster" mindset, rather than confidently designing code to enhance the runtime performance. During profiling, I could toggle between single and multiple processor use on the Xbox.
This showed that threading provided around a 15% speed-up. This was good, but the game never reached the magical 60fps level.
Figure 11. Level Editor view of a first level boss showing only the mid-ground layers. These layers contained the fighting and generated the most processor work.
The Xbox 360 has six hardware threads. In XNA, thread indices 0 and 2 are reserved, leaving developers with four usable threads. I used thread indices 1, 3 and 4 for gameplay updates. Because I put loading on a separate thread so late in the project, I simply assigned loading to thread index 5, rather than try to dynamically switch thread functionality.
In my system, as the gameplay threads are initialized, level objects are assigned to different thread indices. For example, objects 1-1,000 go to thread index 1, objects 1,001-2,000 go to thread index 3, and the rest go to thread index 4.
The background-to-foreground, layered nature of the level design, however, created a situation in which one thread (index 4) would be given the layers with all the action and collision detection requests. I'm pretty sure clumping work onto one thread undercuts the benefits of multi-threading, since the other threads finish early and sit idle while the overloaded one catches up.
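The imbalance described above can be sketched in a few lines. The following Java example (a stand-in for the game's C# code, not taken from it) assigns objects to three gameplay slots by contiguous index range and totals the work each slot receives. The chunk size of 1,000 and the thread indices 1, 3 and 4 come from the description above; the class name, method names, and per-object costs are invented purely for illustration.

```java
// Hypothetical illustration of range-based assignment of level objects.
final class ThreadPartitionSketch {
    private ThreadPartitionSketch() {}

    // Hardware thread index used by each gameplay slot, per the text above.
    static final int[] HARDWARE_THREADS = {1, 3, 4};

    // First `chunk` objects go to slot 0, the next `chunk` to slot 1,
    // and so on; everything left over lands in the last slot.
    static int slotByRange(int objectIndex, int chunk, int slots) {
        return Math.min(objectIndex / chunk, slots - 1);
    }

    // Sums an assumed per-object update cost for each slot.
    static long[] loadPerSlot(int[] costs, int chunk, int slots) {
        long[] load = new long[slots];
        for (int i = 0; i < costs.length; i++) {
            load[slotByRange(i, chunk, slots)] += costs[i];
        }
        return load;
    }
}
```

As a made-up example, take 3,000 objects where the first 2,000 are cheap background pieces costing 1 unit each and the last 1,000 are mid-ground combatants costing 10 units each. The slots end up with 1,000, 1,000 and 10,000 units of work, so the frame takes as long as the third slot alone and the other two threads contribute very little.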
I recognize that putting collision on one thread, general updates on another, and animation on a third makes sense, but after several attempts, I never got that method to work properly. My "Gosh, I Hope This Helps Some" approach to multi-threading probably explains the meager speed increases.