Age of Empires II: The Age of Kings (AoK), a tile-based, 2D isometric, real-time strategy game, was built on the code base used in the original Age of Empires (AoE) and extended in its Rise of Rome expansion pack. In these games, players guide one of many civilizations from the humble beginning of a few villagers to an empire of tens or hundreds of military and non-military units, while competing against other human or computer-controlled opponents in single or multiplayer modes.
This is the first of a two-part article that describes the tips, tricks, tools, and pitfalls that went into raising the performance profile of Age of Empires II: The Age of Kings. All of the techniques and tools used to measure and improve AoK are fully capable of improving the performance of other games.
Beginning the Diagnosis
In some ways, the AoK development team was fortunate because we had the benefit of an existing code base to work with. Many performance improvements went into AoE, including extensive optimization of its graphics drawing core, and this work gave us a good starting point for AoK.
Still, a significant amount of new functionality was added over the course of the sequel's two-year development cycle. This new functionality, as well as new requirements placed on existing functionality, meant that there was a large amount of new work to do in order to meet the minimum system requirements for shipping AoK. As such, a dedicated performance improvement phase began in April 1999 to ready AoK for its September 1999 release. The purpose of this phase was to identify and resolve the game's remaining outstanding performance issues, and to determine whether AoK would perform well on the intended minimum system configuration.
AoK's graphics pipeline added new features to AoE's original system, implemented with a combination of C/C++ and hand-coded Assembly.
Our team had some ideas as to which parts of the code were taking a long time to execute, and we used Intel's VTune, NuMega's TrueTime, and our own profiling code to verify these hunches and see exactly where time was being spent during program execution. Often these performance results alone were enough to determine solutions, but sometimes it wasn't clear why the AoK code was underperforming, and in these cases we analyzed the data and data flow to determine the nature of the problem.
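To give a feel for what lightweight in-house profiling code can look like, here is a minimal sketch of a section timer that accumulates time per named block. It is an illustration only, not the profiling code used in AoK; the names `SectionProfiler`, `section`, and `report` are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SectionProfiler:
    """Hypothetical helper: accumulates wall-clock time per named code section."""

    def __init__(self, clock=time.perf_counter):
        self.clock = clock                  # injectable so tests can use a fake clock
        self.totals = defaultdict(float)    # section name -> total seconds
        self.calls = defaultdict(int)       # section name -> number of entries

    @contextmanager
    def section(self, name):
        start = self.clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the timed code raises.
            self.totals[name] += self.clock() - start
            self.calls[name] += 1

    def report(self):
        """Return (name, total_seconds, calls) rows, most expensive first."""
        rows = [(name, self.totals[name], self.calls[name]) for name in self.totals]
        return sorted(rows, key=lambda row: row[1], reverse=True)
```

Wrapping a suspect routine in `with profiler.section("pathing"):` and printing `profiler.report()` at the end of a run is usually enough to confirm or reject a hunch before reaching for a heavier tool.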
Once a performance problem is identified, several options are available to fix it. The most straightforward and widely recognized solution is direct code optimization: tuning the existing C code, translating it to hand-coded x86 Assembly, rearranging data layouts, and/or implementing an alternative algorithm.
Sometimes we found that an algorithm, though optimal for the situation, was executing too often. In one case, unit pathing had been highly optimized, but it was being called too often by other subsystems. In these cases, we fixed the problem by capping the number of times the code could be called by other systems or by capping the amount of time the code could spend executing. Alternatively, we might change the algorithm so its processing could occur over multiple game updates instead of all at once.
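The capping idea can be sketched as a small request queue that limits both the number of calls and the time spent per game update, deferring the rest to later updates. This is an assumed design for illustration, not AoK's implementation; `BudgetedPathQueue`, `find_path`, and the default budget values are all hypothetical.

```python
import time
from collections import deque


class BudgetedPathQueue:
    """Hypothetical scheduler: defers path requests that exceed a per-update budget."""

    def __init__(self, find_path, max_calls_per_update=4,
                 max_seconds_per_update=0.002, clock=time.perf_counter):
        self.find_path = find_path            # the (already optimized) pathing routine
        self.max_calls = max_calls_per_update
        self.max_seconds = max_seconds_per_update
        self.clock = clock
        self.pending = deque()                # requests waiting for a later update

    def request(self, unit_id, start, goal):
        self.pending.append((unit_id, start, goal))

    def update(self):
        """Run one game update's worth of pathing and return {unit_id: path}."""
        results = {}
        started = self.clock()
        calls = 0
        while self.pending and calls < self.max_calls:
            # Always serve at least one request so the queue keeps draining.
            if calls > 0 and self.clock() - started >= self.max_seconds:
                break  # time budget spent; the rest waits for the next update
            unit_id, start, goal = self.pending.popleft()
            results[unit_id] = self.find_path(start, goal)
            calls += 1
        return results
```

Other subsystems call `request()` as often as they like, while the game loop calls `update()` once per simulation step, so a burst of requests is spread over several updates rather than stalling one.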
We also found that some functionality, no matter how much we optimized it, still executed too slowly. For example, supporting eight players in a game required too much processor time on the minimum system, so we specified that the minimum system could support only four players. We presented scalability features such as this as facets of game play or as options that players could adjust to their liking. These scalable features ultimately allowed AoK to run well on its stated minimum system, while providing incentives or rewards to users with better computers.
And then there were AoK's approximately 30 single-player scenarios. We evaluated the performance of these scenarios slightly differently from other game functionality. Instead of trying to optimize offending code, we first examined the scenario for performance problems that had been inadvertently introduced by the scenario designers in their construction of the scenario and its elements. In many cases, performance improved significantly with slight changes to the scenario, for example, reducing the number of player units, shrinking the game map, or making sections of the maps inaccessible to players. Above all, we made sure that we did not change the designer's vision of the scenario as we optimized it.
Shopping for Old Hardware
One of the goals of AoK was to keep the system requirements as low as possible. This was necessary in order to reach the broadest audience possible and to stay on the same incremental processor performance ramp set by the original Age of Empires and its Rise of Rome expansion pack. Our overriding concern was to meet these minimum system requirements, yet still provide an enjoyable game experience.
The original Age of Empires was released in September 1997, and required a 90MHz Pentium processor with 16MB RAM and a 2D graphics card capable of handling 8-bit palettized color. The Rise of Rome expansion pack shipped a year later and raised the minimum system processor to a 120MHz Pentium. Based on this information, the AoK minimum processor was pegged as a 133MHz Pentium with 32MB of physical RAM (Figure 1). The additional RAM was required due mainly to the increased size and number of graphics and sound files used by AoK. There was also a greater amount of game data and an executable that grew from approximately 1.5MB for AoE to approximately 2.4MB for AoK.
To make sure AoK worked on the minimum system, we had to shop for old hardware. We purchased systems matching the minimum system specification from a local system reseller - we no longer used systems that slow. When the "new" computers arrived, we decided not to wipe the hard drives, nor did we reinstall software and hardware with the latest driver versions. We did this because we expected that most AoK users wouldn't optimize their computer's configuration or settings, either. Optimizing these systems would have undoubtedly improved our performance numbers, but it would not have translated into true performance gains on other minimally-configured computers. On the other hand, for normal in-house play-testing, we used computers that were significantly more powerful than the minimum system configuration, which made up for performance issues caused by unoptimized code and enabled logging functions during play-testing (Figure 1).
Figure 1. AoK minimum PC and play-test PC system configurations.
|Minimum System Spec Test PC||Ensemble Play-Test PC|
|133 MHz Pentium Processor||450 MHz Pentium II processor|
|S3 Virge 2MB graphics card||Nvidia TNT 8MB graphics card|
|32 MB RAM||128 MB RAM|
|Windows 98||Windows 98|
set by the original Age of Empires was the use of options and settings playable on the minimum system (Figure 2). A list of the specific options supported by the minimum system was needed due to the large number of them available in AoK (Figure 3). These were also the default options for the single-player and multiplayer games, and were used to guide the creation of approximately 30 single-player scenarios.
Figure 2. AoK minimum system game play specifications.
|4 players; any combination of human and computer players|
|4-player map size|
|75 unit population cap|
|Low-detail terrain graphics quality|
as part of scalability effort
Figure 3. Game play and feature scalability.
|Number of Players||2 to 8, in any combination of human or computer|
|Size of Map||2 to 8 player sizes and "giant" size|
|Type of Map||All land (Arabia)|
|||Mostly water (Islands)|
|||Nine others in between (Coastal, Baltic, and so on)|
|Unit Population Cap||25 to 200 units per player|
|Civilization Sets||Western European, Eastern European, Middle Eastern, Asian|
|Three Terrain Detail Modes||High detail -- multi-pass, anisotropic filtering, RGB color calculation|
|||Medium detail -- multi-pass, fast, lower-quality filtering, RGB color calculation|
|||Low detail -- single pass, 8-bit color lookup|
One of the first tasks of this dedicated performance phase was to determine the largest performance problems, the improvements that we could hope to make, and the likelihood that AoK would meet the minimum system specification in terms of processor and physical memory. This initial profiling process led us to increase the minimum required processor speed from 133 to 166MHz. We also felt that meeting the 32MB memory size could be difficult, but we were fairly certain that the memory footprint could be reduced enough to meet that goal.
Grist for Profiling
No matter how good or bad a program looks when viewed through the lens of profiling statistics, the only true test of satisfactory performance is how players feel about their game experience. To help correlate player responses with game performance in AoK, we used several on-screen counters that displayed the average and peak performance. Of these counters, the ones that calculated the average frame rate and lowest frame rate over the last several hundred frames were used most to determine performance problems. Additional statistics included average and peak game simulation time (in milliseconds) over the last several hundred game updates.
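A counter of this kind can be sketched with a fixed-size window of recent frame times. The class below is a hypothetical illustration (the name `FrameRateCounter` and the 300-frame default are assumptions, not values taken from AoK); it reports the average frame rate and the lowest frame rate over the window.

```python
from collections import deque


class FrameRateCounter:
    """Hypothetical on-screen counter: tracks frame rate over the last N frames."""

    def __init__(self, window=300):
        # deque with maxlen drops the oldest frame time automatically.
        self.frame_times = deque(maxlen=window)

    def record(self, frame_seconds):
        self.frame_times.append(frame_seconds)

    def average_fps(self):
        total = sum(self.frame_times)
        return len(self.frame_times) / total if total > 0 else 0.0

    def lowest_fps(self):
        # The slowest frame in the window determines the lowest frame rate.
        worst = max(self.frame_times, default=0.0)
        return 1.0 / worst if worst > 0 else 0.0
```

The same structure works for simulation time: record milliseconds per game update instead of seconds per frame, and report the mean and the maximum of the window.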
Identifying symptoms of performance problems during play-testing and making saved games of these problem situations was very useful. We replayed saved games in the profiler, and routines that took too long could be identified quickly. Unfortunately, some problems were difficult to track down, such as those caused by memory leaks or by other programs running on the play-tester's computer.
We also created scenarios that stressed specific situations. For instance, we stressed the terrain engine's hill-drawing by using a special scenario consisting of a large game map covered mostly with hills. Other special scenarios were created that included many buildings, walls, or attempts to path units long distances between difficult obstacles. These scenarios were easy to build, and it was obvious the first time the scenario was run whether a given issue needed to be targeted for optimization.
The final set of data came in the form of recorded AoK games. AoK has a feature that allows human or computer player commands to be recorded to a file. This data can then be played back later as if the original player were issuing the commands. These recorded games helped diagnose pathfinding problems when it was unclear how a unit had arrived at a particular destination.
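The record-and-playback idea can be sketched as follows, assuming a deterministic simulation that advances in numbered ticks. The file format (one JSON object per line) and the function names `record_commands` and `replay_commands` are assumptions made for this example and do not describe AoK's actual recorded-game format.

```python
import io
import json


def record_commands(commands, stream):
    """Write (tick, player, action) tuples to a stream, one JSON object per line."""
    for tick, player, action in commands:
        entry = {"tick": tick, "player": player, "action": action}
        stream.write(json.dumps(entry) + "\n")


def replay_commands(stream, apply_command, last_tick):
    """Feed recorded commands back to the simulation at the tick they were issued."""
    pending = [json.loads(line) for line in stream if line.strip()]
    pending.sort(key=lambda entry: entry["tick"])  # stable: same-tick order is kept
    index = 0
    for tick in range(last_tick + 1):
        # Issue every command due at or before this tick, in recorded order.
        while index < len(pending) and pending[index]["tick"] <= tick:
            entry = pending[index]
            apply_command(tick, entry["player"], entry["action"])
            index += 1


# Usage: record three commands in memory, then play them back.
log = io.StringIO()
record_commands([(0, 1, "train villager"), (3, 2, "move 10 12")], log)
log.seek(0)
replay_commands(log, lambda tick, player, action: None, last_tick=5)
```

Because only the commands are stored, a replay reproduces the original game only if the simulation itself is deterministic for a given command sequence.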
Since AoK was able to load scenarios, saved games, and recorded games from the command line, the game could be run automatically by a profiler. This simplified the profiling process by allowing the profiler to run AoK and have it jump directly into the problem. This command-line process bypassed the startup and pregame option screens. (Some profilers slowed the game down so much that manually loading a saved game from the profiler would have been impossible.) And since performance profiling and logging significantly slowed game play, analyzing recorded games was a much better solution from the tester's perspective. Multiplayer games could be recorded and then played back command-for-command under the profiler overnight to investigate performance issues.