Age
of Empires II: The Age of Kings (AoK), a tile-based, 2D isometric,
real-time strategy game, was built on the code base used in the original
Age of Empires (AoE) and extended in its Rise of Rome
expansion pack. In these games, players guide one of many civilizations
from the humble beginning of a few villagers to an empire of tens or
hundreds of military and non-military units, while competing against
other human or computer-controlled opponents in single-player or multiplayer
modes.
This is
the first of a two-part article that describes the tips, tricks, tools,
and pitfalls that went into raising the performance profile of Age
of Empires II: The Age of Kings. The techniques and tools
used to measure and improve AoK can also be applied to improve the performance
of other games.
Beginning the Diagnosis
In some
ways, the AoK development team was fortunate because we had the benefit
of an existing code base to work with. Many performance improvements
went into AoE, including extensive optimization of its graphics drawing
core, and this work gave us a good starting point for AoK.
Still,
a significant amount of new functionality was added over the course
of the sequel's two-year development cycle. This new functionality,
as well as new requirements placed on existing functionality, meant
that there was a large amount of new work to do in order to meet the
minimum system requirements for shipping AoK. As such, a dedicated performance
improvement phase began in April 1999 to ready AoK for its September
1999 release. The purpose of this phase was to identify and resolve
the game's remaining outstanding performance issues, and to determine
whether AoK would perform well on the intended minimum system configuration.
Figure caption: AoK's graphics pipeline added new features to AoE's original
system, implemented with a combination of C/C++ and hand-coded Assembly.
Our team
had some ideas as to which parts of the code were taking a long time
to execute, and we used Intel's VTune, NuMega's TrueTime, and our own
profiling code to verify these hunches and see exactly where time was
being spent during program execution. Often these performance results
alone were enough to determine solutions, but sometimes it wasn't clear
why the AoK code was underperforming, and in these cases we analyzed
the data and data flow to determine the nature of the problem.
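As an illustration of what home-grown profiling code of this kind can look
like, here is a minimal C++ sketch of a scoped timer that accumulates
wall-clock time per named section. It is not AoK's actual profiling code;
the names SectionProfiler and ScopedTimer are hypothetical, and
std::chrono stands in for whatever timing source the team used.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical in-game profiler: accumulates call counts and elapsed time
// per named code section.
class SectionProfiler {
public:
    struct Entry {
        long long calls = 0;
        std::chrono::nanoseconds total{0};
    };

    void add(const std::string& name, std::chrono::nanoseconds elapsed) {
        assert(elapsed.count() >= 0);
        Entry& e = entries_[name];
        e.calls += 1;
        e.total += elapsed;
    }

    // Returns a zeroed entry for sections that were never timed.
    Entry get(const std::string& name) const {
        auto it = entries_.find(name);
        if (it == entries_.end()) {
            return Entry();
        }
        return it->second;
    }

private:
    std::map<std::string, Entry> entries_;
};

// RAII timer: measures from construction to destruction and reports the
// elapsed time to the profiler when it goes out of scope.
class ScopedTimer {
public:
    ScopedTimer(SectionProfiler& profiler, std::string name)
        : profiler_(profiler),
          name_(std::move(name)),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto elapsed = std::chrono::steady_clock::now() - start_;
        profiler_.add(
            name_,
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed));
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    SectionProfiler& profiler_;
    std::string name_;
    std::chrono::steady_clock::time_point start_;
};
```

Placing a ScopedTimer at the top of a suspect routine is enough to see how
often it runs and how much time it consumes in total, which is the kind of
hunch-checking described above.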
Once a
performance problem is identified, several options are available to
fix it. The most straightforward and widely recognized solution is direct
code optimization: tuning the existing C code, translating it to hand-coded
x86 Assembly, rearranging data layouts, or implementing an alternative
algorithm.
Sometimes
we found that an algorithm, though optimal for the situation, was executing
too often. In one case, unit pathing had been highly optimized, but
it was being called too often by other subsystems. In these cases, we
fixed the problem by capping the number of times the code could be called
by other systems or by capping the amount of time the code could spend
executing. Alternatively, we might change the algorithm so its processing
could occur over multiple game updates instead of all at once.
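The capping idea can be sketched in C++ as a work queue that runs only a
bounded number of expensive requests, within a bounded amount of time, on
each game update and defers the rest. This is an assumption-laden
illustration rather than AoK's implementation: BudgetedWorkQueue and its
parameters are hypothetical names, and the jobs stand in for operations
such as path searches.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>

// Hypothetical scheduler: queues expensive requests and runs only a bounded
// amount of them per game update.
class BudgetedWorkQueue {
public:
    using Job = std::function<void()>;

    BudgetedWorkQueue(std::size_t maxJobsPerUpdate,
                      std::chrono::microseconds timeBudget)
        : maxJobs_(maxJobsPerUpdate), timeBudget_(timeBudget) {
        assert(maxJobsPerUpdate > 0);
    }

    void submit(Job job) { pending_.push_back(std::move(job)); }

    // Runs queued jobs until the call cap or the time budget is reached.
    // Jobs that do not fit stay queued for the next update.
    // Returns the number of jobs run during this update.
    std::size_t update() {
        const auto start = std::chrono::steady_clock::now();
        std::size_t ran = 0;
        while (!pending_.empty() && ran < maxJobs_) {
            // Always run at least one job so the queue cannot starve.
            if (ran > 0 &&
                std::chrono::steady_clock::now() - start >= timeBudget_) {
                break;
            }
            Job job = std::move(pending_.front());
            pending_.pop_front();
            job();
            ++ran;
        }
        return ran;
    }

    std::size_t pending() const { return pending_.size(); }

private:
    std::size_t maxJobs_;
    std::chrono::microseconds timeBudget_;
    std::deque<Job> pending_;
};
```

With a cap of two jobs per update, five submitted requests complete over
three updates instead of all at once, trading a little latency for a
smoother frame time.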
We also
found that some functionality, no matter how much we optimized it, still
executed too slowly. For example, supporting eight players in a game
required too much processor time on the minimum system, so we specified
that the minimum system could support only four players. We presented
scalability features such as this as facets of game play or as options
that players could adjust to their liking. These scalable features ultimately
allowed AoK to run well on its stated minimum system, while still rewarding
users who had better computers.
And then
there were AoK's approximately 30 single-player scenarios. We evaluated
the performance of these scenarios slightly differently from other game
functionality. Instead of trying to optimize offending code, we first
examined the scenario for performance problems that had been inadvertently
introduced by the scenario designers in their construction of the scenario
and its elements. In many cases, performance improved significantly
with slight changes to the scenario, for example, reducing the number
of player units, shrinking the game map, or making sections of the maps
inaccessible to players. Above all, we made sure that we did not change
the designer's vision of the scenario as we optimized it.
Shopping for Old Hardware
One of
the goals of AoK was to keep the system requirements as low as possible.
This was necessary in order to reach the broadest audience possible
and to stay on the same incremental processor performance ramp set by
the original Age of Empires and its Rise of Rome expansion pack. Our
overriding concern was to meet these minimum system requirements, yet
still provide an enjoyable game experience.
The original
Age of Empires was released in September 1997, and required a 90MHz
Pentium processor with 16MB RAM and a 2D graphics card capable of handling
8-bit palettized color. The Rise of Rome expansion pack shipped a year
later and raised the minimum system processor to a 120MHz Pentium. Based
on this information, the AoK minimum processor was pegged as a 133MHz
Pentium with 32MB of physical RAM (Figure 1). The additional RAM was
required due mainly to the increased size and number of graphics and
sound files used by AoK. There was also a greater amount of game data
and an executable that grew from approximately 1.5MB for AoE to approximately
2.4MB for AoK.
To make
sure AoK worked on the minimum system, we had to shop for old hardware.
We purchased systems matching the minimum system specification from
a local system reseller, since we no longer used systems that slow. When
the "new" computers arrived, we decided not to wipe the hard
drives, nor did we reinstall software and hardware with the latest driver
versions. We did this because we expected that most AoK users wouldn't
optimize their computer's configuration or settings, either. Optimizing
these systems would have undoubtedly improved our performance numbers,
but it would not have translated into true performance gains on other
minimally-configured computers. On the other hand, for normal in-house
play-testing, we used computers that were significantly more powerful
than the minimum system configuration. The extra power offset the slowdown
caused by unoptimized code and by the logging functions that were enabled
during play-testing (Figure 1).
Figure 1. AoK minimum PC and play-test PC system configurations.
Minimum System Spec Test PC | Ensemble Play-Test PC
133 MHz Pentium processor* | 450 MHz Pentium II processor
S3 Virge 2MB graphics card | Nvidia TNT 8MB graphics card
32 MB RAM | 128 MB RAM
Windows 98 | Windows 98
*later upgraded to 166 MHz
Following
a precedent set by the original Age of Empires, we defined the game options
and settings that had to be playable on the minimum system (Figure 2).
Because AoK offers a large number of options (Figure 3), an explicit list
of those supported by the minimum system was needed. These were also the
default options for the single-player and multiplayer games, and were used
to guide the creation of the approximately 30 single-player scenarios.
Figure 2. AoK minimum system game play specifications.
4 players; any combination of human and computer players
4-player map sizes
75 unit population cap
800x600 resolution
Low-detail terrain graphics quality
*added as part of scalability effort
Figure 3. Game play and feature scalability.
Number of Players | 2 to 8, in any combination of human or computer
Size of Map | 2 to 8 player sizes and "giant" size
Type of Map | All land (Arabia)
 | Mostly water (islands)
 | Nine others in between (Coastal, Baltic, and so on)
Unit Population Cap | 25 to 200 units per player
Civilization Sets | Western European, Eastern European, Middle Eastern, Asian
Resolution | 800x600
 | 1024x768
 | 1280x1024
Three Terrain Detail Modes | High detail -- multi-pass, anisotropic filtering, RGB color calculation
 | Medium detail -- multi-pass, fast, lower-quality filtering, RGB color calculation
 | Low detail -- single pass, 8-bit color lookup
One of
the first tasks of this dedicated performance phase was to determine
the largest performance problems, the improvements that we could hope
to make, and the likelihood that AoK would meet the minimum system specification
in terms of processor and physical memory. This initial profiling process
led us to increase the minimum required processor speed from 133 to
166MHz. We also felt that meeting the 32MB memory size could be difficult,
but we were fairly certain that the memory footprint could be reduced
enough to meet that goal.
Grist for Profiling
No matter
how good or bad a program looks when viewed through the lens of profiling
statistics, the only true test of satisfactory performance is how players
feel about their game experience. To help correlate player responses
with game performance in AoK, we used several on-screen counters that
displayed the average and peak performance. Of these counters, the ones
that calculated the average frame rate and lowest frame rate over the
last several hundred frames were used most to determine performance
problems. Additional statistics included average and peak game simulation
time (in milliseconds) over the last several hundred game updates.
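A counter of this kind can be sketched in C++ as a rolling window over
recent frame times. The class name RollingFrameStats and the window size
are hypothetical, chosen for illustration; the article does not describe
how AoK's counters were implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical on-screen counter: tracks frame times (in milliseconds) over
// the last `window` frames and reports average and lowest frame rate.
class RollingFrameStats {
public:
    explicit RollingFrameStats(std::size_t window) : window_(window) {
        assert(window > 0);
    }

    void addFrame(double frameMs) {
        assert(frameMs > 0.0);
        samples_.push_back(frameMs);
        sumMs_ += frameMs;
        if (samples_.size() > window_) {
            // Drop the oldest frame so only the last `window_` remain.
            sumMs_ -= samples_.front();
            samples_.pop_front();
        }
    }

    // Average frame rate = frames in window / total seconds in window.
    double averageFps() const {
        if (samples_.empty()) return 0.0;
        return 1000.0 * static_cast<double>(samples_.size()) / sumMs_;
    }

    // The lowest frame rate corresponds to the longest frame in the window.
    double lowestFps() const {
        if (samples_.empty()) return 0.0;
        return 1000.0 / *std::max_element(samples_.begin(), samples_.end());
    }

private:
    std::size_t window_;
    std::deque<double> samples_;
    double sumMs_ = 0.0;
};
```

The same structure works for simulation time: feed it the duration of each
game update instead of each frame, and report the average and the maximum
sample directly in milliseconds.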
Identifying
symptoms of play-testing performance problems and making saved games
of these problem situations was very useful. We replayed saved games
in the profiler, and routines that took too long could be identified
quickly. Unfortunately, some causes were difficult to track down this way,
such as memory leaks or other programs running on the play-tester's
computer.
We also
created scenarios that stressed specific situations. For instance, we
stressed the terrain engine's hill-drawing by using a special scenario
consisting of a large game map covered mostly with hills. Other special
scenarios were created that included many buildings, walls, or attempts
to path units long distances between difficult obstacles. These scenarios
were easy to build, and it was obvious the first time the scenario was
run whether a given issue needed to be targeted for optimization.
The final
set of data came in the form of recorded AoK games. AoK has a feature
that allows human or computer player commands to be recorded to a file.
This data can then be played back later as if the original player were
issuing the commands. These recorded games helped diagnose pathfinding
problems when it was unclear how a unit had arrived at a particular
destination.
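The recording idea can be sketched in C++ as a log of commands stamped with
the game update on which they were issued. This is only an illustration
under stated assumptions: the Command fields, the CommandLog class, and
the text format are hypothetical, and playback reproduces the original game
only if the simulation itself is deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical player command, stamped with the game update (tick) on which
// it was issued.
struct Command {
    std::uint32_t tick;
    int player;
    int type;
    int target;
};

class CommandLog {
public:
    // Commands must be recorded in non-decreasing tick order.
    void record(const Command& c) {
        assert(commands_.empty() || c.tick >= commands_.back().tick);
        commands_.push_back(c);
    }

    // Serializes the log as one whitespace-separated command per line.
    std::string toText() const {
        std::ostringstream out;
        for (const Command& c : commands_) {
            out << c.tick << ' ' << c.player << ' ' << c.type << ' '
                << c.target << '\n';
        }
        return out.str();
    }

    static CommandLog fromText(const std::string& text) {
        CommandLog log;
        std::istringstream in(text);
        Command c{};
        while (in >> c.tick >> c.player >> c.type >> c.target) {
            log.commands_.push_back(c);
        }
        return log;
    }

    // During playback, returns every not-yet-replayed command whose tick is
    // at or before `tick`, in the order it was recorded.
    std::vector<Command> takeForTick(std::uint32_t tick) {
        std::vector<Command> due;
        while (cursor_ < commands_.size() && commands_[cursor_].tick <= tick) {
            due.push_back(commands_[cursor_++]);
        }
        return due;
    }

private:
    std::vector<Command> commands_;
    std::size_t cursor_ = 0;
};
```

On each update during playback, the game would call takeForTick with the
current update number and feed the returned commands into the simulation as
if the original player had just issued them.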
Since
AoK was able to load scenarios, saved games, and recorded games from
the command line, the game could be run automatically by a profiler.
This simplified the profiling process by allowing the profiler to run
AoK and have it jump directly into the problem. This command-line process
bypassed the startup and pregame option screens. (Some profilers slowed
the game down so much that manually loading a saved game from the profiler
would have been impossible.) And since performance profiling and logging
significantly slowed game play, analyzing recorded games was a much
better solution from the tester's perspective. Multiplayer games could
be recorded and then played back command-for-command under the profiler
overnight to investigate performance issues.