First of all, I'd like to welcome all of you to the Optimizations Corner. This
will be a regular monthly column at Gamasutra featuring articles on
optimization methodologies, techniques, tool usage, and so on. This first
article explores the area of data optimizations. However, I'd like to start
with a few introductory ideas about optimizations in general.
Firstly,
I must confess that my area of expertise does not lie in game development,
per se. That is, I have never written a game. (In fact, I'm not even
that good at playing them.) However, I (and other members of my team)
have been heavily involved in application optimizations, especially
for 3D games. I also have another thing going for me in that I do work for
a company that knows a bit about the CPU and the PC platform. In addition,
I am also extremely fortunate to be surrounded by a lot of talented
people who will be contributing to this series from time to time. Because
of this, I hope to bring a valuable viewpoint to the game development
community.
Enough
said. I won't bore you with any more about the article series in general.
But I do want to add a few comments about optimizations and their role
in the gaming industry.
Don't Oppress Your Artists
Actually,
this section could have been called, "Why should you still be interested
in optimizations?" It used to be that a highly tuned program was
required just so it could run at a decent level of performance. With
PCs getting faster all the time, why are optimizations still so important?
Even if you tune your code, will the overall speedup obtained be worth
the effort?
Maybe
it's time to look at optimizations in a different light.
Think
of it this way. Optimizations are opportunities to free your artists
a little more. Artists are creative people, but artists for game companies
are creative people with chains on their paintbrushes. Basically,
company management (possibly together with the publisher) specifies the
target platforms for the product. The techno-wizards say that they can
achieve a certain level of performance on the target platforms, then
they all look at the artists and tell them, "make it look great,
but with only so many polygons, textures, etc." That is, the artistic
budget for the game is set by reverse engineering the performance, starting
with the marketing process of specifying target platforms for the product.
Thus, artists are creatively oppressed.
Optimizations
can be thought of as a way to loosen those chains on your artists. And
you might be surprised to learn that many optimizations can be done
with very little effort. In fact, an often overlooked place for optimizations
is in the data itself.
And it
is with this note that I finally head into the main subject of this
first paper:
Data Optimizations: It's Not Just Code
It turns
out that optimizing an application depends on more than just the code.
Very often it depends on the way you organize, store and present the data
to the application. In this section, I'll explore primitive size and how
it affects performance. More importantly, I'll show possibilities for
optimization that will let you increase your effective content at far
less cost in performance than you might think.
"Don't worry... it's not the size that counts."
Well,
I won't try to confirm or deny the above statement with regard to human
relationships, but it's definitely a false statement when it comes to
3D models and the way in which they're organized. When measuring performance
for a 3D engine, the size of each primitive processed drastically affects
the performance of the engine.
Most 3D
engines process vertex data in primitive fashion. Let's use D3D as an
example (although the principle I'm discussing applies to basically
any engine), in which models are represented as a collection of triangles
in either discrete, strip, or fan form. This is done in either an indexed
or ordered fashion, with the triangles batched up in collections that
represent the same geometry state. Although this paper focuses on indexed
triangle lists, the principles still hold for other primitive types.
Below
is an example of an indexed primitive. Notice that there are really two
structures: the indices (which specify the connectivity of the vertices) and the
vertices themselves (or the vertex pool). The indices are a collection
of pointers to vertex structures that include position, normal, texture
coordinates, and so on.
Notice
that the three vertices of the first triangle (represented
in the animation) can appear at random places in the vertex pool. Also
keep in mind that the vertex pool is usually a much larger structure
(often a cache line per vertex), whereas each index is only a pointer.
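To make the two structures concrete, here is a minimal C++ sketch of an indexed triangle list. The names (`Vertex`, `IndexedPrimitive`, `makeExamplePrimitive`, and so on) are hypothetical and are not D3D types; the vertex layout (position, normal, one set of texture coordinates) and the 16-bit index type are assumptions chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout: position, normal, one texture coordinate set.
// With 4-byte floats this is 8 * 4 = 32 bytes per vertex.
struct Vertex {
    float px, py, pz;  // position
    float nx, ny, nz;  // normal
    float u, v;        // texture coordinates
};

// Hypothetical indexed triangle list: a vertex pool plus indices into it.
struct IndexedPrimitive {
    std::vector<Vertex> vertexPool;      // the (large) per-vertex data
    std::vector<std::uint16_t> indices;  // three small entries per triangle
};

// Number of triangles described by the index list.
std::size_t triangleCount(const IndexedPrimitive& prim) {
    return prim.indices.size() / 3;
}

// Size of the vertex pool in bytes: the "size of the primitive" in this article.
std::size_t vertexPoolBytes(const IndexedPrimitive& prim) {
    return prim.vertexPool.size() * sizeof(Vertex);
}

// Look up one corner (0, 1 or 2) of a triangle by following its index.
const Vertex& triangleCorner(const IndexedPrimitive& prim,
                             std::size_t triangle, std::size_t corner) {
    assert(corner < 3 && triangle < triangleCount(prim));
    const std::uint16_t index = prim.indices[triangle * 3 + corner];
    assert(index < prim.vertexPool.size());
    return prim.vertexPool[index];
}

// Two triangles over a pool of six vertices. The first triangle refers to
// pool entries 4, 0 and 5, so its vertices are scattered across the pool.
IndexedPrimitive makeExamplePrimitive() {
    IndexedPrimitive prim;
    prim.vertexPool.resize(6);  // zero-initialized placeholder vertices
    prim.indices = {4, 0, 5, 1, 2, 3};
    return prim;
}
```

Note how small each index is compared with the vertex it refers to: in this sketch an index is 2 bytes, while every vertex it points at costs 32 bytes of pool data that has to be fetched and processed.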
Regardless
of whether the structure below represents a discrete triangle list,
triangle strip, or fan, the size of the primitive is related to the
overall size of the vertex pool. It is these vertices that must be processed,
and when I speak of the size of the primitive, I'm really talking about
the size of the vertex pool.
An Example of an Indexed Primitive
This
primitive represents a collection of vertices in the vertex pool and
a collection of indices all using the same geometry state. The geometry
state specifies things such as transformation matrices, lights, material
properties and so on. From now on, I'll just refer to this as state.
Very often, a primitive corresponds to an object (or a piece of one
object) in a scene, but it doesn't have to. What's important is that
all the content in the primitive has the same state. In fact, multiple
primitives of the same state can be batched together to form a much
larger primitive. This multi-primitive batching of smaller primitives
is one performance optimization step for the data.
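As a rough illustration of this batching step, the C++ sketch below merges indexed triangle lists that share a state into one larger primitive. Everything here is an assumption made for the example: the `Primitive` struct, the integer `stateId` standing in for the full geometry state, and the 32-bit indices are hypothetical, and the approach only applies to discrete triangle lists (strips and fans cannot simply be concatenated this way).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical vertex layout, used only for this sketch.
struct Vertex {
    float px, py, pz;  // position
    float nx, ny, nz;  // normal
    float u, v;        // texture coordinates
};

// Hypothetical indexed triangle list tagged with its state.
struct Primitive {
    int stateId;  // assumed handle for matrices, lights, material, etc.
    std::vector<Vertex> vertexPool;
    std::vector<std::uint32_t> indices;  // three per triangle
};

// Append src to dst. The source indices are rebased by the number of
// vertices already in dst so they keep pointing at the same vertices.
void appendPrimitive(Primitive& dst, const Primitive& src) {
    assert(dst.stateId == src.stateId);
    const std::uint32_t base =
        static_cast<std::uint32_t>(dst.vertexPool.size());
    dst.vertexPool.insert(dst.vertexPool.end(),
                          src.vertexPool.begin(), src.vertexPool.end());
    dst.indices.reserve(dst.indices.size() + src.indices.size());
    for (std::uint32_t index : src.indices) {
        dst.indices.push_back(base + index);
    }
}

// Merge all primitives that share a state into one primitive per state.
// The result is ordered by stateId because std::map keeps its keys sorted.
std::vector<Primitive> batchByState(const std::vector<Primitive>& input) {
    std::map<int, Primitive> merged;
    for (const Primitive& prim : input) {
        auto it = merged.find(prim.stateId);
        if (it == merged.end()) {
            merged.emplace(prim.stateId, prim);  // first primitive of this state
        } else {
            appendPrimitive(it->second, prim);   // grow the existing batch
        }
    }
    std::vector<Primitive> batches;
    batches.reserve(merged.size());
    for (auto& entry : merged) {
        batches.push_back(std::move(entry.second));
    }
    return batches;
}
```

In a real engine the index width, the maximum vertex count per draw call, and the definition of "same state" would all be dictated by the API and the hardware, so treat the merge above as a starting point rather than a drop-in routine.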
Since
I'm so concerned about size in this article (no jokes please... unless
you're obsessed about triangle strip envy...), let's take a look at
throughput rates as they relate to primitive size. Figure 2 below
plots throughput against primitive size (for given transform and
lighting modes of DX7, measured on a Pentium III). The graph is not
flat at all; in fact, it shows that performance depends largely on
primitive size.
Figure 2. Performance depends largely on primitive size.