First of all, I'd like to welcome all of you to the Optimizations Corner. This will be a regular monthly column at Gamasutra featuring articles on optimization methodologies, techniques, tool usage, and so on. This first article explores the area of data optimizations. However, I'd like to start with a few introductory ideas about optimizations in general…
Firstly, I must confess that my area of expertise does not lie in game development, per se. That is, I have never written a game. (In fact, I'm not even that good at playing them.) However, I (and other members of my team) have been heavily involved in application optimizations, especially for 3D games. I also have another thing going for me in that I work for a company that knows a bit about the CPU and the PC platform. In addition, I am extremely fortunate to be surrounded by a lot of talented people who will be contributing to this series from time to time. Because of this, I hope to bring a valuable viewpoint to the game development community.
Enough said. I won't bore you with any more about the article series in general. But I do want to add a few comments about optimizations and their role in the gaming industry.
Don't Oppress Your Artists
Actually, this section could have been called, "Why should you still be interested in optimizations?" It used to be that a highly tuned program was required just so it could run at a decent level of performance. With PCs getting faster all the time, why are optimizations still so important? Even if you tune your code, will the overall speedup obtained be worth the effort?
Maybe it's time to look at optimizations in a different light.
Think of it this way. Optimizations are opportunities to free your artists a little more. Artists are creative people, but artists for game companies are creative people with chains on their paint brushes. Basically, the company management (possibly together with the publisher) specifies the target platforms for the product. The techno-wizards say that they can achieve a certain level of performance on the target platforms, then they all look at the artists and tell them, "Make it look great, but with only so many polygons, textures, etc." That is, the artistic budget for the game is set by reverse engineering the performance, starting with the marketing process of specifying target platforms for the product. Thus, artists are creatively oppressed.
Optimizations can be thought of as a way to loosen those chains on your artists. And you might be surprised to learn that many optimizations can be done with very little effort. In fact, an often overlooked place for optimizations is in the data itself.
And it is with this note that I finally head into the main subject of this first paper:
Data Optimizations: It's Not Just Code
It turns out that optimizing applications does not depend on the code alone. Very often it depends on the way you organize, store and present the data to the application. In this section, I'll explore primitive size and how it affects performance. More importantly, I'll show possibilities for optimization that will let you increase your effective content at far less cost in performance than you might think.
"Don't worry... it's not the size that counts."
Well, I won't try to verify or deny the above statement with regard to human relationships, but it's definitely a false statement when it comes to 3D models and the way in which they're organized. When measuring performance for a 3D engine, the size of each primitive processed drastically affects the performance of the engine.
Most 3D engines process vertex data in primitive fashion. Let's use D3D as an example (although the principle I'm discussing applies to basically any engine), in which models are represented as a collection of triangles in either discrete, strip, or fan form. This is done in either an indexed or ordered fashion, with the triangles batched up in collections that represent the same geometry state. Although this paper focuses on indexed triangle lists, the principles still hold for other primitive types.
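To make the three forms concrete, here is a minimal sketch of how a strip or a fan of indices expands into the discrete triangle list this paper focuses on. The function names and the 16-bit `Index` type are my own assumptions for illustration, not part of any D3D interface, and the winding rule shown for strips is one common convention.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 16-bit indices, a common choice for indexed primitives.
using Index = std::uint16_t;

// Expand a triangle strip into a discrete triangle list.
// N strip indices yield N - 2 triangles; every odd triangle swaps its
// first two indices so that all triangles keep the same winding.
std::vector<Index> stripToList(const std::vector<Index>& strip) {
    std::vector<Index> list;
    for (std::size_t i = 0; i + 2 < strip.size(); ++i) {
        if (i % 2 == 0) {
            list.insert(list.end(), {strip[i], strip[i + 1], strip[i + 2]});
        } else {
            list.insert(list.end(), {strip[i + 1], strip[i], strip[i + 2]});
        }
    }
    return list;
}

// Expand a triangle fan into a discrete triangle list.
// Every triangle shares the first index as its hub.
std::vector<Index> fanToList(const std::vector<Index>& fan) {
    std::vector<Index> list;
    for (std::size_t i = 1; i + 1 < fan.size(); ++i) {
        list.insert(list.end(), {fan[0], fan[i], fan[i + 1]});
    }
    return list;
}
```

Note that a strip or fan of T triangles needs only T + 2 indices, whereas the discrete list needs 3T. In every form the indices refer to the same vertices.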
Figure 1 is an example of an indexed primitive. Notice that there are really two structures: the indices (which specify the connectivity of the vertices) and the vertices themselves (the vertex pool). The indices are a collection of pointers to vertex structures that include position, normal, texture coordinates, and so on.
Notice that the location of the 3 vertices of the first triangle (represented in the animation) can appear at random places in the vertex pool. Also keep in mind that the vertex pool is usually a much larger structure (often a cache line per vertex), whereas each index is only a pointer.
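The two structures just described can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the `Vertex` layout (position, normal, one set of texture coordinates), the 16-bit index type, and all of the names are hypothetical rather than taken from D3D. The layout is chosen so that one vertex is 32 bytes, which matches the cache line size of a Pentium III.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical vertex layout: 8 floats = 32 bytes.
struct Vertex {
    float position[3];
    float normal[3];
    float texcoord[2];
};
static_assert(sizeof(Vertex) == 32, "expected one 32-byte cache line per vertex");

// An indexed triangle list: a small index array pointing into a
// much larger vertex pool.
struct IndexedPrimitive {
    std::vector<Vertex> vertexPool;
    std::vector<std::uint16_t> indices;  // three per triangle
};

std::size_t triangleCount(const IndexedPrimitive& p) {
    return p.indices.size() / 3;
}

std::size_t vertexPoolBytes(const IndexedPrimitive& p) {
    return p.vertexPool.size() * sizeof(Vertex);
}

std::size_t indexBytes(const IndexedPrimitive& p) {
    return p.indices.size() * sizeof(std::uint16_t);
}

// Follow the i-th index into the vertex pool.
const Vertex& vertexAt(const IndexedPrimitive& p, std::size_t i) {
    assert(i < p.indices.size());
    assert(p.indices[i] < p.vertexPool.size());
    return p.vertexPool[p.indices[i]];
}

// Example data: a unit quad built from 4 vertices and 2 triangles.
IndexedPrimitive makeQuad() {
    IndexedPrimitive p;
    p.vertexPool.push_back(Vertex{{0.0f, 0.0f, 0.0f}, {0.0f, 0.0f, 1.0f}, {0.0f, 0.0f}});
    p.vertexPool.push_back(Vertex{{1.0f, 0.0f, 0.0f}, {0.0f, 0.0f, 1.0f}, {1.0f, 0.0f}});
    p.vertexPool.push_back(Vertex{{1.0f, 1.0f, 0.0f}, {0.0f, 0.0f, 1.0f}, {1.0f, 1.0f}});
    p.vertexPool.push_back(Vertex{{0.0f, 1.0f, 0.0f}, {0.0f, 0.0f, 1.0f}, {0.0f, 1.0f}});
    p.indices = {0, 1, 2, 0, 2, 3};  // vertices 0 and 2 are shared
    return p;
}
```

For the quad, the vertex pool occupies 128 bytes while the six indices occupy only 12, which is why the vertex pool dominates when I talk about the size of a primitive.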
Regardless of whether the structure below represents a discrete triangle list, triangle strip, or fan, the size of the primitive is related to the overall size of the vertex pool. It is these vertices that must be processed, and when I speak of the size of the primitive, I'm really talking about the size of the vertex pool.
Figure 1: An Example of an Indexed Primitive
This primitive represents a collection of vertices in the vertex pool and a collection of indices all using the same geometry state. The geometry state specifies things such as transformation matrices, lights, material properties and so on. From now on, I'll just refer to this as state. Very often, a primitive corresponds to an object (or a piece of one object) in a scene, but it doesn't have to. What's important is that all the content in the primitive has the same state. In fact, multiple primitives of the same state can be batched together to form a much larger primitive. This multi-primitive batching of smaller primitives is one performance optimization step for the data.
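The batching step can be sketched as follows for indexed triangle lists. Everything here is an assumption made for illustration: the `Primitive` struct, the integer `stateId` standing in for the full geometry state, and the `appendPrimitive` helper are hypothetical and are not D3D calls. The key detail is that the appended indices must be rebased by the number of vertices already in the destination pool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 32-byte vertex.
struct Vertex {
    float position[3];
    float normal[3];
    float texcoord[2];
};

struct Primitive {
    int stateId;  // hypothetical handle for matrices, lights, material, ...
    std::vector<Vertex> vertexPool;
    std::vector<std::uint16_t> indices;  // indexed triangle list
};

// Append src to dst so both can be submitted as one larger primitive.
// Returns false and leaves dst untouched if the states differ or if the
// merged pool would no longer be addressable with 16-bit indices.
bool appendPrimitive(Primitive& dst, const Primitive& src) {
    if (dst.stateId != src.stateId) {
        return false;
    }
    const std::size_t base = dst.vertexPool.size();
    if (base + src.vertexPool.size() > 65536) {
        return false;
    }
    dst.vertexPool.insert(dst.vertexPool.end(),
                          src.vertexPool.begin(), src.vertexPool.end());
    for (std::uint16_t idx : src.indices) {
        assert(idx < src.vertexPool.size());
        // Rebase: src's vertex 0 now lives at position `base` in dst.
        dst.indices.push_back(static_cast<std::uint16_t>(base + idx));
    }
    return true;
}

// Example data: a single triangle with zeroed vertex contents.
Primitive makeTriangle(int stateId) {
    Primitive p;
    p.stateId = stateId;
    p.vertexPool.resize(3);
    p.indices = {0, 1, 2};
    return p;
}
```

Simple concatenation like this works for discrete triangle lists only. Merging strips or fans would need extra care, because their connectivity is implied by index order.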
Since I'm so concerned about size in this article (no jokes please... unless you're obsessed about triangle strip envy...), let's take a look at throughput rates as they relate to primitive size. Looking below in Figure 2, you see a plot of throughput performance versus primitive size (for given transform and lighting modes of DX7 measured on a Pentium III). You can see that the graph is not flat at all and in fact shows that performance depends largely on primitive size.
Figure 2: Performance depends largely on primitive size.