Because of the problems with creating the texture maps and the computational costs during runtime, real-time spherical environment mapping is not often used in games. As a result, when the technique is used, the spherical maps are usually pre-calculated and therefore they don't reflect changes in a scene as they happen. Fortunately, some DirectX 7-capable video cards support cubic environment maps, which don't exhibit any of the problems associated with spherical maps, and thus they're suitable for reflecting dynamic scenes. Despite their limitations, though, spherical environment maps are still useful. Using sphere maps, you can create cheap, high-performance static reflections that in most cases are good enough for game reflections. Another very useful application is creating realistic specular highlights from an infinite light source.
This article shows a hardware T&L-accelerated method of using sphere maps. It assumes that your game will have some level of geometry hardware acceleration in addition to Direct3D support. If geometry acceleration is not present, applying these techniques may actually slow down a game (especially if the standard Direct3D software pipeline is used).
To begin our look at spherical mapping, let's examine the Spheremap demo, one of the samples that comes with DirectX 7 (to find this demo, search the DirectX 7 CD-ROM for SPHEREMAP.EXE). This application displays a spherically environment-mapped teapot. Figure 1 shows a screenshot from this application.
The Spheremap demo implements what I call "normal" spherical mapping, where the normal vector at a vertex is used in place of the eye-to-vertex reflection vector. The code that performs the mapping within this demo is shown in Listing 1 (in the DirectX SDK source file, this can be found in a function named ApplySphereMapToObject()).
Figure 1. A screenshot from SPHEREMAP.EXE
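As a rough illustration of the "normal" mapping idea described above, the following is a minimal sketch and not the SDK's own listing. The names Vertex, Matrix4 and ApplyNormalSphereMap are hypothetical, and the sketch assumes the world-view matrix contains no scaling, so that transformed normals stay unit length.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical vertex layout for this sketch (not the SDK's vertex type).
struct Vertex {
    float x, y, z;     // position
    float nx, ny, nz;  // unit normal in object space
    float tu, tv;      // texture coordinates written by the mapping
};

// Row-major 4x4 matrix, row-vector convention (v' = v * M).
// Only the upper-left 3x3 rotation part is read here.
struct Matrix4 {
    float m[4][4];
};

// "Normal" sphere mapping: rotate each normal into camera space and
// remap its x and y components from [-1, 1] to [0, 1].
void ApplyNormalSphereMap(Vertex* verts, std::size_t count,
                          const Matrix4& worldView)
{
    assert(verts != nullptr || count == 0);
    const float (*m)[4] = worldView.m;
    for (std::size_t i = 0; i < count; ++i) {
        Vertex& v = verts[i];
        // Camera-space x and y of the normal: one dot product each.
        const float cx = v.nx * m[0][0] + v.ny * m[1][0] + v.nz * m[2][0];
        const float cy = v.nx * m[0][1] + v.ny * m[1][1] + v.nz * m[2][1];
        v.tu = 0.5f * (1.0f + cx);
        v.tv = 0.5f * (1.0f - cy);  // flipped: texture v grows downward
    }
}
```

A normal pointing straight at the viewer lands in the center of the map (0.5, 0.5), while normals perpendicular to the view direction land on the rim.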
Unfortunately, the Spheremap demo has many shortcomings and doesn't implement spherical reflection mapping as OpenGL does (when the automatic texture address generation is set to GL_SPHERE_MAP). In fact, Direct3D has no sphere map support at all - you have to calculate the texture coordinates yourself. To do so, you could create a system that cycles through all of the vertices and has the CPU calculate the texture coordinates. This is what the DirectX 7 Spheremap demo does, but it is far from efficient.
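For comparison, here is a sketch of the per-vertex calculation that OpenGL's GL_SPHERE_MAP mode is specified to perform, written out on the CPU. It uses OpenGL's eye-space convention (camera at the origin looking down -z). The names Vec3, TexCoord and GlSphereMap are hypothetical, and the singular case where the reflection vector is exactly (0, 0, -1) is not handled.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct TexCoord { float s, t; };

static float Dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// eyePos: vertex position in eye space (must not be the origin).
// n:      unit vertex normal in eye space.
TexCoord GlSphereMap(const Vec3& eyePos, const Vec3& n)
{
    const float len = std::sqrt(Dot(eyePos, eyePos));
    assert(len > 0.0f);

    // u: unit vector from the eye to the vertex.
    const Vec3 u = { eyePos.x / len, eyePos.y / len, eyePos.z / len };

    // r = u - 2 n (n . u): u reflected about the normal.
    const float d = 2.0f * Dot(n, u);
    const Vec3 r = { u.x - d * n.x, u.y - d * n.y, u.z - d * n.z };

    // m = 2 sqrt(rx^2 + ry^2 + (rz + 1)^2); it is zero only when
    // r == (0, 0, -1), which this sketch does not guard against.
    const float m = 2.0f * std::sqrt(r.x * r.x + r.y * r.y +
                                     (r.z + 1.0f) * (r.z + 1.0f));

    return { r.x / m + 0.5f, r.y / m + 0.5f };
}
```

Unlike the "normal" mapping, the result depends on the vertex position as well as its normal, which is why the two methods produce different reflections.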
The DirectX 7 documentation and various pieces of literature from the graphics vendors stress over and over that correctly using and managing vertex buffers is the key to getting high performance - especially with hardware T&L-based cards. Static vertex buffers are the most efficient, as they can be kept in local video memory and never updated (i.e. optimized), but that means that all geometry processing has to be performed with the standard hardware pipeline, limiting the effects that you can create. Even so, it is surprising what can be done when all the resources of the pipeline are used.
If you must have dynamic geometry, a carefully managed CPU-modifiable vertex buffer is still better than no vertex buffer (as in the SPHEREMAP.EXE example). However, the Spheremap sample code is one of those pathological cases where vertex buffers are actually slower - if you converted that code to use video memory vertex buffers, it would almost certainly slow down, since the normal is read back from the vertex buffer (which is taboo, as both video memory and AGP memory are uncached). If the vertex buffer happens to be in local video memory, then it's being fetched back over the AGP bus, which is painfully slow. In this case, keeping a second copy of the normal vectors in system memory would be best.
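The second-copy idea can be sketched as follows. This is a minimal illustration under stated assumptions: `locked` stands in for the pointer you would obtain by locking the vertex buffer, `normalsSys` is the system-memory copy of the normals, and all type and function names are hypothetical. The loop reads only from the cached system-memory array and only writes through the buffer pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { float x, y, z; };

// Hypothetical layout of a vertex inside the locked vertex buffer.
struct SphereMapVertex {
    float x, y, z;
    float nx, ny, nz;
    float tu, tv;
};

// rot: 3x3 rotation part of the world-view matrix (row-vector
// convention, assumed to contain no scaling).
void WriteSphereMapCoords(SphereMapVertex* locked,
                          const std::vector<Vec3>& normalsSys,
                          const float (&rot)[3][3])
{
    assert(locked != nullptr || normalsSys.empty());
    for (std::size_t i = 0; i < normalsSys.size(); ++i) {
        // Read the normal from cached system memory, never from `locked`.
        const Vec3& n = normalsSys[i];
        const float cx = n.x * rot[0][0] + n.y * rot[1][0] + n.z * rot[2][0];
        const float cy = n.x * rot[0][1] + n.y * rot[1][1] + n.z * rot[2][1];
        // Write-only access to the (possibly uncached) vertex buffer.
        locked[i].tu = 0.5f * (1.0f + cx);
        locked[i].tv = 0.5f * (1.0f - cy);
    }
}
```

The cost is the extra memory for the duplicated normals; the benefit is that nothing is ever read back across the bus.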
Also, note that there's a glaring mistake in the DirectX algorithm, which I am compelled to point out. It is the line commented, "Check the z-component, to skip any vertices that face backwards". Vertices do not face backwards; polygons do. It is perfectly legal for a polygon to have a vertex normal that points away from the viewer while still having a face normal pointing towards the viewer.
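A small numeric case makes the point concrete. The sketch below assumes a camera space in which the viewer looks down +z, so a normal with negative z points towards the viewer, and it assumes a smooth vertex normal formed as the normalized sum of the adjacent face normals. All names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

Vec3 Normalize(const Vec3& v)
{
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    assert(len > 0.0f);
    return { v.x / len, v.y / len, v.z / len };
}

// Assumption: viewer looks down +z, so z < 0 means "towards the viewer".
bool PointsTowardsViewer(const Vec3& n)
{
    return n.z < 0.0f;
}

// Smooth normal at a vertex shared by two faces.
Vec3 SharedVertexNormal(const Vec3& faceA, const Vec3& faceB)
{
    return Normalize({ faceA.x + faceB.x,
                       faceA.y + faceB.y,
                       faceA.z + faceB.z });
}

// Face A is visible; face B, its neighbor on the silhouette, is not.
const Vec3 kFaceA = { 0.0f, 0.8f, -0.6f };
const Vec3 kFaceB = { 0.0f, 0.6f,  0.8f };
```

Here the shared vertex normal is the normalized form of (0, 1.4, 0.2), whose z-component is positive. A z-component test would skip this vertex even though face A, which uses it, is visible.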
The results of the erroneous z-component check can be seen in the DirectX 7 example when the bottom of the teapot comes into view. For a few frames, a number of triangles are not textured properly. This check is not only an error, it causes the loop to run slower (well, it certainly doesn't speed it up). Without the check, there would be 2N dot products (where N is the number of vertices). With the check in place, and assuming half of the vertices face away from the viewer, there are N + 2(N/2) = N + N = 2N dot products, so the same amount of work is done. The difference is that now there is a jump in the middle of the loop that the CPU has to predict, and will sometimes mispredict. On a Pentium II or III, a mispredicted jump is far more expensive than a couple of dot products.
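The dot-product count above can be written out as a tiny sketch. It assumes one dot product for the z-component test and one each for the two texture coordinates; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// No check: every vertex costs two dot products (one per coordinate).
std::size_t DotProductsWithoutCheck(std::size_t vertexCount)
{
    return 2 * vertexCount;
}

// With the check: every vertex costs one dot product for the z test,
// and each vertex that passes costs two more.
std::size_t DotProductsWithCheck(std::size_t vertexCount,
                                 std::size_t passingCount)
{
    assert(passingCount <= vertexCount);
    return vertexCount + 2 * passingCount;
}
```

With 1,000 vertices of which 500 pass the test, both functions give 2,000, so the check saves no arithmetic and only adds a branch to the loop.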