Run-Time MIP-Map Filtering

Graphics programmers are constantly looking for ways to improve the realism of
the graphics in games. One of the simplest techniques employed to do this
is texture mapping, and while texture mapping does add considerable realism
to a scene, it also adds a number of new problems. The most obvious visual
problems that appear when using textures in a scene are the aliasing artifacts
that are visible when texture-mapped polygons are some distance from the
viewpoint. If you're moving rapidly around your virtual world, these artifacts
appear as flashing or sparkling on the surface of the texture. Or, if
the viewpoint is fixed, the artifacts appear as unwanted patterns within
the texture after it has been mapped to a polygon. This is clearly visible
in Figure 1, where the checkered texture map becomes distorted as its
distance from the viewpoint increases.
MIP-mapping helps alleviate this problem. The acronym MIP comes from the Latin phrase multum in parvo, meaning "many things in a small place." Researchers at the New York Institute of Technology adopted this term in 1979 to describe a technique whereby many pixels are filtered and compressed into a small place, known as the MIP-map. To see how MIP-maps improve visual clarity, see Figure 2, in which MIP-mapping with bilinear filtering has been used to smooth the texture.
In order to understand what's causing the problems in Figure 1, you have to look within the texture-mapping renderer and understand how the process of sampling the texture maps affects what's displayed on the screen. Look at Figure 3A, in which a sine wave is being sampled at a much higher frequency than the wave itself. As you can see, a fairly good representation of the wave can be obtained from these samples. However, if the sampling frequency drops to exactly two times the frequency of the wave, as shown in Figure 3B, then it's possible that the sampling points will coincide with the zero crossing points of the sine wave, resulting in no information recovery. Sampling frequencies of less than twice that of the sine wave being sampled, as shown in Figure 3C, cause the information within the samples to appear as a sine wave of lower frequency than the original. From these figures, we can guess that for complete recovery of a sampled signal, the sampling frequency must be more than twice that of the signal being sampled. This is known as the Nyquist limit. So where does the seemingly magic value of twice the signal frequency come from? To answer that, we'll have to digress a bit further and take a stroll into the Fourier domain.
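The three sampling cases can be checked numerically. The following is a minimal Python sketch rather than anything from the article's listings; the function name and the specific frequencies (5 Hz, 9 Hz, 10 Hz) are illustrative assumptions.

```python
import math

def sample_sine(signal_hz, sample_hz, count):
    """Return `count` samples of sin(2*pi*signal_hz*t) taken at sample_hz."""
    return [math.sin(2.0 * math.pi * signal_hz * n / sample_hz)
            for n in range(count)]

# Sampling at exactly twice the signal frequency, starting on a zero
# crossing, lands on a zero crossing every time (the Figure 3B situation).
at_nyquist = sample_sine(signal_hz=5.0, sample_hz=10.0, count=8)

# A 9 Hz sine sampled at 10 Hz yields the same samples as an inverted
# 1 Hz sine, so the 9 Hz detail shows up as a 1 Hz alias (Figure 3C).
undersampled = sample_sine(signal_hz=9.0, sample_hz=10.0, count=8)
alias = [-s for s in sample_sine(signal_hz=1.0, sample_hz=10.0, count=8)]
```

Comparing `undersampled` with `alias` element by element shows that the two lists agree to within floating-point rounding, which is exactly why the renderer cannot tell the fine detail from the coarse pattern once it has been sampled.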
A
complete discussion of Fourier theory could take up several books by
itself, so for those of you who haven't suffered through a signal-processing
course at college, I suggest that you take a look at the text by Bracewell
that's mentioned at the end of this article. What follows is a very
limited introduction to Fourier transforms and sampling, but it should
be enough to demonstrate how the Nyquist limit is derived.
The
effect of sampling h(t) with the sampling function s(t) is shown in
Figure 4C. In the time domain, the sampling can be thought of as multiplying
h(t) by s(t), and in the frequency domain, it can be thought of as
the convolution of H(f) and S(f).
Convolution of any two functions f(x) and g(x) is given by
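In its standard textbook form, which is presumably what Equation 4 states, the convolution integral reads as follows. The integration variable τ is an arbitrary choice of symbol.

$$f(x) * g(x) = \int_{-\infty}^{\infty} f(\tau)\, g(x - \tau)\, d\tau$$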
If the thought of plugging the Fourier transforms of both h(t) and s(t) into Equation 4 has you wanting to skip to the current Soapbox article (p. 72 of the November issue of Game Developer), just hold on a second - it isn't as bad as it looks. The convolution of h(t) with a single impulse located at t=t0 is just h(t) shifted to that location.
We can apply the result of Equation 5 to find the convolution of H(f) and S(f).
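Written out, the two steps look roughly like this. The sketch assumes that s(t) is an impulse train with spacing Ts, so that S(f) is an impulse train with spacing 1/Ts; the 1/Ts scale factor depends on the transform convention and is an assumption here.

$$\delta(t - t_0) * h(t) = h(t - t_0)$$

$$H(f) * S(f) = H(f) * \frac{1}{T_s}\sum_{n=-\infty}^{\infty} \delta\!\left(f - \frac{n}{T_s}\right) = \frac{1}{T_s}\sum_{n=-\infty}^{\infty} H\!\left(f - \frac{n}{T_s}\right)$$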
Equation 6 simply means that the result of the convolution of H(f) and S(f) is such that H(f) is duplicated at intervals of 1/Ts, as can be seen in Figure 4C. The sinc²(x) function is bandlimited (that is, its bandwidth is limited) to fmax, so the only requirement needed to ensure that there are no overlapping portions in the spectrum of the sampled signal is that fs > 2fmax, where fs = 1/Ts. This is where the Nyquist limit comes from. As you can see in Figure 4D, if the sampling frequency drops below 2fmax, adjacent spectra overlap at higher frequencies, and these frequencies are then lost in the resulting signal. However, instead of disappearing completely, these high-frequency signals reappear at lower frequencies as aliases; this is where the term aliasing originated. To prevent aliasing from occurring, either the signal being sampled must be bandlimited to less than fs/2 or the sampling frequency must be set to be higher than 2fmax.
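Because the spectrum is copied at every multiple of the sampling frequency, the frequency at which an undersampled component reappears is its distance to the nearest copy. The Python sketch below illustrates that idea for a single pure tone; both function names are hypothetical.

```python
def alias_frequency(signal_hz, sample_hz):
    """Frequency at which signal_hz appears after sampling at sample_hz."""
    # Copies of the spectrum sit at multiples of sample_hz; the observed
    # frequency is the distance from signal_hz to the nearest copy.
    nearest_copy = round(signal_hz / sample_hz) * sample_hz
    return abs(signal_hz - nearest_copy)

def is_alias_free(max_signal_hz, sample_hz):
    """True when the sampling rate satisfies fs > 2 * fmax."""
    return sample_hz > 2.0 * max_signal_hz
```

A tone below half the sampling rate comes back unchanged, while one above it folds down to a lower frequency.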
Let's
look at how MIP-mapping helps to reduce aliasing artifacts in our
texture-mapped image. Remember that texture mapping is designed to
increase the realism and detail in scenes. However, all of the fine
details in the texture maps are effectively high-frequency components,
and they are the cause of our aliasing problems. Since we can't really
modify our sampling frequency (1/DU and 1/DV in the texture-mapping
portion of our renderer), we have to filter the textures to remove
the high-frequency details.
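One common way to prefilter a texture is to average each 2x2 block of texels to produce the next, smaller level. The sketch below assumes that box filter, a square greyscale texture stored as a list of rows, and a power-of-two side length; none of these choices is taken from the article's listings.

```python
def downsample_2x2(texture):
    """Halve a square greyscale texture by averaging each 2x2 block."""
    half = len(texture) // 2
    return [[(texture[2 * y][2 * x] + texture[2 * y][2 * x + 1] +
              texture[2 * y + 1][2 * x] + texture[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(half)]
            for y in range(half)]

def build_mip_pyramid(texture):
    """Return [LOD 0, LOD 1, ...] down to a 1x1 texture.

    Assumes a square texture whose side is a power of two.
    """
    levels = [texture]
    while len(levels[-1]) > 1:
        levels.append(downsample_2x2(levels[-1]))
    return levels

# A 4x4 checkerboard is all high-frequency detail; one filtering step
# already reduces it to a uniform grey.
checker = [[(x + y) % 2 for x in range(4)] for y in range(4)]
pyramid = build_mip_pyramid(checker)
```

Each level holds half the highest frequency of the level before it, so picking a higher level in the pyramid is equivalent to bandlimiting the texture before it is sampled.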
Figure
6 (below) shows some of the problems you can encounter when selecting
which LOD to apply at run time. In the figure, the rectangular texture
that's mapped onto the triangle in texture space is transformed into
a quadrilateral in screen space, and the perspective projection of
the texture causes the individual texels to become quadrilaterals
of varying sizes. In a case such as this, where the orientation of
a polygon is skewed in screen space, determining the best LOD to apply
to a polygon is especially crucial if you want to produce good visual
results. If the chosen LOD is too high (the texture dimensions are
too large), aliasing will occur in the texture. If the LOD is too
low (the dimensions of the texture are too small), then the image
will appear blurred. For example, the LOD chosen for the texture in
Figure 7 (below) is much too low, as can be seen by the large texels
visible in the inset zoomed image. Many different methods can be used
for LOD selection, all of which have advantages and disadvantages.
The two well-known methods that we'll examine here are the selection
of the LOD based on the area of the texture in screen space, and the
selection of the LOD using the projected u and v vectors.
Per-polygon
MIP-map selection is the least expensive method from a computational
standpoint, because you only do MIP-map selection once per polygon.
There are, however, a couple of drawbacks to this approach. One problem
is that adjacent polygons that share a texture may be drawn using
differing LODs; this will appear as a discontinuity in the texture
when displayed on the screen (this is called MIP-banding). Figure
8 shows a small amount of MIP-banding that is occurring due to the
use of per-triangle MIP-mapping. Another problem is that visible popping
may occur as a texture's LOD is changed due to movement of the viewpoint
(or the polygon).
Area-Based LOD Selection

Area-based LOD selection complements per-polygon MIP-mapping techniques. In this method, you select the LOD by determining the amount of texture compression that will be applied to the texture in screen space. To determine the proper texture compression, you calculate the area of the polygon in screen space and the area, in texture space, of texture that is mapped onto the polygon. As shown in Listing 5, you can determine the ratio of texels to pixels and then determine which LOD to use. [In the interest of conserving editorial space, code listings are available for download from Game Developer's ftp site. ftp://ftp.gdmag.com/src/nov98.zip]

The u and v dimensions of each successive LOD are one-half the size of the preceding LOD, so each successive LOD has one-quarter the area of the preceding level. During LOD selection, we step up one level in the MIP-map pyramid for each multiple of four that the texel area is greater than the pixel area. For example, if the texel-to-pixel ratio is 3:1, we would select MIP-map level zero, or, if the texel-to-pixel ratio is 7:1, we would select MIP-map level one. Once the LOD has been selected, we can pass a pointer to the correct LOD, along with the LOD's dimensions, to our normal texture-mapping routines.

One problem with any approach that uses the projected area of the polygon and the texture area as the basis for LOD selection is that aliasing will tend to occur whenever a projected polygon is very thin, due to the anisotropic nature of the texture compression (that is, the texture is compressed more in one dimension than the other).

Per-pixel MIP-mapping offers far better control of LOD selection than per-polygon MIP-mapping, and it also permits additional texture filtering - but at some additional cost.
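Before moving on, the area-based rule described above can be sketched in a few lines of Python. This is not Listing 5; the function names, the triangle-only input, and the use of texel units for the texture coordinates are assumptions made for illustration.

```python
def triangle_area(p0, p1, p2):
    """Area of a 2D triangle from its three (x, y) or (u, v) vertices."""
    return abs((p1[0] - p0[0]) * (p2[1] - p0[1]) -
               (p2[0] - p0[0]) * (p1[1] - p0[1])) / 2.0

def select_lod_by_area(screen_pts, texel_pts, max_lod):
    """Pick a MIP level from the texel-to-pixel area ratio of a triangle.

    screen_pts are in pixels; texel_pts are in LOD 0 texels.
    """
    pixel_area = triangle_area(*screen_pts)
    texel_area = triangle_area(*texel_pts)
    if pixel_area == 0.0:
        return max_lod  # degenerate on screen: use the smallest texture
    ratio = texel_area / pixel_area
    lod = 0
    # Each level has one-quarter the area of the one before it.
    while ratio >= 4.0 and lod < max_lod:
        ratio /= 4.0
        lod += 1
    return lod
```

With a screen triangle of 50 pixels, a texture triangle of 150 texels (3:1) selects level zero and one of 350 texels (7:1) selects level one, matching the examples in the text.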
All of the per-pixel methods require storage of the entire MIP-mapped texture in memory, and adding LOD selection to the inner loop of a renderer's texture-mapping routine can significantly reduce rendering performance. Fortunately, most of today's 3D chips support per-pixel MIP-mapping with bilinear filtering (a few of the latest devices even support trilinear filtering), so we'll look at what it takes to implement sophisticated per-pixel MIP-mapping. Although we could use area-based LOD selection here also (we'd need to calculate the texture area underneath each pixel rather than for the entire polygon), we'll look at an altogether more accurate method.

Edge Compression-Based LOD Selection

In 1983, Paul Heckbert probably examined more LOD calculation techniques
than he'd care to remember before he decided that techniques based
on the compression that a texture suffers along the edge of a pixel
seem to work best. Figure 9 shows a single pixel in screen space
and the corresponding parallelogram in texture space. To prevent
aliasing from occurring, we want to select the LOD based on the
maximum compression of an edge in texture space. This corresponds
to the maximum length of a side in texture space, which is given
by
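Taking the two sides of the texture-space parallelogram in Figure 9 to be the vectors (ux, vx) and (uy, vy), the longest side can be written as below. This form is inferred from that geometry, and the symbol d for the compression value is an assumed name.

$$d = \max\left(\sqrt{u_x^2 + v_x^2},\; \sqrt{u_y^2 + v_y^2}\right)$$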
The values of ux, uy, vx, and vy are given by four partial derivatives. Because we already know how to calculate the u and v values for any pixel on the screen, we can use this knowledge to determine the partial derivatives. We know that, given the u/z, v/z, and 1/z gradients in x and y, and the starting u/z, v/z, and 1/z values at the screen origin, the u and v values for the texture at any pixel can be found using Equations 8 and 9. The notation in Equations 8 through 19 is derived from Chris Hecker's series on perspective texture mapping, which can be found on his web site (see "Acknowledgements" for the URL).
We can use these results to find the partial derivatives, as shown in Equations 10 through 13.
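Since u and v are each a ratio of two quantities that vary linearly across the screen, the partial derivatives follow from the quotient rule. The Python sketch below works under that assumption; the `(start, grad_x, grad_y)` tuples and all function names are illustrative and do not come from Chris Hecker's code or from the article's listings.

```python
import math

def interpolate(start, grad_x, grad_y, x, y):
    """Value of a quantity that varies linearly across the screen."""
    return start + x * grad_x + y * grad_y

def texture_derivatives(x, y, uoz, voz, ooz):
    """Return (ux, uy, vx, vy) at pixel (x, y).

    Each of uoz, voz, ooz is a (start, grad_x, grad_y) tuple for u/z, v/z
    and 1/z respectively.
    """
    u_over_z = interpolate(*uoz, x, y)
    v_over_z = interpolate(*voz, x, y)
    one_over_z = interpolate(*ooz, x, y)
    denom = one_over_z * one_over_z
    # Quotient rule applied to u = (u/z) / (1/z) and v = (v/z) / (1/z).
    ux = (uoz[1] * one_over_z - u_over_z * ooz[1]) / denom
    uy = (uoz[2] * one_over_z - u_over_z * ooz[2]) / denom
    vx = (voz[1] * one_over_z - v_over_z * ooz[1]) / denom
    vy = (voz[2] * one_over_z - v_over_z * ooz[2]) / denom
    return ux, uy, vx, vy

def edge_compression(ux, uy, vx, vy):
    """Longest side of the pixel's footprint in texture space."""
    return max(math.hypot(ux, vx), math.hypot(uy, vy))
```

When 1/z is constant across the polygon (no perspective), the derivatives reduce to the plain u and v gradients, which is a useful sanity check on any implementation.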
Point-Sample Per-Pixel MIP-Mapping

Point sampling is the simplest form of per-pixel MIP-mapping, and as you
can see in Listing 8, there isn't much difference between our normal
texture-mapping loop and one that uses point sampling. Once we've
found the amount of edge compression for the current pixel, we need
to determine the correct LOD. The raw compression value ranges from
zero to one, but we need to scale it by the texture dimensions
to get a meaningful height in our MIP-map pyramid. Once we have
the height, we determine the correct LOD by stepping up one level
in the pyramid for each power of two that the height is greater
than one. We then use our fast LOD lookup table to get a pointer
to our texture and access the correct texel as usual. Figure 10
shows the same object that we used to generate Figure 8, but this
time we're applying point-sampled MIP-mapping. As you can see in
the figure, the main problem with point-sampled MIP-mapping is that
MIP-banding is clearly visible at the points where transitions between
different LODs occur. This is because adjacent pixels can have different
LODs, so a discontinuity appears as we switch between LODs.
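The mapping from edge compression to LOD described above can be sketched as follows. This is an illustration rather than Listing 8; the function name and the clamp to a hypothetical `max_lod` are assumptions.

```python
def select_lod_from_compression(compression, texture_size, max_lod):
    """Map a per-pixel edge compression to a MIP level.

    compression is in normalised texture units (0..1 per pixel);
    texture_size is the LOD 0 side length in texels.
    """
    height = compression * texture_size  # texels covered per pixel
    lod = 0
    # Climb one level for each power of two by which height exceeds one.
    while height >= 2.0 and lod < max_lod:
        height /= 2.0
        lod += 1
    return lod
```

Because the result is an integer, two neighbouring pixels whose heights fall either side of a power of two receive different levels, which is the source of the banding seen in Figure 10.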
Bilinearly-Filtered Per-Pixel MIP-Mapping

Bilinear filtering attempts to further reduce any aliasing errors present
in a scene by averaging the values of the four texels that are closest
to the real u and v texture values for each pixel. As you can see
in Figure 11 and Listing 9, bilinear interpolation can be implemented
using three linear interpolations. We calculate the correct LOD
and retrieve the pointer to our texture in exactly the same way
that we did with point sampling. However, we then retrieve four
texture values and apply bilinear interpolation to each color component
to generate the new pixel value. Figure 12 shows our road after
MIP-mapping and bilinear filtering. Although Figure 12 is an improvement
over Figure 10, you can still make out the MIP-banding. Nothing
has been done to remove the discontinuities that occur when we switch
between LODs.
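The three-interpolation structure of bilinear filtering is easy to see in code. The Python sketch below handles a single greyscale channel and clamps at the texture edge; both simplifications, and the function names, are assumptions rather than details of Listing 9.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def bilinear_sample(texture, u, v):
    """Sample a greyscale texture (a list of rows) at texel coords (u, v).

    Coordinates are clamped to the texture edge; a real renderer might
    wrap instead.
    """
    height, width = len(texture), len(texture[0])
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    u0, v0 = int(math.floor(u)), int(math.floor(v))
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    fu, fv = u - u0, v - v0
    top = lerp(texture[v0][u0], texture[v0][u1], fu)     # first lerp
    bottom = lerp(texture[v1][u0], texture[v1][u1], fu)  # second lerp
    return lerp(top, bottom, fv)                         # third lerp
```

For an RGB texture the same three interpolations would be repeated for each colour component.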
The
current state-of-the-art for 3D hardware-accelerated MIP-mapping
is trilinear filtering. Trilinear filtering attempts to remove the
problems associated with MIP-banding by smoothly blending between
differing LODs. As you can see in Listing 10, we once again calculate
the correct LOD in exactly the same way that we did it for point
sampling, then retrieve pointers to the calculated LOD and the next
lower LOD (the next level up in the pyramid). Trilinear interpolation
is implemented using seven linear interpolations per color component
(three for each of the two bilinear passes, plus one to blend between
them). We begin by carrying
out bilinear interpolation separately for each of the selected LODs,
then finish off by linearly interpolating between the two LODs.
As you can see in Figure 13, trilinear interpolation does result
in a smooth transition between LODs (though the overall scene appears
somewhat blurred). Unfortunately, this feature comes at a considerable
cost: the straightforward implementation of trilinearly filtered
MIP-mapping presented here requires eight texture accesses for each
pixel and a considerable amount of computation. Although it's possible
to cut down on the number of texture look-ups by saving texel values
between loop iterations, the interpolations themselves need to be
performed for each loop, so achieving acceptable frame rates with
software-based trilinear filtering is very difficult.
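A straightforward version of this blend is sketched below in Python. It is not Listing 10: it handles one greyscale channel, assumes a square power-of-two pyramid stored as a list of levels, maps u and v in [0, 1] onto each level's texel range, and uses the fractional part of log2 of the texels-per-pixel height as the blend factor. All names are hypothetical.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t

def bilinear(texture, u, v):
    """Clamped bilinear lookup in a square greyscale texture."""
    last = len(texture) - 1
    u, v = min(max(u, 0.0), last), min(max(v, 0.0), last)
    u0, v0 = int(u), int(v)
    u1, v1 = min(u0 + 1, last), min(v0 + 1, last)
    top = lerp(texture[v0][u0], texture[v0][u1], u - u0)
    bottom = lerp(texture[v1][u0], texture[v1][u1], u - u0)
    return lerp(top, bottom, v - v0)

def trilinear_sample(pyramid, u, v, height):
    """Blend bilinear samples from two adjacent MIP levels.

    u, v are in [0, 1]; height is the texels-per-pixel value used for
    LOD selection (1.0 or less means LOD 0).
    """
    max_lod = len(pyramid) - 1
    level = min(math.log2(max(height, 1.0)), float(max_lod))
    lod = int(level)
    next_lod = min(lod + 1, max_lod)
    blend = level - lod  # how far we are towards the next level

    def sample(i):
        size = len(pyramid[i])
        return bilinear(pyramid[i], u * (size - 1), v * (size - 1))

    return lerp(sample(lod), sample(next_lod), blend)
```

The two calls to `bilinear` account for the eight texel fetches per pixel, which is where most of the cost of a software implementation goes.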
We've
covered a lot of ground for one article, and although the output
of our renderer using trilinear MIP-mapping is significantly better
than plain old texture mapping, it still isn't perfect. The biggest
defect remaining in our filtering is that, as I mentioned earlier,
we've ignored the fact that the texture compression is anisotropic.
We're selecting LODs using the maximum compression along one edge,
but what if there's a significant difference in the amount of compression
between each edge? In this case, the LOD selected will be too low
for the least compressed edge, and our scene will appear blurred.
You can clearly see this effect in Figure 14, which is a screen
shot from the CHAOSVR demo that was rendered using a card based
on 3Dfx's Voodoo2 chipset. This problem will occur with any 3D accelerator
that uses methods similar to the ones that we've developed here
for calculating the LOD - not just the Voodoo2 card that I'm using.
Clearly, the next step to improve rendering accuracy will be to
adopt some form of anisotropic filtering. I'm sure that it won't
be long before this capability appears on high-end accelerators.
Acknowledgements

Thanks go out to Chris Hecker, who kindly allowed me to plug my MIP-mapping
into his texture mapping routines, saving me a lot of time. Check
out Chris's home page, http://www.d6.com/users/checker,
for more information on texture mapping and his old columns from
Game Developer.

For Further Info

Bracewell, R. N., The Fourier Transform and Its Applications, McGraw-Hill Book
Co., New York, 1986.