Excerpted from Real-Time Rendering
(AK Peters, 1999)
One of the great myths concerning computers is that one day we will have enough processing power. Even in a relatively simple application like word processing, we find that additional power can be applied to all sorts of things, such as on-the-fly spell and grammar checking, more elaborate graphic presentation, antialiased text display, automated voice recognition and dictation, etc.
In real-time rendering we have at least three performance goals: more frames per second, higher resolution, and more (and more realistic) objects in the scene. A speed of 60-72 frames per second is generally considered enough, and perhaps 1600x1200 is enough resolution for a while, but there is no real upper limit on scene complexity. The rendering of a Boeing-777 would include 132,500 unique parts and over 3,000,000 fasteners, which would yield a polygonal model with over 500,000,000 polygons [Cripe98]. Our conclusion: speed-up techniques and acceleration schemes will always be needed.
In this article, we will talk about a certain class of acceleration scheme called occlusion culling techniques. Most of this article is an excerpt from chapter 7, "Speed-Up Techniques" from our book Real-Time Rendering (www.realtimerendering.com or www.acm.org/tog/resources/RTR/). In the book, the occlusion culling section is preceded by sections on backface and clustered culling, hierarchical view-frustum culling, portal culling, and detail culling. Sections on impostor algorithms, level-of-detail techniques, triangle fan, strip and polygon mesh techniques follow after.
can mean to "select from a flock," and in the context of computer
graphics this is exactly what culling techniques do. The flock
is the whole scene that we want to render, and the selection is limited
to those portions of the scene that are not considered to contribute
to the final image. The rest of the scene is sent through the rendering
pipeline. The actual culling can theoretically take place at any stage
of the rendering pipeline. For culling algorithms that are implemented
in hardware, we can typically
only enable/disable or set some parameters for the culling function. For full control, the programmer can implement the algorithm in the application stage (on the CPU). Culling is often achieved by using geometric calculations but is in no way limited to these. For example, an algorithm may also use the contents of the frame buffer.
As we all know, visibility may be solved via a hardware construction called the Z-buffer. Even though it may solve visibility correctly, the Z-buffer is not a very smart mechanism in all respects. For example, it has the following implications. Imagine that the viewer is looking along a line where 10 spheres are placed. This is illustrated in Figure 1.
Figure 1. An illustration of how occlusion culling can be useful. Ten spheres are placed in a line, and the viewer is looking along this line (left). The depth complexity image in the middle shows that some pixels are written to several times, even though the final image (on the right) only shows one sphere.
An image rendered from this viewpoint will show but one sphere, even though all 10 spheres will be scan-converted and compared to the Z-buffer and then potentially written to the color buffer and Z-buffer. The simple conclusion in this case is that nine spheres will be drawn unnecessarily. This uninteresting scene is not that likely to be found in reality, but it describes (from its viewpoint) a densely populated model. These sorts of configurations are found in such real scenes as a rain forest, an engine, a city, and the inside of a skyscraper.
Thus it seems plausible that an algorithmic approach to avoid this kind of inefficiency may pay off in terms of speed. Such approaches go under the name of occlusion culling algorithms, since they try to cull away (avoid drawing) objects that are occluded, that is, inside the view frustum but not visible in the final image. The optimal occlusion culling algorithm would select only the objects that are visible. In a sense, the Z-buffer selects and renders only those objects which are visible, but not before all objects are sent through the pipeline. The idea behind efficient occlusion culling algorithms is to perform some simple tests early on and so avoid sending data through much of the pipeline.
Pseudocode for a general occlusion culling algorithm is shown in Figure 2, where the function isOccluded, often called the visibility test, checks whether an object is occluded. G is the set of geometrical objects to be rendered; OR is the occlusion representation.
3: for each object g in G
Figure 2. Pseudocode for a general occlusion culling algorithm. G contains all the objects in the scene, and OR is the occlusion representation.
Depending on the particular algorithm, OR represents some kind of occlusion information. OR is set to be empty at the beginning. After that all objects (which passed the view-frustum culling test) are processed.
Consider a particular object. First we test whether the object is occluded with respect to the occlusion representation OR. If it is occluded, then it is not processed further, since we then know that it will not contribute to the image. If the object is determined not to be occluded, then that object has to be rendered, since it probably contributes to the image (at that point in the rendering). Finally OR is updated with that object.
For some algorithms, it is expensive to update the occlusion representation, so this is only done once (before the actual rendering starts) with the objects that are believed to be good occluders. This set is then updated from frame to frame.
A number of occlusion algorithms will be scrutinized in this section.