|
Excerpted
from
Real-Time Rendering
(AK Peters, 1999)
|
|
|
One
of the great myths concerning computers is that one day we will have
enough processing power. Even in a relatively simple application like
word processing, we find that additional power can be applied to all
sorts of things, such as on-the-fly spell and grammar checking, more
elaborate graphic presentation, antialiased text display, automated
voice recognition and dictation, etc.
In
real-time rendering we have at least three performance goals: more frames
per second, higher resolution, and more (and more realistic) objects
in the scene. A speed of 60-72 frames per second is generally considered
enough, and perhaps 1600x1200 is enough resolution for a while, but
there is no real upper limit on scene complexity. The rendering of a
Boeing-777 would include 132,500 unique parts and over 3,000,000 fasteners,
which would yield a polygonal model with over 500,000,000 polygons [Cripe98].
Our conclusion: speed-up techniques and acceleration schemes will always
be needed.
In this
article, we will talk about a certain class of acceleration scheme called
occlusion culling techniques. Most of this article is an excerpt
from chapter 7, "Speed-Up Techniques" from our book Real-Time
Rendering (www.realtimerendering.com
or www.acm.org/tog/resources/RTR/).
In the book, the occlusion culling section is preceded by sections on
backface and clustered culling, hierarchical view-frustum culling, portal
culling, and detail culling. Sections on impostor algorithms, level-of-detail
techniques, triangle fan, strip and polygon mesh techniques follow after.
To cull
can mean to "select from a flock," and in the context of computer
graphics this is exactly what culling techniques do. The flock
is the whole scene that we want to render, and the selection is limited
to those portions of the scene that are not considered to contribute
to the final image. The rest of the scene is sent through the rendering
pipeline. The actual culling can theoretically take place at any stage
of the rendering pipeline. For culling algorithms that are implemented
in hardware, we can typically
only enable/disable or set some parameters for the culling function.
For full control, the programmer can implement the algorithm in the
application stage (on the CPU). Culling is often achieved by using geometric
calculations but is in no way limited to these. For example, an algorithm
may also use the contents of the frame buffer.
As we
all know, visibility may be solved via a hardware construction called
the Z-buffer. Even though it may solve visibility correctly, the Z-buffer
is not a very smart mechanism in all respects. For example, it has the
following implications. Imagine that the viewer is looking along a line
where 10 spheres are placed. This is illustrated in Figure 1.
|
|
|
Figure
1. An illustration of how occlusion culling can be useful. Ten
spheres are placed in a line, and the viewer is looking along
this line (left). The depth complexity image in the middle shows
that some pixels are written to several times, even though the
final image (on the right) only shows one sphere.
|
An image
rendered from this viewpoint will show but one sphere, even though all
10 spheres will be scan-converted and compared to the Z-buffer and then
potentially written to the color buffer and Z-buffer. The simple conclusion
in this case is that nine spheres will be drawn unnecessarily. This
uninteresting scene is not that likely to be found in reality, but it
describes (from its viewpoint) a densely populated model. These sorts
of configurations are found in such real scenes as a rain forest, an
engine, a city, and the inside of a skyscraper.
Thus it
seems plausible that an algorithmic approach to avoid this kind of inefficiency
may pay off in terms of speed. Such approaches go under the name of
occlusion culling algorithms, since they try to cull away (avoid
drawing) objects that are occluded, that is, inside the view frustum
but not visible in the final image. The optimal occlusion culling algorithm
would select only the objects that are visible. In a sense, the Z-buffer
selects and renders only those objects which are visible, but not before
all objects are sent through the pipeline. The idea behind efficient
occlusion culling algorithms is to perform some simple tests early on
and so avoid sending data through much of the pipeline.
Pseudocode
for a general occlusion culling algorithm is shown in Figure 2, where
the function isOccluded, often called the visibility test, checks
whether an object is occluded. G is the set of geometrical objects
to be rendered; OR is the occlusion representation.
1: OcclusionCullingAlgorithm
(G)
2: OR=empty
3: for each object g in G
4: if(isOccluded(g,OR))
5: Skip(g)
6: else
7: Render(g)
8: Update(OR,g)
9: end
10: end |
|
Figure
2. Pseudocode for a general occlusion culling algorithm. G contains
all the objects in the scene, and OR is the occlusion representation.
|
Depending
on the particular algorithm, OR represents some kind of occlusion
information. OR is set to be empty at the beginning. After that
all objects (which passed the view-frustum culling test) are processed.
Consider
a particular object. First we test whether the object is occluded with
respect to the occlusion representation OR. If it is occluded,
then it is not processed further, since we then know that it will not
contribute to the image. If the object is determined not to be occluded,
then that object has to be rendered, since it probably contributes to
the image (at that point in the rendering). Finally OR is updated
with that object.
For some
algorithms, it is expensive to update the occlusion representation,
so this is only done once (before the actual rendering starts) with
the objects that are believed to be good occluders. This set is then
updated from frame to frame.
A number
of occlusion algorithms will be scrutinized in this section.
|