|
Hierarchical
Z-Buffering and the Hierarchical Visibility Algorithm
One
approach to occlusion culling is the hierarchical visibility
(HV) algorithm [Greene93]. This algorithm maintains the scene model
in an octree, and a frame's Z-buffer as an image pyramid, which we call
a Z-pyramid. The octree enables hierarchical culling of occluded regions
of the scene, and the Z pyramid enables hierarchical Z-buffering of
individual primitives and bounding volumes. The Z-pyramid is thus the
occlusion representation of this algorithm. Examples of these data structures
are shown in Figure 3.
Any
method can be employed for organizing scene primitives in an octree,
although Greene et al. [Greene93] recommend a specific algorithm that
avoids assigning small primitives to large octree nodes. In general,
an octree is constructed by enclosing the entire scene in a minimal
axis-aligned box. The rest of the procedure is recursive in nature,
and starts by checking whether the box contains fewer than a threshold
number of primitives. If it does, the algorithm binds the primitives
to the box and then terminates the recursion. Otherwise, it subdivides
the box along its main axes using three planes, thereby forming eight
boxes (hence the name octree). Each new box is tested and possibly subdivided
again into 2x2x2 smaller boxes. This process continues until each box
contains fewer than the threshold number of primitives, or until the
recursion has reached a specified deepest level [Samet89a,Samet89b].
This is illustrated in two dimensions, where the data structure is called
a quadtree, in Figure 4.
|
|
|
Figure
3. Example of occlusion culling with the hierarchical visibility
algorithm [Greene95], showing a complex scene (lower right) with
the corresponding Z-pyramid (on the left), and octree subdivision
(upper right). By traversing the octree from front to back and
culling occluded octree nodes as they are encountered, this algorithm
only visits visible octree nodes and their children (the nodes
portrayed at the upper right) and only renders the polygons in
visible boxes. In this example, culling of occluded octree nodes
reduces the depth complexity of the polygons that need to be rendered
from 84 to 2.5. (Courtesy of Ned Greene/Apple Computer.)
|
The construction
of an octree takes too much time to be done at runtime, so this method
is best suited for static models.
Once the
octree has been created, each frame is rendered in approximately front-to-back
order by calling the procedure ProcessOctreeNode (outlined in Figure
5) with the root node of the octree. Octree nodes that are outside the
view frustum are culled away. The first step determines whether the
node's bounding box is visible with respect to the Z-pyramid, using
a procedure that will be described later. In this case, a node's bounding
box is a box in the octree. If the node is occluded, we do not need
to process the contents of that box further, since its contents do not
contribute to the final image. Otherwise, we render the primitives associated
with the node into the Z-pyramid (tileInto in the pseudocode)
and then process each of the node's children (if it has any) in front-to-back
order using this same recursive procedure. When recursion finishes,
all visible primitives have been tiled into the Z-pyramid, and
a standard Z-buffer image of the scene has been created.
 |
|
Figure
4. The construction of a quadtree (which is the two-dimensional
version of an octree). The construction starts from the left by
enclosing all objects in a bounding box. Then the boxes are recursively
divided into four equal-sized boxes until each box (in this case)
is empty or contains one object.
|
The HV
algorithm performs occlusion culling very efficiently because it only
traverses visible octree nodes and their children, and it only renders
the primitives in visible nodes. This can save much of the work in scenes
that are densely occluded. For example, in the scene pictured in Figure
3, more than 99% of on-screen polygons are inside occluded octree nodes,
which are therefore culled by the Z pyramid [Greene95].
1: ProcessOctreeNode(OctreeNode
N)
2: if(isOccluded(NBV, ZP)) then return;
3: for each primitive p in N
4: tileInto(p, ZP)
5: end
6: for each child node C in N in front-to-back order
7: ProcessOctreeNode(C)
8: end |
|
Figure
5. Pseudocode for the hierarchical visibility algorithm. To render
a frame this procedure is called with the root node of the octree.
NBV is the bounding volume of the
octree node N, and ZP is
the Z-pyramid that is the occlusion representation of this algorithm.
The operation tileInto renders a primitive p into the Z-pyramid,
ZP, and this also updates the entire
Z-pyramid.
|
Now we
will describe how the Z-pyramid is maintained and how it is used to
accelerate culling. The finest (highest-resolution) level of the Z-pyramid
is simply a standard Z-buffer. At all other levels, each z-value is
the farthest z in the corresponding 2x2 window of the adjacent finer
level. Therefore each z-value represents the farthest z for a square
region of the screen. To maintain a Z-pyramid, whenever a z-value is
overwritten in the Z-buffer it is propagated through the coarser levels
of the Z-pyramid. This is done recursively until the top of the image
pyramid is reached, where only one z-value remains (this is illustrated
in Figure 6.
|
|
|
Figure
6. On the left, a 4x4 piece of the Z-buffer is shown. The numerical
values are the actual z-values. This is downsampled to a 2x2 region
where each value is the farthest (largest) of the four 2x2 regions
on the left. Finally, the farthest value of the remaining four
z-values is computed. These three maps compose an image pyramid
which is called the hierarchical Z-buffer.
|
Next,
we describe how hierarchical culling of octree nodes is done. To determine
whether a node is visible, the front faces of its bounding box are tested
against the Z-pyramid. The node is occluded if all of its front faces
are occluded by the Z-pyramid. To establish whether an individual face
is occluded, we begin at the coarsest Z-pyramid cell that encloses the
face's screen projection. The face's nearest depth within the cell
(znear) is then compared to the Z-pyramid value, and if znear is farther,
the face is known to be occluded. For densely occluded scenes, this
procedure often culls an occluded face with a single depth comparison.
When this initial test fails to cull a face, its visibility can be definitively
established by recursively traversing from the initial Z-pyramid cell
down to finer levels in the Z-pyramid. Additional depth comparisons
at the encountered child cells must be performed. If this subdivision
procedure does not ultimately find a visible sample on the face, the
face is occluded. In scenes where the bounding boxes of octree nodes
overlap deeply on the screen, this hierarchical procedure can establish
visibility much more efficiently than can conventional Z-buffering.
For example, in the scene pictured in Figure 3, hierarchical culling
of octree nodes with the Z-pyramid generates roughly one hundred times
fewer depth comparisons than visibility testing with conventional Z-buffering.
As we
have seen, the original HV algorithm [Greene93] used a Z-pyramid only
for occlusion tests and employed traditional Z-buffer scan conversion
to render the polygons in visible octree nodes. Subsequently, a more
efficient method, called hierarchical polygon tiling [Greene96],
was developed. This algorithm adapts the basic screen subdivision procedure
described above to polygon "tiling," that is, finding the
visible samples on a polygon. When tiling a polygon into the Z-pyramid,
it must be determined, at each level of subdivision where the polygon
is compared to a Z-pyramid cell, whether the polygon overlaps the cell
and if so, whether the polygon is occluded by the cell. Overlap tests
are accelerated with coverage masks that indicate which subcells within
a Z-pyramid cell are covered by the polygon, and occlusion tests are
performed by comparing the polygon's nearest z-value within the cell
to the Z-pyramid value. This procedure for hierarchical Z-buffering
is very efficient, because it only traverses regions of the screen where
a polygon is visible or nearly visible.
Hierarchical
polygon tiling can be performed even more efficiently if the polygons
in a scene are organized into a BSP tree or an "octree of BSP trees"
[Greene96]. The reason is that this enables traversing polygons in strict
front-to-back order, which eliminates the need to maintain depth information.
Rather, the only occlusion information required is whether or not each
image sample has been written, and this information can be maintained
in an image pyramid of coverage masks called a coverage pyramid.
This is the data structure maintained by hierarchical polygon tiling
with coverage masks [Greene96], which is similar to hierarchical Z-buffering
except that occlusion tests are performed with coverage-mask operations
instead of depth comparisons, which accelerates tiling considerably.
The HV
algorithm implemented with hierarchical tiling may well be the most
efficient method known for software rendering of complex scenes composed
of polygons, but it is not fast enough for real-time rendering of complex
scenes on today's microprocessors. To enable real-time rendering, Greene
et al. [Greene93] suggest modifying hardware Z-buffer pipelines to support
HV, which requires substituting a Z-pyramid for the Z-buffer, and including
a fast feedback path to report visibility of bounding volumes. It is
likely that these modifications would extend the domain of real-time
rendering to much more complex scenes, such as the scene in Figure 3.
In the
absence of this kind of hardware support, the HV algorithm can be accelerated
on systems having conventional Z-buffer hardware by exploiting frame-to-frame
coherency [Greene93]. The idea is that octree nodes that were visible
in one frame tend to be visible in the next. With this variation, the
first frame of an animation sequence is generated with the standard
HV algorithm, except that after completing the frame, a list of octree
nodes that were visible in that frame (the visible node list) is created
by testing nodes for visibility against the Z-pyramid. Subsequent frames
are generated with the following two-pass algorithm. In the first rendering
pass, primitives associated with nodes on the visible node list are
rendered by Z-buffer hardware. Then, the Z-buffer of the partially rendered
scene is read back from the hardware, and a Z-pyramid is built from
this Z-buffer. In the second rendering pass, the standard HV algorithm
is run in software, traversing the octree from front to back but skipping
nodes which have already been rendered. This second pass fills in any
missing parts of the scene. The final step in processing a frame is
to update the visible node list. Typically, this variation of the HV
algorithm runs considerably faster than the all-software version, because
nearly all visible polygons are rendered with Z-buffer hardware.
Greene
and Kass [Greene94b] have developed an extension to hierarchical Z-buffering
which renders antialiased scenes with error bounds. Another interesting
algorithm for occlusion culling is the visibility skeleton developed
by Durand et al. [Durand97,Durand97b].
|