Occlusion Culling Algorithms
May 25, 2017
Press Releases
May 25, 2017
Games Press

If you enjoy reading this site, you might also want to check out these UBM Tech sites:

Occlusion Culling Algorithms

November 9, 1999 Page 2 of 4

Hierarchical Z-Buffering and the Hierarchical Visibility Algorithm

One approach to occlusion culling is the hierarchical visibility (HV) algorithm [Greene93]. This algorithm maintains the scene model in an octree, and a frame's Z-buffer as an image pyramid, which we call a Z-pyramid. The octree enables hierarchical culling of occluded regions of the scene, and the Z pyramid enables hierarchical Z-buffering of individual primitives and bounding volumes. The Z-pyramid is thus the occlusion representation of this algorithm. Examples of these data structures are shown in Figure 3.

Any method can be employed for organizing scene primitives in an octree, although Greene et al. [Greene93] recommend a specific algorithm that avoids assigning small primitives to large octree nodes. In general, an octree is constructed by enclosing the entire scene in a minimal axis-aligned box. The rest of the procedure is recursive in nature, and starts by checking whether the box contains fewer than a threshold number of primitives. If it does, the algorithm binds the primitives to the box and then terminates the recursion. Otherwise, it subdivides the box along its main axes using three planes, thereby forming eight boxes (hence the name octree). Each new box is tested and possibly subdivided again into 2x2x2 smaller boxes. This process continues until each box contains fewer than the threshold number of primitives, or until the recursion has reached a specified deepest level [Samet89a,Samet89b]. This is illustrated in two dimensions, where the data structure is called a quadtree, in Figure 4.

 Figure 3. Example of occlusion culling with the hierarchical visibility algorithm [Greene95], showing a complex scene (lower right) with the corresponding Z-pyramid (on the left), and octree subdivision (upper right). By traversing the octree from front to back and culling occluded octree nodes as they are encountered, this algorithm only visits visible octree nodes and their children (the nodes portrayed at the upper right) and only renders the polygons in visible boxes. In this example, culling of occluded octree nodes reduces the depth complexity of the polygons that need to be rendered from 84 to 2.5. (Courtesy of Ned Greene/Apple Computer.)

The construction of an octree takes too much time to be done at runtime, so this method is best suited for static models.

Once the octree has been created, each frame is rendered in approximately front-to-back order by calling the procedure ProcessOctreeNode (outlined in Figure 5) with the root node of the octree. Octree nodes that are outside the view frustum are culled away. The first step determines whether the node's bounding box is visible with respect to the Z-pyramid, using a procedure that will be described later. In this case, a node's bounding box is a box in the octree. If the node is occluded, we do not need to process the contents of that box further, since its contents do not contribute to the final image. Otherwise, we render the primitives associated with the node into the Z-pyramid (tileInto in the pseudocode) and then process each of the node's children (if it has any) in front-to-back order using this same recursive procedure. When recursion finishes, all visible primitives have been tiled into the Z-pyramid, and a standard Z-buffer image of the scene has been created.

 Figure 4. The construction of a quadtree (which is the two-dimensional version of an octree). The construction starts from the left by enclosing all objects in a bounding box. Then the boxes are recursively divided into four equal-sized boxes until each box (in this case) is empty or contains one object.

The HV algorithm performs occlusion culling very efficiently because it only traverses visible octree nodes and their children, and it only renders the primitives in visible nodes. This can save much of the work in scenes that are densely occluded. For example, in the scene pictured in Figure 3, more than 99% of on-screen polygons are inside occluded octree nodes, which are therefore culled by the Z pyramid [Greene95].

 1: ProcessOctreeNode(OctreeNode N) 2: if(isOccluded(NBV, ZP)) then return; 3: for each primitive p in N 4: tileInto(p, ZP) 5: end 6: for each child node C in N in front-to-back order 7: ProcessOctreeNode(C) 8: end Figure 5. Pseudocode for the hierarchical visibility algorithm. To render a frame this procedure is called with the root node of the octree. NBV is the bounding volume of the octree node N, and ZP is the Z-pyramid that is the occlusion representation of this algorithm. The operation tileInto renders a primitive p into the Z-pyramid, ZP, and this also updates the entire Z-pyramid.

Now we will describe how the Z-pyramid is maintained and how it is used to accelerate culling. The finest (highest-resolution) level of the Z-pyramid is simply a standard Z-buffer. At all other levels, each z-value is the farthest z in the corresponding 2x2 window of the adjacent finer level. Therefore each z-value represents the farthest z for a square region of the screen. To maintain a Z-pyramid, whenever a z-value is overwritten in the Z-buffer it is propagated through the coarser levels of the Z-pyramid. This is done recursively until the top of the image pyramid is reached, where only one z-value remains (this is illustrated
in Figure 6.

 Figure 6. On the left, a 4x4 piece of the Z-buffer is shown. The numerical values are the actual z-values. This is downsampled to a 2x2 region where each value is the farthest (largest) of the four 2x2 regions on the left. Finally, the farthest value of the remaining four z-values is computed. These three maps compose an image pyramid which is called the hierarchical Z-buffer.

Next, we describe how hierarchical culling of octree nodes is done. To determine whether a node is visible, the front faces of its bounding box are tested against the Z-pyramid. The node is occluded if all of its front faces are occluded by the Z-pyramid. To establish whether an individual face is occluded, we begin at the coarsest Z-pyramid cell that encloses the face's screen projection. The face's nearest depth within the cell
(znear) is then compared to the Z-pyramid value, and if znear is farther, the face is known to be occluded. For densely occluded scenes, this procedure often culls an occluded face with a single depth comparison. When this initial test fails to cull a face, its visibility can be definitively established by recursively traversing from the initial Z-pyramid cell down to finer levels in the Z-pyramid. Additional depth comparisons at the encountered child cells must be performed. If this subdivision procedure does not ultimately find a visible sample on the face, the face is occluded. In scenes where the bounding boxes of octree nodes overlap deeply on the screen, this hierarchical procedure can establish visibility much more efficiently than can conventional Z-buffering. For example, in the scene pictured in Figure 3, hierarchical culling of octree nodes with the Z-pyramid generates roughly one hundred times fewer depth comparisons than visibility testing with conventional Z-buffering.

As we have seen, the original HV algorithm [Greene93] used a Z-pyramid only for occlusion tests and employed traditional Z-buffer scan conversion to render the polygons in visible octree nodes. Subsequently, a more efficient method, called hierarchical polygon tiling [Greene96], was developed. This algorithm adapts the basic screen subdivision procedure described above to polygon "tiling," that is, finding the visible samples on a polygon. When tiling a polygon into the Z-pyramid, it must be determined, at each level of subdivision where the polygon is compared to a Z-pyramid cell, whether the polygon overlaps the cell and if so, whether the polygon is occluded by the cell. Overlap tests are accelerated with coverage masks that indicate which subcells within a Z-pyramid cell are covered by the polygon, and occlusion tests are performed by comparing the polygon's nearest z-value within the cell to the Z-pyramid value. This procedure for hierarchical Z-buffering is very efficient, because it only traverses regions of the screen where a polygon is visible or nearly visible.

Hierarchical polygon tiling can be performed even more efficiently if the polygons in a scene are organized into a BSP tree or an "octree of BSP trees" [Greene96]. The reason is that this enables traversing polygons in strict front-to-back order, which eliminates the need to maintain depth information. Rather, the only occlusion information required is whether or not each image sample has been written, and this information can be maintained in an image pyramid of coverage masks called a coverage pyramid. This is the data structure maintained by hierarchical polygon tiling with coverage masks [Greene96], which is similar to hierarchical Z-buffering except that occlusion tests are performed with coverage-mask operations instead of depth comparisons, which accelerates tiling considerably.

The HV algorithm implemented with hierarchical tiling may well be the most efficient method known for software rendering of complex scenes composed of polygons, but it is not fast enough for real-time rendering of complex scenes on today's microprocessors. To enable real-time rendering, Greene et al. [Greene93] suggest modifying hardware Z-buffer pipelines to support HV, which requires substituting a Z-pyramid for the Z-buffer, and including a fast feedback path to report visibility of bounding volumes. It is likely that these modifications would extend the domain of real-time rendering to much more complex scenes, such as the scene in Figure 3.

In the absence of this kind of hardware support, the HV algorithm can be accelerated on systems having conventional Z-buffer hardware by exploiting frame-to-frame coherency [Greene93]. The idea is that octree nodes that were visible in one frame tend to be visible in the next. With this variation, the first frame of an animation sequence is generated with the standard HV algorithm, except that after completing the frame, a list of octree nodes that were visible in that frame (the visible node list) is created by testing nodes for visibility against the Z-pyramid. Subsequent frames are generated with the following two-pass algorithm. In the first rendering pass, primitives associated with nodes on the visible node list are rendered by Z-buffer hardware. Then, the Z-buffer of the partially rendered scene is read back from the hardware, and a Z-pyramid is built from this Z-buffer. In the second rendering pass, the standard HV algorithm is run in software, traversing the octree from front to back but skipping nodes which have already been rendered. This second pass fills in any missing parts of the scene. The final step in processing a frame is to update the visible node list. Typically, this variation of the HV algorithm runs considerably faster than the all-software version, because nearly all visible polygons are rendered with Z-buffer hardware.

Greene and Kass [Greene94b] have developed an extension to hierarchical Z-buffering which renders antialiased scenes with error bounds. Another interesting algorithm for occlusion culling is the visibility skeleton developed by Durand et al. [Durand97,Durand97b].

Page 2 of 4

### Related Jobs

Disruptor Beam — FRAMINGHAM, Massachusetts, United States
[05.25.17]

Senior Data Analyst
Intrepid Studios Inc — San Diego, California, United States
[05.24.17]

Associate Programmer
Heart Machine — Culver City, California, United States
[05.24.17]

Graphics Engineer
System Era Softworks — Seattle, Washington, United States
[05.24.17]

Senior Technical Designer