Gamasutra is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.

Gamasutra: The Art & Business of Making Gamesspacer
Sponsored Feature: Rasterization on Larrabee -- Adaptive Rasterization Helps Boost Efficiency
View All     RSS
October 21, 2020
arrowPress Releases
October 21, 2020
Games Press
View All     RSS

If you enjoy reading this site, you might also want to check out these UBM Tech sites:


Sponsored Feature: Rasterization on Larrabee -- Adaptive Rasterization Helps Boost Efficiency

November 13, 2009 Article Start Previous Page 3 of 6 Next

Tile Assignment

As noted earlier, Larrabee uses "chunked" (also known as "binned" or "tiled") rendering, where the target is divided into multiple rectangles or tiles. The rendering commands are sorted according to the tiles they touch and stored in the corresponding bins, then the contents of each bin are rendered separately to the corresponding tile. It's a bit complex, but it considerably improves cache coherence and parallelization.

For chunking, rasterization consists of two steps: The first identifies which tiles a triangle touches, and the second rasterizes the triangle within each tile. So it's a two-stage process and I'm going to discuss the two stages separately.

Figure 12 shows an example of a triangle to be drawn to a tiled render target. The light blue area is a 256×256 render target, subdivided into four 128×128 tiles.

With Larrabee's chunking architecture, the first step in rasterizing the triangle in Figure 12 is to determine which tiles the triangle touches and put the triangle in the bins for those tiles (tiles 1 and 3). Once all triangles have been binned, the second step, intra-tile rasterization, is to rasterize the triangle to tile 1 when the tile 1 bin is rendered, and to rasterize the triangle to tile 3 when the tile 3 bin is rendered.

Assignment of triangles to tiles can easily be performed for relatively small triangles - say, up to a tile in size, which covers 90% of all triangles - by doing bounding box tests. For example, it would be easy with bounding box tests to find out what two tiles the triangle in Figure 12 is in. Larger triangles are currently assigned to tiles by simply walking the bounding box and testing each tile against the triangle; that doesn't sound very efficient, but tile-assignment time is generally an insignificant part of total rendering time for larger triangles because there's usually a lot of shading work and the like to do for those triangles. However, if large-triangle tile-assignment time does turn out to be significant, we could use a sweep approach, as discussed earlier, or a variation of the hierarchical approach used for intra-tile rasterization, which I'll discuss next. This is a good example of how a CPU makes it easy to use two completely different approaches in order to do the 90% case well and the 10% case adequately (or well, but in a different way), rather than having to have one size fit all.

Large-triangle assignment to tiles is performed with scalar code for simplicity and because it's not a significant performance factor. Let's look at how that scalar process works because it will help us understand vectorized intra-tile rasterization later. I'll use a small triangle for the example to make the figures legible, but as previously noted, such a small triangle normally would be assigned to its tile or tiles using bounding box tests.

Once we've set up the equation for an edge (by calculating B and C, as discussed when we looked at Figure 1), the first thing we do is calculate its value at the trivial reject corner of each tile. The trivial reject corner is the corner at which an edge's equation is most negative within a tile; the selection of the trivial reject corner for a given edge is based on its slope, as we'll see shortly. We set things up so that negative means "inside" in order to allow us to generate masks directly from the sign bit. So you can think of the trivial reject corner as the point in the tile that's most inside the edge. If this point isn't inside the edge, no point in the tile can be inside the edge, and therefore, the whole triangle can be ignored for that tile.

Figure 13 shows the trivial reject test in action. Tile 0 is trivially rejected for the black edge and can be ignored because its trivial reject corner is positive, and therefore, the whole tile must be positive and must lie outside the triangle. Meanwhile, the other three tiles must be investigated further. You can see here how the trivial reject corner is the corner of each tile most inside the black edge; that is, the point with the most negative value in the tile.

Note that determining which corner is the trivial reject corner will vary from edge to edge depending on slope. For example, it would be the lower left corner of each tile for the edge shown in red in Figure 14 because that's the corner that's most inside that edge.

If you understand what we've just discussed, you're ninety percent of the way to understanding the whole Larrabee rasterizer. The trivial reject test is actually very straightforward once you understand it - it's just a matter of evaluating the sign of a simple equation at the right point - but it can take a little while to get it, so you may find it useful to re-read the previous section if you're at all uncertain or confused.

So that's the tile trivial reject test. The other tile test is the trivial accept test. For this, we take the value at the trivial reject corner (the corner we just discussed) and add the amount that the edge equation changes for a step all the way to the diagonally opposite tile corner, the tile trivial accept corner. This is the point in the tile where the edge equation is most positive. You can think of this as the point in the tile that's most outside the edge. If the trivial accept corner for an edge is negative, that whole tile is trivially accepted for that edge and there's no need to consider that edge when rasterizing within the tile.

Figure 15 shows the trivial accept test in action. Because the trivial accept corner is the corner at which the edge's equation is most positive, if this point is negative - and therefore inside the edge - all points in the tile must be inside the edge. Thus, tiles 0 and 1 are not trivially accepted for the black edge, because the equation for the black edge is positive at their trivial accept corners, but tiles 2 and 3 are trivially accepted, so rasterization of this triangle in tiles 2 and 3 can ignore the black edge entirely, saving a good bit of work.

There's an important asymmetry here. When we looked at trivial reject, we saw that it applies to the whole triangle in the tile being drawn to, in the sense that trivially rejecting a tile against any edge means the triangle doesn't touch that tile, so the triangle can be ignored in that tile. However, trivial accept applies only to the edge being checked; that is, trivially accepting a tile against an edge only means that the specific edge doesn't have to be checked when rasterizing the triangle to that tile because the whole tile is inside that edge; it has no direct implication for the whole triangle. The tile may be trivially accepted against one edge, but not against the others. In fact, it may be trivially rejected against one or both of the other edges, in which case, the triangle won't be drawn to the tile at all. This is illustrated in Figure 16, where tile 3 is trivially accepted against the black edge, so the black edge wouldn't need to be considered in rasterizing the triangle to that tile, but tile 3 is trivially rejected against the red edge, and that means that the triangle doesn't have to be rasterized to tile 3 at all.

In Figure 17, however, tile 3 is trivially accepted by all three edges, and here we come to a key point.

If all three edges are negative at their respective trivial accept corners, then the whole tile is inside the triangle, and no further rasterization tests are needed - and this is what I meant earlier when I said the rasterizer takes advantage of CPU smarts by not-rasterizing whenever possible. The tile-assignment code can just store a draw-whole-tile command in the bin. Then the bin rendering code can simply do the equivalent of two nested loops around the shaders, resulting in a fullscreen triangle rasterization speed of approximately infinity - one of my favorite performance numbers!

By the way, this whole process should be familiar to 3D programmers because testing of bounding boxes against planes (for example, for frustum culling) is normally done in exactly the same way - although in three dimensions instead of two - with the same use of signs to indicate inside and outside for trivial accept and reject. Also, structures such as octrees employ a 3D version of the hierarchical recursion used by the Larrabee rasterizer.

That completes our overview of how rasterization of large triangles for tile assignment works. As I said, this is done as a scalar evaluation in the Larrabee pipeline, so the trivial accept and reject tests for each tile are performed separately. Intra-tile rasterization, to which we turn next, is much the same, but vectorized. And it is this vectorization that will give us insight into applying the Larrabee New Instructions to semi-parallel tasks.

Article Start Previous Page 3 of 6 Next

Related Jobs

innogames — Hamburg, Germany

Mobile Software Developer (C++) - Video Game: Forge of Empires
Johnson County Community College
Johnson County Community College — Overland Park, Kansas, United States

Assistant Professor, Game Development
Insomniac Games
Insomniac Games — Burbank, California, United States

Lead Gameplay Programmer
Insomniac Games
Insomniac Games — Burbank, California, United States

Lead Engine Programmer

Loading Comments

loader image