It's free to join Gamasutra!|Have a question? Want to know who runs this site? Here you go.|Targeting the game development market with your product or service? Get info on advertising here.||For altering your contact information or changing email subscription preferences.
Registered members can log in here.Back to the home page.

Search articles, jobs, buyers guide, and more.

By Sean O'Neil
Gamasutra
[Author's Bio]
August 10, 2001

Introduction

Mapping ROAM to a Sphere

Looking at the Code

Printer Friendly Version
 
Discuss this Article

 

Letters to the Editor:
Write a letter
View all letters


Features

A Real-Time Procedural Universe, Part Two: Rendering Planetary Bodies

Mapping ROAM to a Sphere

Because the traditional ROAM algorithm only allows you to split a square, you have to start with a solid 3D shape that consists only of squares. To prevent cracks from forming in the mesh, the edges of these squares must fit together seamlessly in all directions. Each square in this shape will be treated like a top-level square, and each triangle like the root of a separate binary tree. Neighbor pointers have to be set up between these top-level triangles properly so that splitting at the edge of one top-level square can split the corresponding triangle on the adjacent edge.

The obvious 3D shape that consists of all squares with edges wrapping seamlessly in all directions is a cube. Because a subdivided cube doesn't give a well-proportioned sphere, I spent a lot of time trying to come up with a more appropriate shape, but I couldn't find anything that fit the requirements. Later on, I realized that I was wasting my time because the split priorities would quickly take care of the sphere's proportions. If there is too much error in one triangle because its longest edge is out of proportion, it will be split.


Figure 2 - Screenshots of the spherical ROAM algorithm at different levels of detail.

Once you set up the 12 root triangles of the cube with the proper neighbor pointers, everything falls into place and the traditional ROAM algorithm works beautifully. In fact, it's even easier to implement than the flat ROAM algorithm because you don't have to test for conditions at the edges of the map. The only thing you need to treat differently is how you implement your height values. Instead of simply adjusting a vertex's y coordinate, you need to treat your height value as an offset from the sphere's radius. Mathematically, treat each new vertex as a direction vector from the center of the sphere. Normalize the vector and multiply it by the sphere's radius plus the height offset.

One other simplification I made to the ROAM algorithm was to use a linked list of triangles instead of a binary tree. The tree structure takes up more memory, slows the routine down, and is more complicated to implement. A linked list is sufficient as long as each triangle keeps a pointer to its edge neighbors and the triangle it was split from (its parent). The list is initialized with the 12 top-level triangles, which have their edge pointers set up properly and their parent pointers set to NULL. When you split a square, add two new triangles to the list with their parent pointers set to the two existing triangles, then add a vertex and rearrange the vertices and edge pointers of the four triangles. Do the reverse when merging a diamond.

To simplify the split and merge operations, it is a good idea to maintain the same order of the vertices and neighbor pointers in the triangle object. I chose to order my vertices (0, 1, 2) counter-clockwise with vertex 1 always being opposite the longest edge. My edge neighbor pointers are ordered based on the edge shared, 0-1, 1-2, and 2-0 respectively. Since vertex 1 is opposite the longest edge, this means that the last edge pointer (i.e. array index 2) always shares the longest edge.


Figure 3: Vertex and edge pointer indexes (edges in red). When splitting, go from left to right. When merging, go from right to left.

Note from Figure 3 that it's not that hard to determine whether a triangle is part of a square or diamond using edge pointers. If triangle1->edge[2]->edge[2] = triangle1, then triangle1 is part of a square. Otherwise, you need to split triangle1->edge[2] to make triangle1 part of a square. If triangle1->edge[0]->edge[0]->edge[0]->edge[0] = triangle1 (you can also replace those 0's with 1's), then the triangle is part of a diamond.


Optimizing the Update() function

Now that you have the basic idea behind my spherical ROAM algorithm, you need to know how to optimize it for a dynamic planet generation algorithm like the one I explained in Part One. The first key to making it fast enough for real-time rendering is to realize that the fractal function you're calling is very slow. Even if you change it to use some other type of algorithm, it is relatively safe to assume that it won't be a cheap function to call. Your only hope to get it to run fast enough is to minimize the number of calls to that function.

Keep in mind that you have to update the priority of every triangle in the mesh each frame to determine whether it needs to be split or not. That priority depends on the height of the new vertex, which is determined by calling the fractal function. So to keep from having to call the fractal function for every triangle every frame, you need cache the new height value in your triangle object. So as each triangle is created from a split, you should call the fractal function to get the height value for the next potential split.

Once I had that implemented, I realized I was still calling the fractal routine twice as often as necessary. Because each triangle shares its longest edge with another triangle in a square, the second triangle to be created for that square was making the same call to the fractal routine. It's easy enough to determine whether a triangle is completing a square at the time of its creation, and if so borrow the calculated height value from its neighbor. Also note that there is also no need to call the fractal routine when you merge a diamond because the vertex you are removing has the information you need. Just take the vertex's height value and use it to compute the offset to put into the merged triangle objects.

Next I focused on the priority calculations. The first thing to keep in mind is that if the merge priority is not calculated in the same way the split priority is calculated for a triangle's parent, you quickly get into a situation where each frame you will split several triangles only to have them merged together again right away. You can also get into a situation where unneeded triangles are kept around long after they're needed. Because I'm not using a binary tree, I'm not keeping track of the original state of the parent triangles. This was easy enough to solve by adding a few members to my diamond object to keep track of the necessary information.

There are a number of other changes I can think of off the top of my head that could theoretically speed it up even more. The simplest way is to come up with some logic to avoid doing a priority check on every triangle every frame. At 50,000 triangles, this can be quite a burden on the CPU. I recently read an interesting article that described an improved split priority concept. In this article, the priority calculations were pre-calculated to determine the distance at which a triangle would be split or a diamond merged, and updates were driven mainly off of distance checks. The author then grouped polygons together into a tree using bounding spheres to greatly reduce the number of distance checks that needed to be made each frame. I'm not clear on the details surrounding their implementation, but they reported a dramatic improvement in their Update() function.


Optimizing the Draw() function


Next I concentrated on the Draw() function, and switched from direct calls to glVertex, glNormal, and glTexCoord to a vertex array. I created a dynamically sizing array class, and made it global so that all ROAM objects use the same vertex array. This optimization gave a noticeable boost in frame rate, which illustrated to me that my optimizations to the Update() function had paid off and the bottleneck had moved over to the rendering pipeline. If the rest of the code had been poorly optimized, I would've seen a very small improvement optimizing the rendering pipeline.

Then I looked at the Nvidia documentation to find the fastest way to push non-static triangles to the video card. I found that because I had already implemented a vertex array, it took very little effort to implement the GL_NV_vertex_array_range and GL_NV_fence extensions. These extensions allow you to copy vertices directly to AGP or video memory, and when you render with the standard glDrawElements function, it renders them extremely quickly. I believe the effect is the same as using DirectX's vertex buffers stored in video memory.

Given all these optimizations, I get about 20 FPS rendering 50,000 triangles using a 750 MHz Duron with a GeForce2 MX video card. When I toggle the mesh updates (i.e. skipping the Update() call) on the same machine, it speeds up to 45 FPS with 50,000 triangles, so there's still room for improvement in the update routine. The only further optimizations I can think of making to the Draw() function would be to see if I could convert my triangle lists to a set of strips or fans, but that would require even more changes to the Update() function to maintain the list of strips and fans. Given the frame rates I'm currently getting, I can't imagine it would improve performance much unless I came up with a simple way to maintain them.

______________________________________________________

Looking at the Code


join | contact us | advertise | write | my profile
news | features | companies | jobs | resumes | education | product guide | projects | store



Copyright © 2003 CMP Media LLC

privacy policy
| terms of service