Contents
Sponsored Feature: Multi-Threaded Fluid Simulation for Games
 
 
Printer-Friendly VersionPrinter-Friendly Version
 


Part of:



[More information...]
 

Latest News
spacer View All spacer
 
November 22, 2009
 
Video Game Watchdog National Institute On Media And The Family Shutting Down [11]
 
Modern Warfare 2 Infinity Ward's 'Most Successful PC Version' Yet [13]
 
New Tech, Design Details Of Project Natal To Emerge At Gamefest In February
spacer
Latest Jobs
spacer View All     Post a Job     RSS spacer
 
November 22, 2009
 
Trion Redwood City
Sr. Environment Artist
 
Trion Redwood City
Sr. Evnironment Modeler
 
Sucker Punch Productions
3D Environment Artist
 
Sucker Punch Productions
Network Programmer
 
Sucker Punch Productions
Texture Artist
 
Sucker Punch Productions
Character Artist
 
Crystal Dynamics
Sr. Level Designer
 
Monolith Productions
Sr. Software Engineer, Engine - Monolith Productions - #113767
spacer
Latest Features
spacer View All spacer
 
November 22, 2009
 
arrow Upping The Craft: Susan O'Connor On Games Writing [6]
 
arrow Small Developers: Minimizing Risks in Large Productions - Part II [7]
 
arrow iPhone Piracy: The Inside Story [49]
 
arrow And Yet It Grows: Analyzing the Size and Growth of the European Game Market [5]
 
arrow NPD: Behind the Numbers, October 2009 [13]
 
arrow Reflecting On Uncharted 2: How They Did It [5]
 
arrow Sponsored Feature: Rasterization on Larrabee -- Adaptive Rasterization Helps Boost Efficiency
 
arrow Postmortem: Wadjet Eye's The Blackwell Convergence [2]
spacer
Latest Blogs
spacer View All     Post     RSS spacer
 
November 22, 2009
 
Time Fcuk [2]
 
Accepting the Inherent Value of Games
 
Planckogenesis, Part II: Song Structure & Gravy Train [1]
spacer
About
spacer News Director:
Leigh Alexander
Features Director:
Christian Nutt
Editor At Large:
Chris Remo
Advertising:
John 'Malik' Watson
Recruitment/Education:
Gina Gross
 
Features
  Sponsored Feature: Multi-Threaded Fluid Simulation for Games
by Quentin Froemke
8 comments
Share RSS
 
 
June 17, 2009 Article Start Previous Page 2 of 2
 


Figure 4. Breaking up a 2D grid into tasks and an individual cell adjacent to other tasks.

For example, in the simulation there is a diffusion step followed by a force update step and later, an advection step. At each step the grid is broken into pieces, and the tasks write their data to a shared output grid, which becomes the input into the next step. In most cases, this approach works well.

However, a problem can arise if the calculation uses its own results in the current step. That is, the output of cell 1 is used to calculate cell 2. Looking at Figure 4, it is obvious that certain cells, like the shaded cell, will be processed in a task separate from the task in which their immediate neighbors are processed.

Since each task uses the same input, the neighboring cells' data will be different in the multi-threaded version compared to the serial version. Whether this is an issue depends on which portion of the simulation we are in.

Inaccuracies in the results in the diffusion step can occur, but they are not noticeable and don't introduce destabilizing effects. In most cases the inaccuracies are smoothed out by subsequent iterations over the data and have no lasting impact on the simulation. This kind of behavior is acceptable because the primary concern is visual appeal, not numerical accuracy.

However, in other cases inaccuracies can accumulate, introducing a destabilizing effect on the simulation, which can quickly destroy any semblance of normal behavior. An example of this occurs while processing the edge cases in diffusion-where diffusion is calculated along the edges of the grid or cube.

When run in parallel, the density does not diffuse out into the rest of the volume and instead accumulates at the edges, forming a solid border. For this situation and similar ones, we simply run that portion serially.

The main computational load of the simulation occurs when each step loops over the grid. These are all nested loops, so a straightforward way to break them up is by using the "parallel for" construction of TBB. In the example case of density diffusion, the code went from the serial version shown in Figure 3 to the parallel version shown in Figures 5 and 6.


Figure 5. Using Intel Threading Building Blocks to call the diffusion algorithm.

Figure 5 shows how TBB is used to call the diffusion algorithm. The arguments uBegin and uEnd are the start and end values of the total range to process, and uGrainSize is how small to break the range into.

For example, if uBegin is 0 and uEnd is 99, the total range is 100 units. If uGrainSize is 10, 10 tasks will be submitted, each of which will process a range of 10.

Figure 6 shows the actual algorithm called by the TBB task manager. The range to process is passed in, and the input and output grids are variables common to each job. As a result of using multi-threading, the performance in frames per second was improved by 3.76× when going from one thread to eight threads on an Intel Core i7 processor.


Figure 6. A code sample that shows 3D, multi-threaded diffusion.

Rendering

Although the goal of the project was to take unused CPU cycles and use them to compute a CFD simulation, the results won't have much impact without a nice, visual representation.

There are many methods of rendering volumetric data, but we like the results we got using the ray casting method detailed in Real-Time Volume Graphics.3 Using this method, the volumetric data is copied into a 3D texture, which is used in a pixel shader on the GPU.

The simulation volume is rendered as a cube at the position of the volume in the 3D scene. For each visible pixel, a ray is cast from the camera to this pixel on the cube. The corresponding point in the 3D texture is found and sampled and the ray marches through the texture at a fixed step size, accumulating the density data as it goes.

Conclusion

With Kaboom, we created a modular fluid simulation that adds to the existing knowledge base. It operates in 3D, is independent of the rendering portion, and is multi-threaded to take advantage of multi-core processors.

Of course, there is still plenty of future work that someone can do with this code sample. On the simulation side, a new or modified algorithm could be used to remove the serial requirements.

Also, we found that a section of the code that does bilinear interpolation calculations takes up a significant amount of processing time, which could be reduced by storing the results in a lookup table.

Another interesting avenue to explore is the introduction of arbitrary boundaries. Since the simulation is running on the CPU it has access to and can react with the geometry data of the scene. The rendering could be enhanced with additional effects, such as shadowing or dynamic lighting.

Resources

For more information about Intel Threading Building Blocks, visit http://www.threadingbuildingblocks.org

3 Engel, Klaus, ed. Real-Time Volume Graphics, Wellesley, Massachusetts: A. K. Peters, Ltd., 2006.

 
Article Start Previous Page 2 of 2
 
Comments

Nicholas Sherlock
profile image
The executables zip file is protected by a password??

Mark Miller
profile image
My first guess was "intel" and I was right :-).

Simon Carless
profile image
Apologies - I'll post again when this is fixed.

Ondrej Spanel
profile image
The archive does not contain necessary dlls, and it even does not tell what dlls are needed or where could one get them.

So far I have found:

tbb.dll (Thread Building Blocks)
MSVCP80.dll (Some version of MSVC runtime)

Not knowing where to get them, I am unable to run the demo.

Jeff Freeman
profile image
The accompanying Readme.txt files notes build requirements. Legal requirements prevented us from including some of the redistributable binaries.

Cheers,
Jeff

Running the Application
-----------------------

Redistributables
----------------

If Visual Studio 2008 SP1 and the DirectX SDK (November 2008) are not installed, you
will need to download and install the following redistributables:

Visual C++ 9
http://www.microsoft.com/downloads/details.aspx?familyid=A5C84275-3B97-4AB7-A40D
-3802B2AF5FC2&displaylang=en


DirectX November 2008
http://www.microsoft.com/DOWNLOADS/details.aspx?displaylang=en&FamilyID=886acb56
-c91a-4a8e-8bb8-9f20f1244a8e


Intel(R) Threading Building Blocks
----------------------------------

Download a release of TBB from: http://www.threadingbuildingblocks.org/
and copy tbb.dll and tbbmalloc.dll from the download to the same directory containing
FluidRender2D.exe and FluidRender3D.exe.

Simon Carless
profile image
The password issue is fixed, incidentally, thanks for being patient.

Robert Allen
profile image
Again I'll ask: why didn't Intel use the POSIX thread model? I'm curious.

Jeff Freeman
profile image
Robert,

We wanted to highlight a parallel programming approach without the need for a custom thread pool model. Often times such implementations become designed around functional versus task based approaches that otherwise artificially limit (undersubscribe) available CPU cores. The task based approach to parallelization taken here exploits TBB's ability to fully subscribe available CPU cores, taking equal advantage of available CPU horsepower on today's multi-core processors and beyond. TBB is an excellent fit here since it's Cilk-based task stealing approach lets the simulation scale with available CPU cores.

Cheers,
Jeff
Software Engineer/Visual Computing Software Division/Intel


none
 
Comment:
 


Submit Comment