Sponsored Feature: Optimizing Game Architectures with Intel Threading Building Blocks
 
 
by Brad Werth
March 30, 2009
 

Knee Deep with Loop Parallelism

Thus far, the focus has been on optimizations that leave the original code structure almost completely intact. The next level of optimizations requires more significant localized code changes. In return for these modifications, Intel TBB can provide considerably enhanced performance on multi-core processors.

parallel_for

Like most computationally intensive programs, games make heavy use of loops. Loops are a natural opportunity to optimize execution on a multi-core system. Intel TBB provides a scheme for parallelizing loops with a simple API. As expected, this optimization provides significant performance gains over unmodified, serial loops. Even when the original code has already been parallelized, Intel TBB's implementation can sometimes provide additional performance benefits due to its efficient use of hardware resources.

Sample 3

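// Baseline: run every repetition serially on the calling thread.
void doSerialStandardTest(double *aLoopTimes, const Kernel *pKernel)
{
    ...
    // don't use a thread at all
    pKernel->process(0, kiReps);
    ...
}

// Same work, with Intel TBB dividing [0, kiReps) into sub-ranges for its worker threads.
void doParallelForTBBTest(double *aLoopTimes, const Kernel *pKernel)
{
    ...
    // tWrapper adapts the kernel to the body interface that parallel_for expects
    TBBKernelWrapper tWrapper(pKernel);
    tbb::parallel_for(
        tbb::blocked_range<int>(0, kiReps),
        tWrapper
    );
    ...
}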

Sample 3 shows how Intel TBB's parallel_for function can be applied to a loop for significant performance benefits on multi-core CPUs. On the 4-core test system, the parallel_for version runs 3.98 times faster than the serial loop, a near-linear speedup.
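The wrapper-plus-range pattern in Sample 3 can be sketched without Intel TBB at all. The following is a minimal illustration using only the C++ standard library, not Intel TBB's implementation: IndexRange, SquareKernel, KernelWrapper, and chunkedParallelFor are hypothetical names invented for this sketch, and the fixed, one-thread-per-chunk split is an assumption made for simplicity. Intel TBB's own parallel_for takes a tbb::blocked_range and schedules sub-ranges onto its worker threads rather than creating a thread per chunk.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for tbb::blocked_range<int>: a half-open range [begin, end).
struct IndexRange {
    int begin;
    int end;
};

// Hypothetical kernel: squares each element of a caller-owned buffer.
class SquareKernel {
public:
    explicit SquareKernel(std::vector<double>& data) : data_(data) {}

    void process(int begin, int end) const {
        for (int i = begin; i < end; ++i) {
            data_[i] = data_[i] * data_[i];
        }
    }

private:
    std::vector<double>& data_;
};

// Adapter in the spirit of TBBKernelWrapper: forwards a sub-range to the kernel.
class KernelWrapper {
public:
    explicit KernelWrapper(const SquareKernel* kernel) : kernel_(kernel) {}

    void operator()(const IndexRange& r) const {
        kernel_->process(r.begin, r.end);
    }

private:
    const SquareKernel* kernel_;
};

// Splits the range into at most `workers` contiguous chunks and runs the body
// on each chunk in its own thread, then waits for all of them.
template <typename Body>
void chunkedParallelFor(IndexRange range, int workers, const Body& body) {
    assert(workers > 0 && range.begin <= range.end);
    const int total = range.end - range.begin;
    const int chunk = (total + workers - 1) / workers;  // ceiling division
    std::vector<std::thread> threads;
    for (int lo = range.begin; lo < range.end; lo += chunk) {
        const int hi = std::min(lo + chunk, range.end);
        threads.emplace_back([&body, lo, hi] { body(IndexRange{lo, hi}); });
    }
    for (std::thread& t : threads) {
        t.join();
    }
}
```

A caller would build a KernelWrapper around its kernel and pass it with the full range, much as Sample 3 passes tWrapper and a blocked_range to tbb::parallel_for. Because each chunk touches a disjoint slice of the buffer, the body needs no locking.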

Other parallel loop patterns

In addition to parallel_for, Intel TBB provides functions for parallelizing other kinds of loops. The function parallel_reduce handles loops that combine the results of multiple iterations into a single value. The function parallel_do handles iterator-based loops. There are also functions for sorting, pipelined execution, and other loop-like operations.
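The reduction pattern can be illustrated with a short standard-library sketch. This is not Intel TBB code: splitAndSum and its grain parameter are hypothetical names chosen for this example, and launching a std::async task per split is an assumption made to keep the sketch small. The shape is the same as a reduction loop, though: split the range, reduce each half independently, then join the partial results.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Sums data[begin, end) by recursively halving the range. Ranges of at most
// `grain` elements are summed serially; larger ranges are split in two.
double splitAndSum(const std::vector<double>& data,
                   std::size_t begin, std::size_t end, std::size_t grain) {
    assert(begin <= end && end <= data.size() && grain > 0);
    if (end - begin <= grain) {
        return std::accumulate(data.begin() + begin, data.begin() + end, 0.0);
    }
    const std::size_t mid = begin + (end - begin) / 2;
    // Reduce the left half on another thread while this thread takes the right half.
    std::future<double> left = std::async(std::launch::async, [&data, begin, mid, grain] {
        return splitAndSum(data, begin, mid, grain);
    });
    const double right = splitAndSum(data, mid, end, grain);
    // Join step: combine the two partial results.
    return left.get() + right;
}
```

The grain size controls the trade-off between parallelism and per-task overhead: a very small grain creates many tiny tasks, while a very large one leaves cores idle.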

All-in with Generalized Task Parallelism

The techniques demonstrated in the first two sections are appropriate for developers looking to use Intel TBB piecemeal and to achieve modest performance gains as a result. Even greater performance gains are possible when Intel TBB is used as the foundation of a game's threading architecture. This ensures that any explicit functional parallelism and the data parallelism supported by Intel TBB use the same threads, which avoids oversubscription and maximizes scalability. The techniques in this third section show more ambitious ways of using Intel TBB that can help realize these performance gains.

These examples use a low-level API in Intel TBB called the task scheduler API. The high-level parallel_for API used in Sample 3 is itself built on this low-level API. The task scheduler API allows code to directly manipulate the work trees that Intel TBB uses to represent parallel work. Manipulation of these work trees is necessary when implementing explicit functional parallelism and other techniques that go beyond simple data parallelism.


Figure A: A visual representation of an Intel TBB work tree

Figure A shows a visualization of a work tree being executed by Intel TBB. Each tree has only one root, although Intel TBB can process multiple trees simultaneously. Execution starts with a call that spawns the root, and the caller then waits for the root to finish executing. The root of this tree creates one child task. This task can call arbitrary pre-processing code before optionally creating and executing more children and finally calling post-processing code. When the child task completes, control passes back to the root, which also completes, and the original wait is finally over. Diagrams of this type will be used to illustrate the techniques in the following samples.
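The order of events in such a work tree can be traced with a small sketch. This one uses std::async rather than the Intel TBB task scheduler API, so it only mimics the control flow described above. EventLog, runChildTask, runRootTask, and spawnRootAndWait are hypothetical names for this illustration, and the log exists purely to make the execution order visible.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <string>
#include <vector>

// Thread-safe event log used only to make the execution order visible.
class EventLog {
public:
    void add(const std::string& event) {
        std::lock_guard<std::mutex> lock(mutex_);
        events_.push_back(event);
    }

    std::vector<std::string> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return events_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::string> events_;
};

// Child task: pre-process, optionally run more children, wait, post-process.
void runChildTask(EventLog& log, int grandchildren) {
    assert(grandchildren >= 0);
    log.add("child: pre-process");
    std::vector<std::future<void>> pending;
    for (int i = 0; i < grandchildren; ++i) {
        pending.push_back(std::async(std::launch::async,
                                     [&log] { log.add("grandchild: work"); }));
    }
    for (std::future<void>& f : pending) {
        f.get();  // the child waits for all of its own children
    }
    log.add("child: post-process");
}

// Root task: creates one child and completes only after that child completes.
void runRootTask(EventLog& log, int grandchildren) {
    log.add("root: start");
    std::async(std::launch::async,
               [&log, grandchildren] { runChildTask(log, grandchildren); }).get();
    log.add("root: complete");
}

// Spawns the root, waits for the whole tree to finish, and returns the trace.
std::vector<std::string> spawnRootAndWait(int grandchildren) {
    EventLog log;
    std::future<void> root = std::async(std::launch::async,
                                        [&log, grandchildren] { runRootTask(log, grandchildren); });
    root.get();  // the original wait ends only when the root completes
    return log.snapshot();
}
```

Calling spawnRootAndWait(2) yields a six-entry trace that starts with the root, brackets the two grandchildren between the child's pre-processing and post-processing steps, and ends when the root completes. A real task scheduler would run these tasks on a fixed pool of worker threads instead of launching a new thread for each one.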

Intel TBB is used more heavily in the following examples, but this does not imply that an existing threading API in a game must change to reflect the paradigm of the task scheduler API. In most cases it is possible to slot the task scheduler API in underneath an existing API. The following examples show how Intel TBB can support some threading paradigms commonly used in games.

 