Sponsored Feature: Optimizing Game Architectures with Intel Threading Building Blocks
 
 
by Brad Werth
 
 
March 30, 2009 (Page 3 of 5)
 

Callbacks

One of the basic needs of a threading infrastructure is the ability to dispatch work to a thread running parallel to the main thread. A callback system (sometimes called a job system) is one way to achieve this.

Callbacks are function pointers executed by a thread in a thread pool. The dispatching thread maintains no connection to the callback and does not explicitly wait for its completion. Instead, the callback sets a flag to be checked later in the dispatching thread's execution.
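The flag-based handoff described above can be sketched in standard C++, without Intel TBB. This is only an illustration under stated assumptions: SumJob, sumCallback, and runFlagDemo are hypothetical names, and a plain std::thread stands in for a pooled worker thread.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

typedef void (*FunctionPointer)(void *);

// Hypothetical parameter block shared by the dispatcher and the callback.
struct SumJob
{
    int first;
    int last;
    long result;
    std::atomic<bool> done;
};

// The callback publishes its result, then raises the flag as its final step.
void sumCallback(void *pParam)
{
    SumJob *pJob = static_cast<SumJob *>(pParam);
    assert(pJob != nullptr);
    long total = 0;
    for (int i = pJob->first; i <= pJob->last; ++i)
        total += i;
    pJob->result = total;
    pJob->done.store(true, std::memory_order_release);
}

// Dispatches the callback, checks the flag later, and returns the result.
long runFlagDemo()
{
    SumJob job;
    job.first = 1;
    job.last = 100;
    job.result = 0;
    job.done.store(false);

    // A plain thread stands in for a worker taken from a thread pool.
    FunctionPointer fCallback = sumCallback;
    std::thread worker(fCallback, static_cast<void *>(&job));

    // The dispatcher would normally do its own work here and only
    // check the flag at the point where it needs the result.
    while (!job.done.load(std::memory_order_acquire))
        std::this_thread::yield();

    worker.join(); // releases the thread; the flag already signalled completion
    return job.result;
}
```

The release store paired with the acquire load is what makes it safe for the dispatcher to read result once it has seen the flag.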

Callback systems using native threading solutions often implement this technique by putting the function pointer and parameter into a queue, which is serviced by threads in the thread pool whenever they complete a task.
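A minimal sketch of such a single-queue callback system, written with standard C++ threads, might look like the following. SharedQueuePool, countCallback, and runSharedQueueDemo are hypothetical names chosen for illustration; this is not the code behind Sample 1.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

typedef void (*FunctionPointer)(void *);

// Hypothetical single-queue callback pool; every worker contends for m_mutex.
class SharedQueuePool
{
public:
    explicit SharedQueuePool(std::size_t threadCount) : m_stopping(false)
    {
        for (std::size_t i = 0; i < threadCount; ++i)
            m_workers.emplace_back([this] { workerLoop(); });
    }

    // Lets the workers drain the remaining callbacks, then joins them.
    ~SharedQueuePool()
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_stopping = true;
        }
        m_wake.notify_all();
        for (std::thread &worker : m_workers)
            worker.join();
    }

    // Fire and forget: the caller keeps no handle to the callback.
    void doCallback(FunctionPointer fCallback, void *pParam)
    {
        assert(fCallback != nullptr);
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::make_pair(fCallback, pParam));
        }
        m_wake.notify_one();
    }

private:
    void workerLoop()
    {
        for (;;)
        {
            std::pair<FunctionPointer, void *> job;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_wake.wait(lock, [this] { return m_stopping || !m_queue.empty(); });
                if (m_queue.empty())
                    return; // stopping and nothing left to run
                job = m_queue.front();
                m_queue.pop();
            }
            job.first(job.second); // run the callback outside the lock
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_wake;
    std::queue<std::pair<FunctionPointer, void *> > m_queue;
    bool m_stopping;
    std::vector<std::thread> m_workers;
};

// Example callback: bumps a counter that the dispatcher inspects later.
void countCallback(void *pParam)
{
    static_cast<std::atomic<int> *>(pParam)->fetch_add(1);
}

// Dispatches `jobs` callbacks and returns how many ran before shutdown.
int runSharedQueueDemo(int jobs)
{
    std::atomic<int> counter(0);
    {
        SharedQueuePool pool(4);
        for (int i = 0; i < jobs; ++i)
            pool.doCallback(countCallback, &counter);
    } // the destructor drains the queue and joins the workers
    return counter.load();
}
```

Every push and every pop takes the same mutex, which is where the contention discussed next comes from.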

This approach can degrade performance under heavy contention, as demonstrated by Sample 1. Using a concurrent container does not completely avoid this problem, since worker threads may still have to wait to pop a task off the queue. Intel TBB avoids this bottleneck by maintaining a task queue per thread and implementing an efficient task-stealing system to spread the work around.
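The idea of per-thread queues with stealing can be illustrated with a deliberately simplified sketch in standard C++. It is not Intel TBB's scheduler, which is considerably more refined; StealingPool, WorkerQueue, and runStealingDemo are hypothetical names, and the sketch assumes all tasks are submitted before the workers start.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical per-worker queue, each guarded by its own mutex.
struct WorkerQueue
{
    std::mutex mutex;
    std::deque<std::function<void()> > tasks;
};

class StealingPool
{
public:
    explicit StealingPool(std::size_t threadCount) : m_queues(threadCount)
    {
        assert(threadCount > 0);
    }

    // Places a task on one worker's queue; call before run().
    void submit(std::size_t worker, std::function<void()> task)
    {
        WorkerQueue &queue = m_queues[worker % m_queues.size()];
        std::lock_guard<std::mutex> lock(queue.mutex);
        queue.tasks.push_back(std::move(task));
    }

    // Starts one thread per queue and returns once every task has run.
    void run()
    {
        std::vector<std::thread> workers;
        for (std::size_t i = 0; i < m_queues.size(); ++i)
            workers.emplace_back([this, i] { workerLoop(i); });
        for (std::thread &worker : workers)
            worker.join();
    }

private:
    // Takes the most recently added task from the worker's own queue.
    bool popOwn(std::size_t self, std::function<void()> &task)
    {
        WorkerQueue &queue = m_queues[self];
        std::lock_guard<std::mutex> lock(queue.mutex);
        if (queue.tasks.empty())
            return false;
        task = std::move(queue.tasks.back());
        queue.tasks.pop_back();
        return true;
    }

    // Takes the oldest task from the first other queue that has one.
    bool steal(std::size_t self, std::function<void()> &task)
    {
        for (std::size_t offset = 1; offset < m_queues.size(); ++offset)
        {
            WorkerQueue &victim = m_queues[(self + offset) % m_queues.size()];
            std::lock_guard<std::mutex> lock(victim.mutex);
            if (!victim.tasks.empty())
            {
                task = std::move(victim.tasks.front());
                victim.tasks.pop_front();
                return true;
            }
        }
        return false;
    }

    // Prefers local work, steals when idle, exits when every queue is empty.
    void workerLoop(std::size_t self)
    {
        std::function<void()> task;
        while (popOwn(self, task) || steal(self, task))
            task();
    }

    std::vector<WorkerQueue> m_queues;
};

// Piles all tasks onto worker 0 so the other workers can only make progress by stealing.
int runStealingDemo(int jobs)
{
    std::atomic<int> counter(0);
    StealingPool pool(4);
    for (int i = 0; i < jobs; ++i)
        pool.submit(0, [&counter] { counter.fetch_add(1); });
    pool.run();
    return counter.load();
}
```

Because each queue has its own lock, workers with local work do not contend with each other; a lock is shared only while one worker steals from another.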


Figure B: Callback is spawned and not waited on

Figure B shows an Intel TBB callback system that attaches a task to a pre-existing root, which hosts multiple callback tasks at once. This allows the callbacks to be waited on as a group, which is helpful for proper cleanup on exit. In this tree, the child task is spawned directly instead of the root, because the dispatching thread does not wait and a work tree root must be waited on whenever it is the target of a spawn. This Intel TBB approach allows an arbitrary number of callbacks to be in flight at once and guarantees that they are complete on exit, much as a work queue does in the traditional approach.

Intel TBB processes the work tree in Figure B in the following way: Execution begins with the Task, which is the parameter of the call to spawn(). The Task calls the specified callback function, which may trigger the creation of additional subtasks, represented by More. If subtasks are created, those subtasks are processed next. The net effect of this execution is that the callback function is called asynchronously, and it has an opportunity to spawn additional work, which will also be executed asynchronously. Presumably the final step in the callback is to indicate to the spawning thread that the computation is complete, although this notification step is not enforced.

Sample 4: Creating a work tree for a callback

void doCallback(FunctionPointer fCallback, void *pParam)
{
    // allocation with "placement new" syntax, see TBB reference documents
    CallbackTask *pCallbackTask = new(
        s_pCallbackRoot->allocate_additional_child_of(*s_pCallbackRoot)
    ) CallbackTask(fCallback, pParam);

    s_pCallbackRoot->spawn(*pCallbackTask);
}

Sample 4 shows the steps to create the work tree in Figure B. Instead of creating a root for each callback task, this code uses the same root for all callbacks. Again, it is the callback function's responsibility to create additional subtasks and to notify the calling thread when the calculation is complete.

Callbacks meet a basic need of game architectures: how to run multiple functional operations simultaneously without introducing contention for hardware resources. Intel TBB enhances this solution further by providing a clear path to combine callbacks with loop parallelism.

Promises

In many cases, it is advantageous to keep track of asynchronous calls and to explicitly wait for them to complete. Promises are one way to accomplish this. When handling a request for asynchronous execution, a promise system provides to the calling thread an object that acts as a link to the asynchronous call. This promise object allows the calling thread to wait on the asynchronous call when necessary and to actively contribute to the calculation if the call introduced additional parallel work.
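The caller-side contract of a promise can be sketched with the C++ standard library. This is an analogy under stated assumptions, not the article's Intel TBB implementation: the Promise class below wraps std::future, SumJob and sumCallback are hypothetical, and a waiting thread here simply blocks, whereas a TBB-based promise can contribute to the work while it waits.

```cpp
#include <cassert>
#include <future>

typedef void (*FunctionPointer)(void *);

// Hypothetical promise object: the caller's link to one asynchronous call.
class Promise
{
public:
    // Launches the callback asynchronously and remembers the link to it.
    void start(FunctionPointer fCallback, void *pParam)
    {
        assert(!m_future.valid()); // one asynchronous call per promise in this sketch
        m_future = std::async(std::launch::async, fCallback, pParam);
    }

    // Blocks until the asynchronous call has finished; safe to call repeatedly.
    void waitUntilDone()
    {
        if (m_future.valid())
            m_future.get();
    }

private:
    std::future<void> m_future;
};

// Hypothetical parameter block for the example callback.
struct SumJob
{
    int first;
    int last;
    long result;
};

void sumCallback(void *pParam)
{
    SumJob *pJob = static_cast<SumJob *>(pParam);
    long total = 0;
    for (int i = pJob->first; i <= pJob->last; ++i)
        total += i;
    pJob->result = total;
}

long runPromiseDemo()
{
    SumJob job = {1, 100, 0};
    Promise promise;
    promise.start(sumCallback, &job);

    // The caller is free to do unrelated work here.

    promise.waitUntilDone(); // explicit wait, only when the result is needed
    return job.result;
}
```

Unlike the fire-and-forget callback, the caller holds an object it can wait on, so no separate completion flag is needed.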


Figure C: Promise system keeps track of the asynchronous call.

Figure C shows that the work tree for a promise system is fundamentally similar to the work tree for a callback system, shown in Figure B. The primary difference is that the promise system waits on the root of the work tree. Another difference is that the root of the tree is encapsulated inside a promise object. The root is not executed immediately after the task completes, but only when the promise explicitly waits for it. This prevents the root from disappearing without the promise knowing about it.

Sample 5(a): Implementing a promise

void doPromise(FunctionPointer fCallback, void *pParam, Promise *pPromise)
{
    // allocation with "placement new" syntax, see TBB reference documents
    tbb::task *pParentTask = new(tbb::task::allocate_root()) tbb::empty_task();
    assert(pPromise != NULL);
    pPromise->setRoot(pParentTask);
    PromiseTask *pPromiseTask = new(pParentTask->allocate_child())
        PromiseTask(fCallback, pParam, pPromise);

    // set the ref count to 2, which accounts for the 1 child and for the eventual wait_for_all
    pParentTask->set_ref_count(2);
    pParentTask->spawn(*pPromiseTask);
}

Sample 5(a) shows how the work tree in Figure C is created. The primary difference between this code and the callback code in Sample 4 is that a root is created for each call, it is wrapped in an object (pPromise above), and the reference count of the root is increased to ensure that the root will not run until requested by the promise. This avoids the specific problem of having the root execute asynchronously and then get deleted and reused by the Intel TBB memory manager as an unrelated task in the tree.

Sample 5(b): Waiting on a promise

void waitUntilDone()
{
    if(m_pRoot != NULL)
    {
        // serialize access: the named scoped_lock holds m_tMutex until the end of this block
        tbb::spin_mutex::scoped_lock lock(m_tMutex);
        if(m_pRoot != NULL)
        {
            m_pRoot->wait_for_all();
            m_pRoot->destroy(*m_pRoot);
            m_pRoot = NULL;
        }
    }
}

Sample 5(b) shows how the promise object ensures the completion of the asynchronous call. A mutex is introduced here to ensure that the waitUntilDone call is thread-safe. If waitUntilDone will be called by only one thread (presumably the thread that submitted the asynchronous call in the first place), this additional mutex is not necessary. The root is never spawned because of the increased reference count used in the call to doPromise. Since the root is never spawned, it is never executed and therefore never deleted by Intel TBB. This code deletes the root explicitly once the asynchronous call is complete.
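The check, lock, check-again pattern used by waitUntilDone can be exercised in isolation with standard C++. The sketch below is an assumption-laden stand-in: FakeRoot replaces the TBB root task, deleting it replaces wait_for_all() followed by destroy(), and PromiseLike and runWaitDemo are hypothetical names. Note that the guard must be a named object, since an unnamed temporary would not keep the mutex held for the rest of the block.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the root task owned by the promise.
struct FakeRoot
{
    int id;
};

class PromiseLike
{
public:
    PromiseLike() : m_pRoot(new FakeRoot()), m_cleanups(0) {}
    ~PromiseLike() { delete m_pRoot.load(); }

    void waitUntilDone()
    {
        // Cheap first check lets later callers skip the lock entirely.
        if (m_pRoot.load() != nullptr)
        {
            // A named guard holds the mutex until the end of this block.
            std::lock_guard<std::mutex> lock(m_mutex);
            FakeRoot *pRoot = m_pRoot.load();
            if (pRoot != nullptr)
            {
                delete pRoot; // stands in for wait_for_all() followed by destroy()
                m_pRoot.store(nullptr);
                m_cleanups.fetch_add(1);
            }
        }
    }

    int cleanups() const { return m_cleanups.load(); }

private:
    std::mutex m_mutex;
    std::atomic<FakeRoot *> m_pRoot;
    std::atomic<int> m_cleanups;
};

// Calls waitUntilDone from several threads and reports how often cleanup ran.
int runWaitDemo(int threadCount)
{
    assert(threadCount > 0);
    PromiseLike promise;
    std::vector<std::thread> threads;
    for (int i = 0; i < threadCount; ++i)
        threads.emplace_back([&promise] { promise.waitUntilDone(); });
    for (std::thread &thread : threads)
        thread.join();
    return promise.cleanups();
}
```

However many threads race into waitUntilDone, the second check under the lock ensures the root is waited on and destroyed exactly once.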

Promises provide a comprehensive scheme for dispatching and waiting on parallel work. An Intel TBB implementation of promises has the additional benefit of providing an efficient work-while-waiting policy. These features make promises broadly applicable to game architectures.

 