Synchronized callbacks
Synchronized callbacks
solve a specific and important problem in game development: coordination with
threaded middleware. Modern game development has matured to a point where many
of the common features in a game can be supplied by third-party middleware.
There are multiple solutions available for physics, AI, particles, and
animation. Many of these middleware
solutions already use multiple threads to improve performance on
multi-core processors.
In most cases, each middleware solution assumes that it
has access to most or all of the computational power available. This can lead
to poor performance when multiple middleware solutions are used together, or
when a game's core computation is also trying to exploit multi-core hardware,
because the competing sets of threads oversubscribe the processor cores.
The ideal solution to this problem is to have all
computational activity use the same optimal set of threads. Some middleware is
designed to support this approach, but requires that each thread be specially
initialized to process each of the middleware's computational tasks.
Synchronized callbacks handle this need for initialization by calling a
specified function from all threads before computation continues.
Synchronized callbacks are trivial to implement in threading
architectures in which the threads are directly accessible. Intel TBB does not
expose its threads, so if a game is using Intel TBB to provide a high
performance threading architecture, additional steps must be taken to implement
this technique as detailed below.
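For contrast, the "directly accessible" case can be sketched with standard C++ threads instead of Intel TBB. This is only an illustration under stated assumptions: the names Callback, CallResult, and callOnOwnedThreads are hypothetical, and a real engine would signal its long-lived worker threads rather than create new ones as this sketch does for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical callback type, standing in for the article's FunctionPointer.
using Callback = void (*)(void*);

struct CallResult {
    int calls;                    // how many times the callback ran
    std::size_t distinctThreads;  // how many different threads ran it
};

// The caller owns the threads, so it starts one per slot, has each run the
// callback once, and joins them all before returning.
CallResult callOnOwnedThreads(Callback fCallback, void* pParam, int iThreads)
{
    std::atomic<int> calls(0);
    std::mutex idMutex;
    std::set<std::thread::id> ids;
    std::vector<std::thread> threads;

    for (int i = 0; i < iThreads; ++i) {
        threads.emplace_back([&] {
            fCallback(pParam);
            calls.fetch_add(1);
            // Record which thread ran the callback.
            std::lock_guard<std::mutex> lock(idMutex);
            ids.insert(std::this_thread::get_id());
        });
    }
    // Joining is the "wait until all threads have called" step.
    for (std::thread& t : threads) t.join();

    return CallResult{calls.load(), ids.size()};
}
```

Because the caller holds every thread handle, no counting or spinning is needed here; the join itself guarantees that every thread has finished the callback.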
Figure D: Synchronized callbacks wait until all threads have called the callback.
Figure D shows a work tree that implements a synchronized
callback scheme. A root task is created and waited on, and that root has a
number of child tasks equal to the number of threads in the thread pool. Each
task calls the callback and then tests whether all the callbacks have been
called. If so, the task ends. If not, the task waits until all callbacks have
been called. Because a waiting task keeps its thread busy, no thread can pick
up a second task, so, provided the task count matches the pool size, each
thread runs the callback exactly once.
Sample 6(a): Implementing a synchronized callback
void doSynchronizedCallback(FunctionPointer fCallback, void *pParam, int iThreads)
{
    tbb::atomic<int> tAtomicCount;
    tAtomicCount = iThreads;

    // allocation with "placement new" syntax, see TBB reference documents
    tbb::task *pRootTask = new(tbb::task::allocate_root()) tbb::empty_task;

    tbb::task_list tList;
    for(int i = 0; i < iThreads; i++)
    {
        tbb::task *pSynchronizeTask = new(pRootTask->allocate_child())
            SynchronizeTask(fCallback, pParam, &tAtomicCount);
        tList.push_back(*pSynchronizeTask);
    }

    pRootTask->set_ref_count(iThreads + 1);
    pRootTask->spawn_and_wait_for_all(tList);
    pRootTask->destroy(*pRootTask);
}
Sample 6(a) shows how the work tree of Figure D is created.
The code to construct the work tree is similar to what was used in the promise
pattern. Just as in the promise pattern, the root is never spawned and must be
manually destroyed once its work is done. The SynchronizeTasks need more than
just a function pointer and a parameter; they need a shared atomic count for
the Test + Wait in the work tree.
Sample 6(b): SynchronizeTask does the test and wait
tbb::task *execute()
{
    assert(m_fCallback);
    m_fCallback(m_pParam);
    m_pAtomicCount->fetch_and_decrement();

    while(*m_pAtomicCount > 0)
    {
        // yield while waiting
        tbb::this_tbb_thread::yield();
    }

    return NULL;
}
Sample 6(b) shows how a SynchronizeTask tests and waits
after calling its callback. Each task decrements the atomic count. Since this
count was initialized with the total number of threads, the atomic count will
reach zero when the last thread has called the callback. The code checks the
value of the atomic count, and if it is greater than zero, the task spins,
waiting for other SynchronizeTasks to reduce the count to zero. This spinning
is appropriate since, by definition, the thread running this task has nothing
else to do until all SynchronizeTasks have run.
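The same test-and-wait logic can be tried in isolation with standard C++ atomics and threads. The sketch below is not the TBB implementation above: runTestAndWait, Callback, and the observed vector are hypothetical names introduced for illustration, and std::thread stands in for the TBB worker threads.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical callback type, standing in for the article's FunctionPointer.
using Callback = void (*)(void*);

// Each thread calls the callback, decrements a shared count, and then spins
// (yielding) until the count reaches zero. The returned vector holds, per
// thread, how many callbacks had completed when that thread left its wait.
std::vector<int> runTestAndWait(Callback fCallback, void* pParam, int iThreads)
{
    std::atomic<int> remaining(iThreads);  // plays the role of the atomic count
    std::atomic<int> completed(0);         // bookkeeping for this sketch only
    std::vector<int> observed(iThreads, -1);
    std::vector<std::thread> threads;

    for (int i = 0; i < iThreads; ++i) {
        threads.emplace_back([&, i] {
            fCallback(pParam);
            completed.fetch_add(1);
            remaining.fetch_sub(1);
            while (remaining.load() > 0) {
                // yield while waiting
                std::this_thread::yield();
            }
            // Past this point every thread has finished its callback.
            observed[i] = completed.load();
        });
    }
    for (std::thread& t : threads) t.join();
    return observed;
}
```

Every entry of the returned vector should equal iThreads, which is the property the spin loop exists to guarantee: no thread continues until the last callback has run.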
Synchronized callbacks become increasingly important as
middleware matures to accommodate threaded game architectures. A game
architecture based on Intel TBB for performance reasons can take full advantage
of threaded middleware using this technique.