As Gabb and Lake point out, communication between tasks
in this model presents an intriguing problem with timing – a problem
that the other models don't have. Suppose three tasks run
concurrently: an input task, a physics task that uses the input to
move game objects, and a rendering task that uses physics results to
draw the objects. In the best case, the input task completes just before
the physics task starts, and the physics task completes just before the
rendering task starts. In the worst case, the rendering task starts just
before the physics task completes, and the physics task starts just
before the input task completes, so each stage works from results that
are one cycle old. This results in an input-to-display time roughly
twice that of the best case, and the latency can fluctuate between the
two extremes from frame to frame. Gabb and Lake suggest calibrating some
tasks to run more often than others, for example running the input task
twice as often as the physics task. This may alleviate the problem, but
it will not eliminate it.
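The best and worst cases can be checked with a little arithmetic. The sketch below is a simplified model, not something taken from Gabb and Lake: it assumes all three tasks share one fixed period and duration (in integer ticks), that each task reads the most recent result that had already finished when it started, and that latency is measured from the moment the input result is ready to the moment rendering finishes. The function names and parameters are hypothetical.

```python
def latest_finished_start(offset, period, duration, now):
    """Start time of the most recent run of a periodic task that had finished by `now`."""
    k = (now - offset - duration) // period
    return offset + k * period


def input_to_display(period, duration, physics_offset, render_offset, frame=10):
    """Ticks from 'input result ready' to 'render finished' for one render run.

    Assumption: the input task starts at tick 0 and repeats every `period` ticks;
    physics and rendering repeat with the same period, shifted by their offsets.
    """
    render_start = render_offset + frame * period
    # Rendering uses the newest physics result that was complete when it started.
    physics_start = latest_finished_start(physics_offset, period, duration, render_start)
    # That physics run used the newest input result that was complete when it started.
    input_start = latest_finished_start(0, period, duration, physics_start)
    return (render_start + duration) - (input_start + duration)


best = input_to_display(period=10, duration=10, physics_offset=0, render_offset=0)
worst = input_to_display(period=10, duration=10, physics_offset=9, render_offset=8)
print(best, worst)
```

Under these assumptions the aligned schedule gives 20 ticks and the misaligned one gives 38 ticks, which matches the "roughly twice" estimate in the text.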
Since the
asynchronous model assumes little or no synchronization between the
concurrent tasks, the performance of the model is not limited as much
by the serial parts of the program. The main performance limitation is
therefore how many useful parallel tasks can be found. The tasks should
also be well balanced: one large task alongside several very small ones
can become a performance bottleneck, because the large task sets the
pace while the other cores sit mostly idle.
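A rough way to see why balance matters is to bound the achievable speedup by the longest task. The sketch below is an illustration under simplifying assumptions that are not in the original text: each task runs once per cycle on its own core, and a cycle cannot finish before its longest task does. The function name and the cost figures are hypothetical.

```python
def max_speedup(task_costs):
    """Upper bound on speedup when each task runs on its own core.

    Assumption: total serial time is the sum of the costs, and the parallel
    time can never drop below the cost of the single longest task.
    """
    return sum(task_costs) / max(task_costs)


unbalanced = max_speedup([8, 1, 1])  # one large task, two very small ones
balanced = max_speedup([4, 3, 3])    # same total work, spread more evenly
print(unbalanced, balanced)
```

With the same total work, the unbalanced split is bounded at 1.25 times the serial speed, while the balanced split can reach 2.5 times.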
Since
the asynchronous model relies on tasks communicating through the latest
available information rather than connecting to each other directly,
existing components may need changes to work in this model. At the very
least, each component needs a thread-safe way to query the latest state
update. Such changes should be easy enough to make, and they can even be
added as a wrapper layer around the component.
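One possible shape for such a wrapper is a small "latest value" holder guarded by a lock. This is a minimal sketch, not an API from any particular engine: the class name `LatestState` and its methods are hypothetical, and it assumes that published values are treated as immutable snapshots by their readers.

```python
import threading


class LatestState:
    """Thread-safe holder for the most recent state update of a component."""

    def __init__(self, initial=None):
        self._lock = threading.Lock()
        self._value = initial
        self._version = 0  # incremented on every publish

    def publish(self, value):
        """Called by the producing task whenever it finishes an update."""
        with self._lock:
            self._value = value
            self._version += 1

    def latest(self):
        """Called by any consuming task; returns (version, value) without waiting for a new update."""
        with self._lock:
            return self._version, self._value


physics_state = LatestState(initial={"x": 0.0})
physics_state.publish({"x": 1.5})
version, snapshot = physics_state.latest()
```

The version counter lets a consumer, such as the rendering task, notice whether anything has changed since its last read without blocking the producer for longer than a simple assignment.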
Data parallel model
In addition to finding parallel
tasks, it is possible to find a set of similar data on which the same
task can be performed in parallel. In game engines, such parallel
data would be the objects in the game. For example, in a flight
simulation, one might divide the planes between two threads, with each
thread simulating half of the planes (see Figure 3). Optimally the
engine would use as many threads as there are logical processor cores.

Figure 3. A game loop using the data parallel model. Each object thread simulates a part of the game objects.
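The object-thread stage of this loop can be sketched as follows. The example is illustrative only: the plane representation (a position and a velocity), the function names, and the use of a thread pool are assumptions made for the sketch. Note also that in CPython, threads show the structure of the approach rather than a real speedup for CPU-bound work; an engine would use native threads.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_evenly(objects, parts):
    """Split a list into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(objects), parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(objects[start:start + size])
        start += size
    return chunks


def simulate_chunk(planes, dt):
    """Advance every plane in one chunk; a plane is a hypothetical (position, velocity) pair."""
    return [(x + vx * dt, vx) for (x, vx) in planes]


def parallel_step(planes, dt, workers=None):
    """Run one simulation step with one chunk of planes per worker thread."""
    workers = workers or os.cpu_count() or 1
    chunks = split_evenly(planes, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(simulate_chunk, chunks, [dt] * len(chunks)))
    # Flatten the per-thread results back into one list, preserving order.
    return [plane for chunk in results for plane in chunk]


planes = [(0.0, 1.0), (10.0, 2.0), (5.0, 0.0)]
stepped = parallel_step(planes, dt=0.5, workers=2)
```

Leaving `workers` unset makes the sketch default to the number of logical cores reported by `os.cpu_count()`, in line with the one-thread-per-core guideline.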
An important issue is how to divide
the objects among threads. One consideration is that the threads
should be properly balanced, so that each processor core is used to
full capacity. A second consideration is what happens when two
objects in different threads need to interact. Communication using
synchronization primitives could reduce the amount of
parallelism. A recommended approach is therefore to use message
passing combined with the latest known updates, as in the
asynchronous model. Communication between threads can be reduced by
grouping objects that are most likely to interact with each other.
Objects are more likely to come into contact with their neighbors, so
one strategy could be to group objects by area.
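Grouping by area can be as simple as bucketing objects into a uniform grid and handing whole cells to threads. The sketch below is one possible approach, not the method prescribed by the text: the 2D positions, the `cell_size` parameter, and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def group_by_cell(positions, cell_size):
    """Bucket object ids by the grid cell that contains their (x, y) position.

    Objects in the same cell are close together, so they are the ones most
    likely to interact and are good candidates for sharing a thread.
    """
    cells = defaultdict(list)
    for obj_id, (x, y) in positions.items():
        cell = (int(x // cell_size), int(y // cell_size))
        cells[cell].append(obj_id)
    return dict(cells)


positions = {"a": (1.0, 1.0), "b": (2.0, 3.0), "c": (12.0, 1.0)}
groups = group_by_cell(positions, cell_size=10.0)
```

Here objects "a" and "b" land in the same cell while "c" lands in a neighboring one. Objects near a cell border can still interact across threads, so this reduces cross-thread communication rather than removing it.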
The
data parallel model has excellent scalability. The number of object
threads can be set automatically to the number of cores in the system,
and the only non-parallelizable parts of the game loop would
be the ones that don't deal directly with game objects (the Read input
and Render tasks in Figure 3). While the function parallel models can
still get the most out of a few cores, data parallelism is needed to
fully utilize future processors with dozens of cores.
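How much the remaining serial parts matter can be estimated with Amdahl's law, which the text does not name but which is the standard bound for this situation. The sketch below assumes the parallel fraction of the frame is known; the 90% figure is a made-up example, not a measurement.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup when only `parallel_fraction` of the work scales with cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Hypothetical frame where object simulation is 90% of the work and
# input reading plus rendering make up the serial 10%.
for cores in (2, 4, 16, 64):
    print(cores, round(amdahl_speedup(0.9, cores), 2))
```

The bound flattens out as cores are added, which is why keeping the serial Read input and Render stages small matters more as the core count grows.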