Contents
Sponsored Feature: Designing the Framework of a Parallel Game Engine
 
 
Printer-Friendly VersionPrinter-Friendly Version
 


Part of:



[More information...]
 

Latest News
spacer View All spacer
 
February 9, 2010
 
Dragon Age: Origins Sells In 3.2 Million Units [2]
 
Interview: Composer Garry Schyman Talks BioShock Soundtracks
 
Adult Swim: Don't 'Hamstring' Developers With Existing IP
spacer
Latest Jobs
spacer View All     Post a Job     RSS spacer
 
February 9, 2010
 
Tarsier Studios
Senior Game Engine Programmer
 
Konami Digital Entertainment Co., Ltd.
Sound designer
 
Vicarious Visions / Activision
Tools Programmer
 
Trion Redwood City
Sr. Environment Modeler
 
Trion Redwood City
Sr. Graphics Programmer
 
Sparkplay Media
Senior Game Designer
 
Blue Fang Games
Producer: Online/Mobile
 
Blue Fang Games
Senior Server Engineer: Online/Mobile
spacer
Latest Features
spacer View All spacer
 
February 9, 2010
 
arrow Television, Meet Games
 
arrow Two Halves, Together: Patrick Gilmore On Double Helix [1]
 
arrow The Road To Hell: The Creative Direction of Dante's Inferno [18]
 
arrow The Sensible Side of Immersion [10]
 
arrow Jumpstarting Your Creativity [5]
 
arrow Truth in Game Design [49]
 
arrow Postmortem: Vicious Cycle's Matt Hazard: Blood Bath and Beyond [4]
 
arrow Developers React: The iPad's Future [16]
spacer
Latest Blogs
spacer View All     Post     RSS spacer
 
February 9, 2010
 
[Lineage 2] Freya Update Is Just a Beginning ②
 
Swashbuckling for Landlubbers: Why you may already be encouraging piracy! [9]
 
[Lineage 2] Freya Update Is Just a Beginning ①
spacer
About
spacer News Director:
Leigh Alexander
Features Director:
Christian Nutt
Editor At Large:
Chris Remo
Advertising:
John 'Malik' Watson
Recruitment/Education:
Gina Gross
 
Feature Submissions
Features
  Sponsored Feature: Designing the Framework of a Parallel Game Engine
by Jeff Andrews
2 comments
Share RSS
 
 
February 25, 2009 Article Start Page 1 of 6 Next
 

[Single-threaded game engines can still work, but they are becoming increasingly outclassed by multithreaded solutions that are more sophisticated, but also more complex to create and optimize. In this sponsored article, part of the Intel Visual Computing Microsite, Intel application engineer Jeff Andrews lays that process bare.]

With the advent of multiple cores within a processor, the need to create a parallel game engine has become increasingly important. Although it is still possible to focus primarily on only the GPU and have a single-threaded game engine, the advantages of using all the processors on a system, whether CPU or GPU, can give the user a much greater experience. For example, by using more CPU cores, a game can increase the number of rigid body physics objects for greater effects on the screen. Optimized games might also yield a smarter AI.

This white paper covers the basic methods for creating and optimizing a multi-core game engine. It also describes the state manager and messaging mechanism that keeps data in sync and offers several tips for optimizing parallel execution. Multiple block diagrams are used to depict theoretical overviews for managing tasks and states, interfacing with scenes, objects, and tasks, and then initializing, loading, and looping within synchronized threads.

Introduction

The Parallel Game Engine Framework or engine is a multi-threaded game engine that is designed to scale to all available processors within a platform. To do this, the engine executes different functional blocks in parallel so that it can use all available processors. However, this is often easier said than done, because many pieces in a game engine often interact with one another and can cause many threading errors. The engine takes these scenarios into account and has mechanisms for getting the proper synchronization of data without being bound by synchronization locks. The engine also has a method for executing data synchronization in parallel to keep serial execution time to a minimum. 

Parallel Execution State

The concept of a parallel execution state in an engine is crucial to an efficient multi-threaded runtime. For a game engine to truly run in parallel, with as little synchronization overhead as possible, each system must operate within its own execution state and interact minimally with anything else going on in the engine. Data still needs to be shared, but now instead of each system accessing a common data location to get position or orientation data, for example, each system has its own copy-removing the data dependency that exists between different parts of the engine. If a system makes any changes to the shared data, notices are sent to a state manager, which in turn queues the changes, called messaging. Once the different systems have finished executing, they are notified of the state changes and update their internal data structures, which is also part of messaging. Using this mechanism greatly reduces synchronization overhead, allowing systems to act more independently.

Execution Modes

Execution state management works best when operations are synchronized to a clock, which means the different systems execute synchronously. The clock frequency may or may not be equivalent to a frame time, and it is not necessary for it to be so. The clock time does not even have to be fixed to a specific frequency, but instead can be tied to frame count, such that one clock step is equal to how long it takes to complete one frame, regardless of length. The implementation one chooses to use for the execution state will determine the clock time. Figure 1 shows the different systems operating in the free step mode of execution, which means they don't have to complete their execution on the same clock. There is also a lock step mode of execution (Figure 2) in which all systems complete in one clock. The main difference between the two modes is that free step provides flexibility in exchange for simplicity, while lock step is the reverse.


Figure 1. Execution state using the free step mode.

Free Step Mode

This execution mode allows a system to operate in the time it needs to complete its calculations. "Free" can be misleading, because a system is not free to complete whenever it wants, but is free to select the number of clocks it needs to execute. With this method, a simple notification of a state change to the state manager is not enough. Data must also be passed along with the state-change notification because a system that has modified shared data may still be executing when another system that wants the data is ready to do an update. This requires the use of more memory and copies, so it may not be the most ideal mode for all situations. Given the extra memory operations, free step might be slower than lock step, although this is not necessarily true.

Lock Step Mode

With this mode all systems must complete their execution in a single clock. Lock step mode is simpler to implement and does not require data to be passed with the notification because systems that are interested in a change made by another system can simply query that system for the value (at the end of execution).


Figure 2. Execution state using the lock step mode. 

The lock step mode can also implement a pseudo-free-step mode of operation by staggering calculations across multiple steps. One use for this might be with an AI that calculates its initial "large view" goal in the first clock, then instead of just repeating the goal calculation for the next clock comes up with a more focused goal based on the initial goal.

Data Synchronization

It is possible for multiple systems to make changes to the same shared data. To do this, the messaging needs some sort of mechanism that can determine the correct value to use. Two such mechanisms can be used:

  • Time. The last system to make the change time-wise has the correct value.

  • Priority. A system with a higher priority is the one that has the correct value. This can also be combined with the time mechanism to resolve changes from systems of equal priority.

Data values that are determined to be stale, via either mechanism, will simply be overwritten or thrown out of the change notification queue.

Using relative values for the shared data can be difficult because some data may be order-dependent. To alleviate this problem, use absolute data values so that when systems update their local values they replace the old with the new. A combination of both absolute and relative data is ideal, although its use depends on each specific situation. For example, common data, such as position and orientation, should be kept absolute because creating a transformation matrix for it depends on the order in which the data are received. A custom system that generated particles, via the graphics system, and fully owned the particle information could merely send relative value updates.

 
Article Start Page 1 of 6 Next
 
Comments

Raoul Duke
profile image
Nice article!

Reminds me of Erlang.

SONG Vincent
profile image
"significant performance gains on modern and future processors"
Can you post some test results ?
Corei7 performance scaling will probably be decent (around 4x I bet) :p
But did you have the opportunity to tests on a xeon 7400 (6 cores) or on a dualxeon (12 cores) ?

Vincent


none
 
Comment:
 


Submit Comment