Designing systems for functional decomposition, coupled with data decomposition, will deliver a good amount of parallelization and also ensure scalability with future processors that have an even larger number of cores. Remember to use the state manager along with the messaging mechanism to keep the data in sync with only minimal synchronization overhead.
The observer design pattern underlies the messaging mechanism, so it is worth taking the time to learn it well. With experience, you can implement the most efficient design for the engine's needs. After all, the observer design pattern is the mechanism the different systems use to communicate and keep all shared data synchronized.
Tasking plays an important role in proper load balancing. Follow the tips in Appendix D to create an efficient task manager for your engine.
Designing a highly parallel engine is a manageable task if you use clearly defined messaging and structure. Properly building parallelism into your game engine will give it significant performance gains on modern and future processors.
Example of an Engine Diagram
This diagram gives an example of how the different systems are connected to the engine. All communication between the engine and the systems goes through a common interface. Systems are loaded via the platform manager (not shown).
The engine manager and system initializations.
Engine and System Relationship Diagram
The Observer Design Pattern
The observer design pattern is documented in Design Patterns: Elements of Reusable Object-Oriented Software.
With this pattern, items interested in data or state changes in other items do not have to poll those items periodically to see whether anything has changed. The pattern (Figure 13) defines a subject and an observer that are used for change notification: the observer observes a subject for any changes. The change controller acts as a mediator between the two.
Figure 13. The observer design pattern.
1. The observer, via the change controller, registers itself with the subject for which it wants to observe changes.
2. The change controller is actually an observer. Instead of registering the observer with the subject, it registers itself with the subject and keeps its own list of which observers are registered with which subject.
3. The subject inserts the observer (actually the change controller) in its list of observers that are interested in it. Optionally, a change type can identify which types of changes the observer is interested in, which helps speed up the distribution of change notifications.
4. When the subject makes a change to its data or state, it notifies the observer via a callback mechanism and passes information about the types of changes that occurred.
5. The change controller queues the change notifications and waits for the signal to distribute them.
6. During distribution the change controller calls the actual observers.
7. The observers query the subject for the changed data or state (or get the data from the message).
8. When the observer is no longer interested in the subject or is being destroyed, it deregisters itself from the subject via the change controller.
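The eight steps above can be sketched in C++ as follows. This is a minimal, single-threaded illustration rather than the engine's actual implementation: every name in it (ChangeMask, Interest, ChangeController, Transform, Recorder and their methods) is hypothetical, and a real engine would need thread-safe queuing in the change controller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using ChangeMask = std::uint32_t;  // one bit per change type (hypothetical values)
constexpr ChangeMask kPosition = 1u << 0;
constexpr ChangeMask kSpeed = 1u << 1;

class Subject;

class Observer {
public:
    virtual ~Observer() = default;
    virtual void OnChange(Subject& subject, ChangeMask changes) = 0;
};

struct Interest { Observer* observer; ChangeMask mask; };

inline void Erase(std::vector<Interest>& list, const Observer* observer) {
    list.erase(std::remove_if(list.begin(), list.end(),
                              [observer](const Interest& i) { return i.observer == observer; }),
               list.end());
}

class Subject {
public:
    // Step 3: remember who is interested, and in which change types.
    void Attach(Observer* observer, ChangeMask mask) {
        assert(observer != nullptr);
        observers_.push_back({observer, mask});
    }
    void Detach(Observer* observer) { Erase(observers_, observer); }
protected:
    // Step 4: call back each observer interested in at least one changed type.
    void Notify(ChangeMask changes) {
        for (const Interest& i : observers_)
            if (i.mask & changes) i.observer->OnChange(*this, i.mask & changes);
    }
private:
    std::vector<Interest> observers_;
};

// Mediator: observes subjects on behalf of the real observers. Not thread-safe.
class ChangeController : public Observer {
public:
    // Steps 1-2: record the real observer; attach the controller itself once.
    void Register(Subject& subject, Observer* observer, ChangeMask mask) {
        std::vector<Interest>& list = routes_[&subject];
        if (list.empty()) subject.Attach(this, ~ChangeMask{0});
        list.push_back({observer, mask});
    }
    // Step 8: forget the observer; detach from the subject when nobody is left.
    void Unregister(Subject& subject, Observer* observer) {
        auto it = routes_.find(&subject);
        if (it == routes_.end()) return;
        Erase(it->second, observer);
        if (it->second.empty()) { subject.Detach(this); routes_.erase(it); }
    }
    // Step 5: queue the notification instead of forwarding it immediately.
    void OnChange(Subject& subject, ChangeMask changes) override {
        pending_.push_back({&subject, changes});
    }
    // Step 6: on the distribution signal, forward queued changes to observers.
    void Distribute() {
        std::vector<Pending> batch;
        batch.swap(pending_);
        for (const Pending& p : batch) {
            auto it = routes_.find(p.subject);
            if (it == routes_.end()) continue;
            for (const Interest& i : it->second)
                if (i.mask & p.changes) i.observer->OnChange(*p.subject, i.mask & p.changes);
        }
    }
private:
    struct Pending { Subject* subject; ChangeMask changes; };
    std::map<Subject*, std::vector<Interest>> routes_;
    std::vector<Pending> pending_;
};

// Hypothetical subject and observer used to exercise the sketch.
class Transform : public Subject {
public:
    void SetX(float x) { x_ = x; Notify(kPosition); }
    void SetSpeed(float s) { speed_ = s; Notify(kSpeed); }
    float X() const { return x_; }
private:
    float x_ = 0.0f, speed_ = 0.0f;
};

struct Recorder : Observer {
    // Step 7: query the subject for the changed data.
    void OnChange(Subject& subject, ChangeMask) override {
        last_x = static_cast<Transform&>(subject).X();
        ++calls;
    }
    float last_x = 0.0f;
    int calls = 0;
};
```

In this sketch a Recorder registered for kPosition is not called when SetX runs; it is called only when Distribute is invoked, and a change made through SetSpeed is filtered out by the change-type mask. Deferring delivery this way is what lets the systems keep working on their own data until the engine chooses a point at which to synchronize.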
Tips on Implementing Tasks
Although task distribution can be implemented in many ways, try to keep the number of worker threads equal to the number of available logical processors of the platform. Avoid setting the affinity of tasks to a specific thread, because the tasks from the different systems will not complete at the same time. Specific affinities can lead to a load imbalance among the worker threads, effectively reducing your parallelization. Also, consider using a tasking library, such as Intel® Threading Building Blocks, which can simplify task distribution. Two optimizations can be done in the task manager to ensure CPU-friendly execution of the different tasks submitted.
Reverse Issuing. If the order in which primary tasks are issued is fairly static, the tasks can be issued in alternating order from frame to frame: forward in one frame, reversed in the next. The last task to execute in the previous frame will more than likely still have its data in the cache, so issuing the tasks in reverse order for the next frame makes it likely that the first tasks to run find their data already cached, and the CPU caches do not have to be repopulated.
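Reverse issuing can be sketched as a small helper that flips direction after every frame. The class name FrameIssuer and the submit callback are hypothetical; submit stands for whatever hands a task to the worker threads in your task manager.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Holds a fixed list of primary tasks and issues them forward in one frame
// and backward in the next (hypothetical helper, not part of any library).
class FrameIssuer {
public:
    explicit FrameIssuer(std::vector<std::function<void()>> tasks)
        : tasks_(std::move(tasks)) {
        for (const auto& task : tasks_) assert(task != nullptr);
    }

    // Issues every task exactly once through `submit`, then flips direction.
    template <typename Submit>
    void IssueFrame(Submit&& submit) {
        const std::size_t n = tasks_.size();
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t index = reverse_ ? n - 1 - i : i;
            submit(tasks_[index]);
        }
        reverse_ = !reverse_;
    }

private:
    std::vector<std::function<void()>> tasks_;
    bool reverse_ = false;
};
```

With three tasks A, B and C, the first frame issues A, B, C and the second issues C, B, A, so the task that ran last is the first to run again. Whether this helps in practice depends on cache sizes and on how much data each task touches, so it is worth measuring.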
Cache Sharing. Some multi-core processors have their shared cache split into sections, so that two processors may share a cache, while another two share a separate cache. Issuing sub-tasks from the same system onto processors sharing a cache increases the likelihood that the data will already be in the shared cache.
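The worker-thread advice at the start of this appendix can be sketched with the C++ standard library alone. This is an illustrative sketch under stated assumptions, not the task manager from the engine: TaskPool and WorkerCount are hypothetical names, and a tasking library such as Intel Threading Building Blocks would normally replace code like this. The pool creates one worker per logical processor and lets any idle worker take the next task from a shared queue, so no task is tied to a particular thread.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// hardware_concurrency() may return 0 when the count cannot be determined,
// so fall back to a single worker in that case.
inline unsigned WorkerCount() {
    return std::max(1u, std::thread::hardware_concurrency());
}

class TaskPool {
public:
    explicit TaskPool(unsigned workers = WorkerCount()) {
        for (unsigned i = 0; i < workers; ++i)
            threads_.emplace_back([this] { Run(); });
    }
    ~TaskPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& t : threads_) t.join();
    }
    // Any idle worker may pick the task up; no affinity is set.
    void Submit(std::function<void()> task) {
        assert(task != nullptr);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
            ++outstanding_;
        }
        wake_.notify_one();
    }
    // Blocks until every submitted task has finished.
    void Wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        idle_.wait(lock, [this] { return outstanding_ == 0; });
    }
    std::size_t Size() const { return threads_.size(); }

private:
    void Run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // stopping and nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
            std::lock_guard<std::mutex> lock(mutex_);
            if (--outstanding_ == 0) idle_.notify_all();
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_, idle_;
    std::queue<std::function<void()>> tasks_;
    std::size_t outstanding_ = 0;
    bool stopping_ = false;
    std::vector<std::thread> threads_;
};
```

A frame would typically submit the tasks of every system and then call Wait before distributing change notifications. Because workers pull from one queue, a system whose tasks finish early simply frees its worker for the next task, which is the load-balancing behavior the appendix recommends.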
Gamma, Erich, Richard Helm, Ralph Johnson, and John M. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. USA: Addison-Wesley, 1994.
Intel® Threading Building Blocks (TBB). Available from: http://www.threadingbuildingblocks.org/
Intel and Gamasutra - Visual Computing. Available from: http://www.gamasutra.com/visualcomputing/
Multi-threaded Game Programming and Hyper-Threading Technology. Available from: http://software.intel.com/en-us/articles/multithreaded-game-programming-and-hyper-threading-technology
Reinders, James. Intel Threading Building Blocks. USA: O'Reilly Media, Inc., 2007.
Smoke - Game Technology Demo. Available from: http://software.intel.com/en-us/articles/smoke-game-technology-demo
Threading Basics for Games. Available from: http://software.intel.com/en-us/articles/threading-basics-for-games
Threading Methodology: Principles and Practice. Available from: http://software.intel.com/en-us/articles/threading-methodology-principles-and-practice