Sponsored Feature: Who Moved the Goal Posts? The Rapidly Changing World of CPUs


October 19, 2009

Cache-aware task queues

One approach that is gaining favor is to build individual task queues per thread (Figure 12), thereby limiting the synchronization points between threads and implementing a task-stealing approach similar to the one used in Intel Threading Building Blocks.

Instead of placing procedurally generated tasks on the end of a single queue, this method uses a last-in-first-out order in which the new task is pushed to the front of the same queue its parent task came from. Now the developer can make good use of the caches because there is a high likelihood that the parent task has warmed the cache with most of the data the thread needs to complete the new task.

The individual queues also mean tasks can be grouped by the data they share.



Figure 12a. Per-thread task queues. Figure 12b. Threads independently pull tasks from their own queues.



Figure 12c. LIFO ordering for procedurally generated tasks. Figure 12d. Tasks reuse the hot cache.
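
To make the LIFO idea concrete, here is a minimal, single-threaded C++ sketch of a per-thread queue. The names (Task, WorkerQueue, push_local, pop_local) are illustrative assumptions, not taken from any particular library, and synchronization is omitted because only the owning thread touches its own queue here.

#include <deque>
#include <functional>
#include <vector>

// Illustrative sketch of the per-thread LIFO queue described above.
struct Task {
    std::function<void()> run;
};

struct WorkerQueue {
    std::deque<Task> tasks;   // front = most recently pushed task (hottest data)

    // The owning thread pushes newly spawned tasks to the front of its own queue...
    void push_local(Task t) { tasks.push_front(std::move(t)); }

    // ...and also pops from the front, so a child task runs while the data its
    // parent touched is still warm in this core's cache (LIFO order).
    bool pop_local(Task& out) {
        if (tasks.empty()) return false;
        out = std::move(tasks.front());
        tasks.pop_front();
        return true;
    }
};

int main() {
    std::vector<WorkerQueue> queues(4);   // one queue per worker thread
    queues[0].push_local({[] { /* parent task: touches some data */ }});
    queues[0].push_local({[] { /* child task: reuses the parent's cached data */ }});

    Task t;
    while (queues[0].pop_local(t)) t.run();   // child runs first, then the parent
    return 0;
}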

Taking a task from another thread's queue occurs only when a thread's own queue is empty. Although this is not the most efficient use of the cache, it is the lesser of two evils compared to sitting idle, and it provides a more balanced workload.


Figure 12e. A thread steals a task when its own queue is empty.
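
The steal path can be sketched in the same hedged style. The per-queue mutex and the choice to steal the oldest task from the back of a victim's deque are assumptions for illustration (a common work-stealing convention), not a description of any specific implementation.

#include <deque>
#include <functional>
#include <mutex>
#include <vector>

struct Task { std::function<void()> run; };

struct WorkerQueue {
    std::deque<Task> tasks;
    std::mutex lock;                       // guards cross-thread steals
};

// When a worker's own deque is empty, it scans the other workers' deques and
// takes the task at the BACK, i.e. the oldest one, whose data is least likely
// to still be hot in the victim's cache anyway.
bool steal_task(std::vector<WorkerQueue>& queues, size_t self, Task& out) {
    for (size_t i = 0; i < queues.size(); ++i) {
        if (i == self) continue;           // only look at other threads' queues
        std::lock_guard<std::mutex> g(queues[i].lock);
        if (!queues[i].tasks.empty()) {
            out = std::move(queues[i].tasks.back());
            queues[i].tasks.pop_back();
            return true;
        }
    }
    return false;                          // everyone is empty: genuinely idle
}

int main() {
    std::vector<WorkerQueue> queues(2);
    queues[1].tasks.push_back({[] { /* some queued work */ }});

    Task t;
    if (steal_task(queues, 0, t)) t.run(); // worker 0 is empty, so it steals from worker 1
    return 0;
}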

In addition to an intelligent task queue, the algorithms being parallelized may also need to be aware of the underlying hardware. Intelligent task assignment and grouping of when jobs get executed can greatly improve cache utilization and therefore overall performance. The best optimization tool is always the programmer's brain.
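
As one possible illustration of grouping work by the data it touches, the sketch below routes tasks that operate on the same chunk of an array to the same worker's queue, so each chunk tends to stay in one core's cache. The chunk size and the modulo routing policy are assumptions chosen for clarity, and the tasks are run serially here just to show the routing.

#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

struct Task { std::function<void()> run; };

int main() {
    const std::size_t kWorkers   = 4;
    const std::size_t kChunkSize = 1024;                 // elements per cache-friendly chunk
    std::vector<float> data(16 * 1024, 1.0f);            // size divides evenly into chunks
    std::vector<std::deque<Task>> queues(kWorkers);

    // Route each chunk's task to the worker that "owns" that chunk, so tasks
    // sharing the same data land on the same thread.
    for (std::size_t chunk = 0; chunk * kChunkSize < data.size(); ++chunk) {
        std::size_t worker = chunk % kWorkers;           // same chunk -> same worker
        queues[worker].push_back({[&data, chunk, kChunkSize] {
            std::size_t begin = chunk * kChunkSize;
            std::size_t end   = begin + kChunkSize;
            for (std::size_t i = begin; i < end; ++i) data[i] *= 2.0f;
        }});
    }

    for (auto& q : queues)                               // serial drain, just to demonstrate routing
        for (auto& t : q) t.run();
    return 0;
}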

