[In this Intel-sponsored Gamasutra feature, Josh Doss explains how the major technology company is investigating onloading with regard to video game development - the concept of "increasing the platform performance by using CPU resources to do typical GPU-assigned tasks".]
During the optimization phase of your game's development, it's all about the frame rate. Whether you're increasing visual effects and quality to maximize the user's visual experience or dialing down your GPU demands to increase the responsiveness of the game, frame rate is key.
Onloading is a concept we're currently looking at within Intel as a means of increasing the platform performance by using CPU resources to do typical GPU-assigned tasks. This may be especially useful when targeting platforms with processor graphics, as these are typically modest in performance compared to high-end discrete graphics cards.
Several years ago, discrete graphics-card vendors began suggesting that developers move typical CPU workloads to the graphics solution [Figure 1][Harris06]. Despite advances in both integrated and discrete graphics hardware, it's still easy to turn up your effects far enough to become graphics-bound.
Post-processing effects have a significant impact on the quality of your title and have heavy texture and fill requirements. High-resolution displays are the norm, and the use of multiple displays is increasing [Valve11].
Figure 1: Moving CPU workloads to the graphics device
CPUs are increasing in total throughput at an exponential rate. With increases in core count and wider vector units, we can start looking at how the CPU can assist with some typical graphics workloads. Before taking on this exercise, we should first make sure we haven't moved our typical CPU workloads, such as Physics and AI, to the GPU.
Take a look at your game when you start the optimization phase and see where the bottlenecks are. If your game scales well across multiple cores and threads and provides the best user experience on the platform, you may not have an opportunity for Onloading. If you're graphics-bound, have your Physics and AI workloads already on the CPU, and are seeing opportunities for higher CPU utilization, CPU Onloading may offer potential gains.
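As a rough illustration of that check, the sketch below classifies a frame from three measured timings. Everything in it is an assumption made for illustration: the `FrameTimings` fields, the `classify` and `onloadingCandidate` helpers, and the 10 percent and 25 percent thresholds are hypothetical, and the timings themselves would have to come from your own profiling.

```cpp
#include <cassert>

// Hypothetical per-frame measurements in milliseconds, gathered by the
// developer (for example with CPU timers and GPU timestamp queries).
struct FrameTimings {
    double frameMs;    // wall-clock time for the whole frame
    double cpuBusyMs;  // time the CPU spent doing useful work
    double gpuBusyMs;  // time the GPU spent executing commands
};

enum class Bottleneck { GraphicsBound, CpuBound, Balanced };

// Compare CPU and GPU utilization for one frame. `slack` is an assumed
// tolerance, expressed as a fraction of the frame time.
Bottleneck classify(const FrameTimings& t, double slack = 0.10) {
    assert(t.frameMs > 0.0);
    const double cpuUtil = t.cpuBusyMs / t.frameMs;
    const double gpuUtil = t.gpuBusyMs / t.frameMs;
    if (gpuUtil - cpuUtil > slack) return Bottleneck::GraphicsBound;
    if (cpuUtil - gpuUtil > slack) return Bottleneck::CpuBound;
    return Bottleneck::Balanced;
}

// Onloading only looks promising when the GPU is the limiter and the CPU
// still has headroom. `minCpuHeadroom` is an assumed threshold.
bool onloadingCandidate(const FrameTimings& t, double minCpuHeadroom = 0.25) {
    const double cpuHeadroom = 1.0 - t.cpuBusyMs / t.frameMs;
    return classify(t) == Bottleneck::GraphicsBound &&
           cpuHeadroom >= minCpuHeadroom;
}
```

A frame that takes 16 ms with the GPU busy for 15.5 ms and the CPU busy for 6 ms would be flagged as a candidate, while a frame with the CPU busy for 15 ms would not. A single frame is a weak signal, so in practice you would average over a representative capture.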
Along with the opportunities outlined above, there are challenges to consider when using the CPU to perform some typical GPU-accelerated workloads. Of chief concern is the lack of fixed-function hardware, such as texture units and a rasterizer.
Workloads requiring heavy rasterization or texture filtering are not the best candidates. In addition, there's currently no mechanism within Microsoft Direct3D* to avoid copying the working buffer, which results in significant overhead. We can work around this by double buffering our render loop.
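One way to picture the double-buffered loop is the minimal sketch below, which uses no graphics API at all. It assumes the goal is to overlap the copy of the previous frame's results with the CPU work for the current frame. `DoubleBuffer`, `uploadToGpu`, `cpuWork` and `runFrames` are hypothetical stand-ins, not Direct3D calls.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Two CPU-side working buffers. While the CPU fills one for frame N, the
// buffer holding frame N-1 can be copied to the graphics device.
class DoubleBuffer {
public:
    explicit DoubleBuffer(std::size_t size) {
        assert(size > 0);
        for (auto& b : buffers_) b.assign(size, 0.0f);
    }
    std::vector<float>& writeBuffer() { return buffers_[writeIndex_]; }
    const std::vector<float>& readBuffer() const { return buffers_[1 - writeIndex_]; }
    void swap() { writeIndex_ = 1 - writeIndex_; }

private:
    std::vector<float> buffers_[2];
    std::size_t writeIndex_ = 0;
};

// Stand-in for the copy to the graphics device. A real title would issue the
// graphics API's own upload here; this version only records what was copied.
void uploadToGpu(const std::vector<float>& src, std::vector<float>& gpuCopy) {
    gpuCopy = src;
}

// Stand-in for the onloaded CPU workload: fill the buffer with the frame index.
void cpuWork(std::vector<float>& dst, int frame) {
    for (auto& v : dst) v = static_cast<float>(frame);
}

// Run a few frames and return what the "GPU" last received.
std::vector<float> runFrames(int frameCount, std::size_t size) {
    DoubleBuffer db(size);
    std::vector<float> gpuCopy(size, -1.0f);
    for (int frame = 0; frame < frameCount; ++frame) {
        // Start copying the previous frame's results on another thread...
        auto upload = std::async(std::launch::async, [&] {
            uploadToGpu(db.readBuffer(), gpuCopy);
        });
        // ...while this thread computes the current frame's results.
        cpuWork(db.writeBuffer(), frame);
        upload.wait();  // both halves must finish before the roles swap
        db.swap();
    }
    return gpuCopy;
}
```

The cost of this arrangement is one frame of latency: after three frames, the copy on the graphics side holds the results of the second frame, not the third. Whether that trade is acceptable depends on the effect being onloaded.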
A significant advantage of using the CPU for graphics workloads is that they can be implemented without the constraints imposed by current graphics APIs. Irregular data structures, a free choice of programming language, and large caches provide additional opportunities for flexibility and performance.
If we want to balance the entire platform [Figure 2], we also need to make full use of our available threads and vector units. Fortunately, both parallel and vector programming have gained significant traction in recent years, and the tools and knowledge are now pervasive across platforms.
Developers familiar with console programming should be well versed in these concepts. Intel provides several tools to help developers utilize our processors' resources, including Intel® VTune™ Amplifier XE.
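To make the combination of threads and vector units concrete, here is a small sketch in standard C++ of a per-pixel scale-and-bias pass, loosely in the spirit of a post-processing step. The kernel and the names `scaleAndBias` and `processParallel` are illustrative assumptions. The sketch relies on the compiler to auto-vectorize the inner loop where it can, and does not use explicit intrinsics.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Inner kernel: a contiguous, branch-free loop over floats, which optimizing
// compilers can often auto-vectorize onto the CPU's vector units.
void scaleAndBias(const float* in, float* out, std::size_t count,
                  float scale, float bias) {
    for (std::size_t i = 0; i < count; ++i) {
        out[i] = in[i] * scale + bias;
    }
}

// Outer level: split the buffer into one contiguous chunk per worker thread.
std::vector<float> processParallel(const std::vector<float>& in, float scale,
                                   float bias, unsigned workerCount) {
    assert(workerCount > 0);
    std::vector<float> out(in.size());
    // Round up so that every element is covered by some chunk.
    const std::size_t chunk = (in.size() + workerCount - 1) / workerCount;
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < workerCount; ++w) {
        const std::size_t begin = std::min(in.size(), w * chunk);
        const std::size_t end = std::min(in.size(), begin + chunk);
        if (begin == end) break;  // no work left for the remaining workers
        workers.emplace_back(scaleAndBias, in.data() + begin,
                             out.data() + begin, end - begin, scale, bias);
    }
    for (auto& t : workers) t.join();
    return out;
}
```

A caller might pass `std::thread::hardware_concurrency()` as `workerCount`, after checking that it is non-zero. Spawning threads every frame is only acceptable in a sketch. A shipping title would more likely hand the chunks to an existing job system or thread pool.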
Figure 2: Leverage the Platform Holistically with CPU Onloading