Gamasutra: The Art & Business of Making Games
 
Sponsored Feature: First Look at Sandy Bridge: Integrating Graphics into the CPU

October 12, 2010
 

[In his latest Intel-sponsored Gamasutra column, Orion Granatir offers the key highlights of Intel's entry into the processor-graphics arena: Sandy Bridge (codename for the recently announced 2nd Generation Intel Core processors).

New opportunity is knocking, and its name is Processor Graphics: the integration of graphics functionality into the CPU. Processor graphics will soon be found on computers everywhere, meaning expanded markets for your next-gen video games.

Topics for this article include architecture overview, performance benefits, and tips on what developers should consider when optimizing their games for processor graphics-based machines.]


Processor graphics will soon be found in computers everywhere. Intel calls this new capability Intel HD Graphics; at AMD, it's called AMD Fusion. Both names refer to the integration of graphics functionality into the CPU.

At this year's Intel Developer Forum, the 2nd Generation Intel Core processors (codenamed "Sandy Bridge") were announced. One key feature of this architecture is the tighter integration of graphics into the processor.

One advantage of working at Intel is having early access to leading-edge technology. I'm writing this column on my Sandy Bridge machine. Right now it may be another big dev box on my desk, but this chip will be in mainstream laptops in 2011.

I can't share any numbers, but I can tell you that the processor graphics are pretty good. My machine runs one of my favorite games, BioWare's Mass Effect 2, at a solid frame rate in glorious detail (by the way, my mission was truly suicidal).

The ring architecture that connects the processor cores (x86 cores) together is now connected to the processor graphics. This ring interconnect enables high-speed, low-latency communication between the processor cores, processor graphics, and other integrated components, such as the memory controller and the display.

Basically, the processor cores and graphics engine communicate through a shared cache, creating some interesting possibilities for tight CPU/GPU interaction. We'll explore some of these topics in future articles.

The integration of processor components also brings improvements to Intel Turbo Boost Technology. Intel Turbo Boost Technology enables the processor to adjust the processor core and processor graphics frequencies to increase performance while staying within the allotted power/thermal budget. This means the processor can increase individual core speed or graphics speed as the workload dictates.


[Figure: Power Load Line]

To learn more about Intel Turbo Boost Technology, check out the white paper: Intel HD Graphics Dynamic Frequency Technology. (And if you want to get this type of content delivered regularly, sign up: Intel Software Dispatch for Visual Adrenaline.)

During performance analysis, it's important to pay attention to dynamic frequencies. A typical game loads both the CPU and the GPU, and the processor finds a good balance between them. However, if you are playing back a captured frame, the CPU workload might be lower than in the live game, and the graphics dynamic frequency might affect your results. The work within the frame should scale by roughly the same factor, so relative comparisons still hold, but the absolute time to complete the frame might not match what you expect.
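One way to keep dynamic frequency from skewing a comparison is to look at each pass as a fraction of the frame total rather than as an absolute time. The sketch below assumes you already have per-pass GPU timings from your own profiler; the function name relative_shares and the input layout are hypothetical, not part of any Intel tool.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: converts absolute per-pass GPU times (any unit)
// into fractions of the frame total, so that a capture taken at a
// different graphics frequency can still be compared pass by pass.
std::vector<double> relative_shares(const std::vector<double>& pass_times) {
    double total = 0.0;
    for (double t : pass_times) {
        assert(t >= 0.0 && "pass times must be non-negative");
        total += t;
    }

    std::vector<double> shares(pass_times.size(), 0.0);
    if (total <= 0.0) {
        return shares;  // empty or all-zero input: nothing to normalize
    }
    for (std::size_t i = 0; i < pass_times.size(); ++i) {
        shares[i] = pass_times[i] / total;
    }
    return shares;
}
```

If the shares from a replayed capture match the shares from the live game, the frame has simply scaled with frequency; if one pass's share moves, that pass is worth a closer look.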

Since Intel Turbo Boost Technology is automatically controlled by the CPU, a developer cannot directly control it. However, it's important to understand how it works. Most games I have investigated benefit from the graphics turbo scaling.

The addition of Intel Advanced Vector Extensions (Intel AVX) is another interesting Sandy Bridge feature. AVX extends SIMD (single instruction, multiple data) instructions from 128 bits to 256 bits. For floating-point-intensive applications, AVX enables a single instruction to work on eight single-precision floating-point values at a time instead of the four that current 128-bit SSE instructions handle.

It's important to note other hardware vendors have also announced support for AVX.

Most developers will just use the latest Microsoft Visual Studio compiler or Intel's C/C++ Compiler to take advantage of AVX. But, for the clock-counting, bit-shifting tech heads out there, you can learn more at Intel's AVX website (it even includes an emulator).

The best way to work with AVX is through intrinsics, which are supported by both the Microsoft and Intel compilers. Anyone familiar with programming SSE or the PlayStation 3's SPUs will be good "frenemies" with intrinsics. Intrinsics are compiler-specific functions that usually compile down to highly efficient inlined machine instructions.

Since the compiler has a strong understanding of intrinsics, it will often generate faster code than hand-written inline assembly. Intrinsics are the best way to write high-throughput, compute-intensive code on the CPU (intrinsics are an acquired taste, like wine or Remedy Entertainment's Alan Wake).
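To give a feel for what AVX intrinsics look like, here is a minimal sketch that adds two float arrays eight elements at a time. The function name add_arrays is made up for illustration; the intrinsics themselves (_mm256_loadu_ps, _mm256_add_ps, _mm256_storeu_ps) come from immintrin.h. The AVX path is compiled only when the compiler defines __AVX__ (for example with /arch:AVX or -mavx); otherwise the plain scalar loop does all the work. A runtime check that the CPU and OS actually support AVX is assumed to happen elsewhere and is not shown.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__AVX__)
#include <immintrin.h>
#endif

// Hypothetical example: out[i] = a[i] + b[i] for i in [0, n).
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    assert(a != nullptr && b != nullptr && out != nullptr);

    std::size_t i = 0;
#if defined(__AVX__)
    // Process eight floats per iteration using 256-bit registers.
    // The unaligned load/store variants are used so the caller does not
    // have to guarantee 32-byte alignment.
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
#endif
    // Scalar loop: handles the leftover elements, or the whole array
    // when the AVX path is not compiled in.
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
}
```

The same pattern with _mm_loadu_ps, _mm_add_ps, and _mm_storeu_ps would be the four-wide SSE version, which is why anyone with SSE experience should feel at home.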

Intel engineers and performance-hungry developers are already exploring ways AVX can benefit game and graphics applications. For example, my coworker Stan Melax just wrote a great article presenting a programming pattern to improve the performance of geometry computations by transposing packed 3D data on-the-fly.

AVX is interesting, but let's shift our focus to the graphics engine.

Remember, Sandy Bridge is a DirectX 10.1 part. I like the DX11 multithreaded API, so most of my current code is DX11 with the proper DX10.1 "feature level" set for Sandy Bridge. With full DX10.1 support, there are no major surprises when programming for Intel graphics. However, there are a few things to keep in mind.

The memory layout for processor graphics is different than it is for a discrete card. Graphics applications often check the amount of free video memory early in execution. Because Intel HD Graphics devices allocate graphics memory dynamically (based on application requests), you need to know the total amount of memory that is truly available to the graphics device. Memory checks that report only "local" or "dedicated" graphics memory do not supply an appropriate value for Intel HD Graphics devices.

All video memory on Intel HD Graphics and earlier generations, including Intel Graphics Media Accelerator Series 3 and 4, uses Dynamic Video Memory Technology (DVMT). DVMT memory is considered "local memory." "Non-local video memory" will show as zero (0). This value should not be used to determine compatibility with Accelerated Graphics Port (AGP) or PCI Express.

To accurately detect the amount of memory available, you'll need to check the total video memory availability. The Microsoft DirectX SDK (June 2010) includes the VideoMemory sample code and describes five commonly used methods to detect the total amount of video memory.

Applications targeting Microsoft Windows Vista and Microsoft Windows 7 should reference GetVideoMemoryViaDXGI. For Microsoft Windows XP applications, GetVideoMemoryViaWMI is a good starting place. For more information, visit the Microsoft sample code site.
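As a rough illustration of the DXGI route, the sketch below reads the three memory fields that DXGI_ADAPTER_DESC exposes (DedicatedVideoMemory, DedicatedSystemMemory, SharedSystemMemory) and adds them up instead of looking at dedicated memory alone. This is not the SDK sample itself: the AdapterMemory struct, the helper names, and the policy of summing all three fields are assumptions made for this example, and the Windows-only part is compiled only when _WIN32 is defined. DXGI is not available on Windows XP, where the WMI approach applies instead.

```cpp
#include <cassert>
#include <cstdint>

#ifdef _WIN32
#include <dxgi.h>
#pragma comment(lib, "dxgi.lib")
#endif

// Hypothetical helper type mirroring the memory fields of DXGI_ADAPTER_DESC.
struct AdapterMemory {
    std::uint64_t dedicated_video = 0;   // memory reserved for the GPU
    std::uint64_t dedicated_system = 0;  // system memory reserved for the GPU
    std::uint64_t shared_system = 0;     // system memory the GPU may borrow
};

// Assumed policy: treat everything the adapter can draw on as available.
std::uint64_t total_graphics_memory(const AdapterMemory& m) {
    return m.dedicated_video + m.dedicated_system + m.shared_system;
}

// Startup check against the total, not just the dedicated portion.
bool has_enough_memory(const AdapterMemory& m, std::uint64_t required_bytes) {
    assert(required_bytes > 0);
    return total_graphics_memory(m) >= required_bytes;
}

#ifdef _WIN32
// Fills 'out' from the first adapter DXGI enumerates. Returns false on failure.
bool query_primary_adapter(AdapterMemory& out) {
    IDXGIFactory* factory = nullptr;
    if (FAILED(CreateDXGIFactory(__uuidof(IDXGIFactory),
                                 reinterpret_cast<void**>(&factory)))) {
        return false;
    }

    bool ok = false;
    IDXGIAdapter* adapter = nullptr;
    if (SUCCEEDED(factory->EnumAdapters(0, &adapter))) {
        DXGI_ADAPTER_DESC desc = {};
        if (SUCCEEDED(adapter->GetDesc(&desc))) {
            out.dedicated_video = desc.DedicatedVideoMemory;
            out.dedicated_system = desc.DedicatedSystemMemory;
            out.shared_system = desc.SharedSystemMemory;
            ok = true;
        }
        adapter->Release();
    }
    factory->Release();
    return ok;
}
#endif
```

With a processor graphics device, a check against dedicated_video alone could reject a machine that would pass comfortably once shared memory is counted, which is the failure the paragraphs above warn about.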

The best place to get started with Intel processor graphics is the Intel Graphics Developer's Guides. Even though the guides don't yet discuss specifics for Sandy Bridge, you'll find this information useful in learning about Intel Graphics. Once the product introduction is complete for Sandy Bridge, we'll post an updated guide.

With nearly a million PCs shipped each day, the available market for processor graphics is growing quickly. It's worthwhile to understand processor graphics and to validate your game on them. Soon, processor graphics will be everywhere.

[You can check out an archive of Orion Granatir's continuing Gamasutra sponsored column series via his Gamasutra biography page.]


Comments


Mason Bially
So does this mean Intel is finally going to support OpenGL 3.x on its new graphics chips/drivers? Or is this a DirectX-only chip/driver expansion?



I ask because we develop cross platform applications, and OpenGL 3.0 is what we like to use.



Other than that, sounds cool!

Orion Granatir
Sandy Bridge will support OpenGL 3.0. The current developers' guide focuses on DirectX, but we are working on adding more OpenGL info.

Daniel Kinkaid
I continue to find this an odd move. Granted, from a platform perspective, this saves Intel a lot of money, as it gives them a low-cost platform. On the other hand, graphical rendering is simply far more efficient using a fully parallel processor like a GPU. Intel's failed attempt at Larrabee alone is proof of how difficult it is to get decent performance, even using many scalar processors wired together in parallel.

Orion Granatir
It's worth looking at the dev guide to see how the processor graphics is put together. At the top of section 2.1, there is a block diagram of the architecture. The Execution Units (EUs) execute most of the interesting work (shaders, etc.) and are hardware built for optimum throughput of graphics workloads. This is a new execution pipeline that works in tandem with the processing cores through shared resources (e.g., the cache). Once Sandy Bridge has launched, a new guide will be publicly available, but the current guides give enough detail to understand what's going on.

Tim Tavernier
I once said to my friend that AMD probably bought ATI to help make a multi-core CPU with one of the cores being or replacing a GPU, but that this project backfired, resulting in the slump of AMD processor competitiveness compared to Intel.



I never knew that I was actually this close to the actual truth. Not entirely, but certainly in the right direction.

