Gamasutra: The Art & Business of Making Games
Sponsored Feature: Common Performance Issues in Game Programming

June 18, 2008 (Page 2 of 3)
 

References

A Load-Hit-Store can happen in code, even when it looks like it shouldn't.

void foo(int &count)
{
    count = 0;
    for (int i = 0; i < 100; ++i) {
        if (Test(i)) {
            ++count;
        }
    }
}

That code generates a Load-Hit-Store. How?

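The variable "count" is a reference, so it lives in memory rather than in a register. All writes to it, and in many cases reads, go through memory, because the compiler generally cannot prove that the call to Test() leaves that memory untouched. Each ++count therefore becomes a load, an increment, and a store to the same address, and the load on the next iteration can hit the store that is still in flight. Any memory-bound variable updated in a tight loop can cause Load-Hit-Stores this way. As in the previous code example, the fix is to accumulate into a local variable and write the result to memory once at the end.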

void foo(int &output)
{
    int count = 0;
    for (int i = 0; i < 100; ++i) {
        if (Test(i)) {
            ++count;
        }
    }
    output = count; // Write the result
}
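The same pattern tends to show up with class members, which are also reached through a pointer (this) and so are memory bound in the same sense. The following is a minimal sketch, not code from the original article: HitCounter, m_hits, CountSlow, CountFast and the body of Test() are all hypothetical names chosen for illustration.

```cpp
// Hypothetical predicate standing in for the article's Test().
// Any opaque call like this makes it hard for the compiler to prove
// that memory reachable through 'this' is unchanged across the call.
bool Test(int i) { return (i % 3) == 0; }

// Hypothetical class; m_hits lives in memory and is reached through 'this'.
struct HitCounter {
    int m_hits = 0;

    // Memory-bound version: each ++m_hits may compile to a load,
    // an add, and a store to the same address on every iteration.
    void CountSlow(int n) {
        m_hits = 0;
        for (int i = 0; i < n; ++i) {
            if (Test(i)) {
                ++m_hits;
            }
        }
    }

    // Local-accumulator version: 'hits' can stay in a register,
    // and the member is written exactly once after the loop.
    void CountFast(int n) {
        int hits = 0;
        for (int i = 0; i < n; ++i) {
            if (Test(i)) {
                ++hits;
            }
        }
        m_hits = hits; // Write the result
    }
};
```

Both functions produce the same value; only the number of trips through memory differs. Whether the slow version actually stalls depends on the compiler and target, so a profile is the way to confirm it.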

Vector Load-Hit-Store

The previous examples demonstrated how easy it is to cause Load-Hit-Store stalls with floating-point and integer operations. The VMX register set suffers from the same problem. Some math operations can often be done more efficiently in VMX, but what if they involve non-vector data?

On the Xbox 360, the VMX register intrinsic type __vector4 is mapped onto a structure. Accessing the individual elements of that structure at run time should be avoided, for the reason shown below.

XMVECTOR Radius = CalcBounds();
pOut->fRadius = Radius.x;

The second line creates a Load-Hit-Store because the VMX register is used as a structure. As a result, the compiler has to write the contents of the entire register to local memory; then the first element is read with a floating-point register, and only then is the value written into pOut->fRadius.

Here is a way to write the same code without incurring the hidden Load-Hit-Store:

XMVECTOR Radius = CalcBounds();
__stvewx(&pOut->fRadius,__vspltw(Radius,0),0);

VMX can write any single entry of a vector register to memory as a float. The __vspltw() operation copies the requested entry into every lane of a temporary vector register, and the __stvewx() operation writes the float. Reading the value through the compiler's structure-member access isn't recommended.
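To see why the splat comes before the store, it can help to model the two operations in portable C++. The sketch below is an emulation for illustration only, not Xbox 360 code: Vec4, SplatWord, StoreElementWord, Bounds and WriteRadius are hypothetical names. It assumes the PowerPC definition of stvewx, in which the lane that gets stored is selected by the low bits of the destination address rather than by an explicit index.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 128-bit VMX register: four float lanes.
struct alignas(16) Vec4 {
    float lane[4];
};

// Models the effect of vspltw: copy lane 'index' into all four lanes.
inline Vec4 SplatWord(const Vec4& v, unsigned index) {
    const float f = v.lane[index & 3];
    return Vec4{{f, f, f, f}};
}

// Models the effect of stvewx: store a single 32-bit lane to 'dest'.
// The lane is chosen by where 'dest' falls within its 16-byte block
// (assumption: this mirrors the PowerPC element-select rule).
inline void StoreElementWord(const Vec4& v, float* dest) {
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(dest);
    const unsigned lane = static_cast<unsigned>((addr & 0xF) >> 2);
    *dest = v.lane[lane];
}

// Hypothetical output structure with an fRadius member, as in the text.
struct Bounds {
    float fCenter[3];
    float fRadius;
};

// Splat lane 0 first, so the right value is stored no matter which
// lane the destination address happens to select.
inline void WriteRadius(const Vec4& radius, Bounds* pOut) {
    StoreElementWord(SplatWord(radius, 0), &pOut->fRadius);
}
```

Because every lane holds the same value after the splat, the store is correct for any alignment of fRadius, and on real hardware the data never has to pass through memory to reach a floating-point register.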

