References
A Load-Hit-Store can happen in code even when it looks like it shouldn't.
void foo(int &count)
{
    count = 0;
    for (int i = 0; i < 100; ++i) {
        if (Test(i)) {
            ++count;
        }
    }
}
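That code generates a Load-Hit-Store. How?
The variable "count" is memory bound. Because it is a reference, the compiler has to assume the memory behind it can be read or changed elsewhere (for example, inside Test()), so all writes to it, and in many cases reads, go through memory.
Any time a variable is memory bound and updated in a tight loop, it can cause Load-Hit-Stores. The fix is similar to the previous code example: accumulate in a local variable and write the result out once.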
void foo(int &output)
{
    int count = 0;
    for (int i = 0; i < 100; ++i) {
        if (Test(i)) {
            ++count;
        }
    }
    output = count; // Write the result once, after the loop
}
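To see why the compiler can't just make that change for you, here is a small sketch in portable C++. The global g_total, the body of Test(), and the names CountViaReference and CountViaLocal are all made up for illustration; the original code doesn't show what Test() does. The point is only that a reference parameter might alias memory that Test() touches, so the compiler has to store and reload it on every pass.

```cpp
#include <cassert>

// Hypothetical global: stands in for any memory the compiler cannot
// prove is distinct from the reference parameter.
static int g_total = 0;

// Hypothetical predicate: accepts even numbers and, as a side effect,
// resets g_total when i == 50.
static bool Test(int i)
{
    if (i == 50) {
        g_total = 0;
    }
    return (i % 2) == 0;
}

// Same shape as the first foo(): every ++count goes through memory.
static void CountViaReference(int &count)
{
    count = 0;
    for (int i = 0; i < 100; ++i) {
        if (Test(i)) {
            ++count;
        }
    }
}

// Same shape as the fixed foo(): the counter is a local, so it can
// stay in a register, and memory is written once at the end.
static void CountViaLocal(int &output)
{
    int count = 0;
    for (int i = 0; i < 100; ++i) {
        if (Test(i)) {
            ++count;
        }
    }
    output = count;
}
```

If you pass g_total itself to both functions, CountViaReference leaves 25 in it (the reset at i == 50 wipes the running count), while CountViaLocal leaves 50. With any other int the two agree. So the local-copy version is a change you make knowingly, when you know the output doesn't alias anything the loop touches.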
Vector Load-Hit-Store
The previous examples demonstrated how easy it is to cause Load-Hit-Store stalls with floating-point and integer operations. The VMX register set suffers from the same problem.
Some math operations can often be done more efficiently with VMX, but what if they involve non-vector data?
On the Xbox 360, the VMX register type __vector4 is mapped onto a structure. Accessing the elements of that structure at run time should be avoided, for the reason shown below.
    XMVECTOR Radius = CalcBounds();
    pOut->fRadius = Radius.x;
The second line creates a Load-Hit-Store because the VMX register is accessed as a structure. The compiler has to write the contents of the entire register to local memory, then read the first element back into a floating-point register, and only then write the value into pOut->fRadius.
Here is a way to write the same code without incurring the hidden Load-Hit-Store:
    XMVECTOR Radius = CalcBounds();
    __stvewx(__vspltw(Radius, 0), &pOut->fRadius, 0);
VMX has the ability to write any specific entry as a single float. The __vspltw() operation copies the requested entry into every element of a temporary vector register, and the __stvewx() operation writes that float to memory directly from the vector unit. Accessing the elements through the compiler's structure-member syntax isn't recommended.
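If the splat-then-store pairing looks odd, it may help to model what the two instructions do in plain C++. This is only a sketch of the documented AltiVec behaviour, not Xbox 360 code: Vec4Model, SplatModel and StoreElementModel are hypothetical names, and the model assumes the big-endian lane numbering where lane 0 sits at the lowest address.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 128-bit VMX register holding four floats.
struct Vec4Model {
    float lane[4];
};

// Models vspltw: replicate one lane into all four lanes.
static Vec4Model SplatModel(const Vec4Model &v, unsigned index)
{
    const float f = v.lane[index & 3];
    return Vec4Model{{f, f, f, f}};
}

// Models stvewx: the effective address is base + offset rounded down
// to a 4-byte boundary, and the lane that gets stored is chosen by
// where that address falls within its 16-byte block.
static void StoreElementModel(const Vec4Model &v, void *base, std::intptr_t offset)
{
    std::uintptr_t ea = reinterpret_cast<std::uintptr_t>(base)
                      + static_cast<std::uintptr_t>(offset);
    ea &= ~static_cast<std::uintptr_t>(3);       // ignore the low two bits
    const unsigned lane = static_cast<unsigned>((ea >> 2) & 3);
    std::memcpy(reinterpret_cast<void *>(ea), &v.lane[lane], sizeof(float));
}
```

The store picks its lane from the destination address, not from an argument, so without the splat you would get whichever element happens to line up with the target. Splatting first puts the wanted value in every lane, which makes the store correct for any destination. Because the low two address bits are ignored, an offset of 0 to 3 bytes lands on the same float.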
Example that works:
    XMVECTOR Radius = { 1.0f, 2.0f, 3.0f, 4.0f };
    float Z;
    Z = Radius.z;                         // Causes a Load-Hit-Store
    __stvewx(__vspltw(Radius, 2), &Z, 2); // Avoids the Load-Hit-Store
Cheers,
Will