CMP Game Media Group Presents: Home
By Rob Wyatt
Gamasutra
May 28, 1999
Vol. 3: Issue 21

Features
Wyatt's World

Cracking Open The Pentium III

Contents

What is all the fuss about?

How do I detect the new instructions?

What operating system support is required for the Pentium III?

What are these new SIMD instructions?

How do I make use of the new instructions?

How do I debug code with the new instructions?

How do I read the new Pentium III serial number?

Is there any new performance/ profiling information?

What operating system support is required for the Pentium III?

Both MMX and 3DNow! were aliased onto the floating-point registers so that no additional processor state had to be introduced. This meant all existing operating systems could continue to work without modification. However, the eight new extended multimedia registers found in the Pentium III add state to the processor, and the operating system must be aware of this when switching tasks. To help, Intel added two new instructions, FXSAVE and FXRSTOR, which save and restore a whopping 512 bytes of state. Within these 512 bytes are the SIMD registers, the floating-point/MMX registers and various control registers. These instructions actually first appeared in the late-model Pentium II processors; their presence is indicated by the FXSR bit (bit 24) in the feature flags returned by the CPUID instruction.
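As a rough illustration of the FXSR check, the sketch below tests the feature flags that CPUID function 1 returns in EDX. The bit positions (FXSR is bit 24, the Streaming SIMD flag is bit 25) come from Intel's CPUID documentation rather than from this article, and the names FxSaveArea, HasFxsr and HasStreamingSimd are invented for the example. The caller is assumed to have obtained the EDX value already.

```cpp
#include <cstdint>

// Memory image written by FXSAVE and read by FXRSTOR: 512 bytes,
// which the processor requires to be 16-byte aligned.
// (FxSaveArea is a hypothetical name used only in this sketch.)
struct alignas(16) FxSaveArea {
    unsigned char bytes[512];
};
static_assert(sizeof(FxSaveArea) == 512, "FXSAVE image must be 512 bytes");

// Feature flag positions in EDX after executing CPUID with EAX = 1.
constexpr std::uint32_t kFeatureFxsr = 1u << 24;  // FXSAVE/FXRSTOR present
constexpr std::uint32_t kFeatureSimd = 1u << 25;  // Streaming SIMD present

// True when the supplied feature flags report FXSAVE/FXRSTOR.
constexpr bool HasFxsr(std::uint32_t edx) {
    return (edx & kFeatureFxsr) != 0;
}

// True when the supplied feature flags report the Streaming SIMD instructions.
constexpr bool HasStreamingSimd(std::uint32_t edx) {
    return (edx & kFeatureSimd) != 0;
}
```

A late-model Pentium II would report HasFxsr() as true and HasStreamingSimd() as false, which is why the two bits need to be checked separately.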

So, all we need now is an operating system that is aware of the new save and restore state instructions. Fortunately, Windows 95 OSR2, Windows 98, Windows NT 4.0 with Service Pack 4 and Windows NT 5.0 (beta 2) are aware of these instructions and use them, so support problems should be minimal. Windows NT 4.0 requires a special driver, available at http://developer.intel.com, in addition to Service Pack 4. Service Pack 4 is available either at http://www.microsoft.com or through MSDN.

Detecting operating system support for the Streaming SIMD Instructions is simple in principle. If the operating system supports the new state management instructions, it sets the OSFXSR bit (bit 9) in control register 4. While this might seem an ideal way to detect the necessary support, control registers are off limits to general applications: they can only be accessed from Ring 0. On your own development system you can verify the CR4 setting by dropping into NuMega’s SoftICE and executing the CPU command. So how does an application detect operating system support? Assuming you have already detected the presence of the SIMD instructions on the chip, simply execute one: if the OSFXSR bit is not set, the SIMD instructions generate invalid opcode exceptions, and catching that exception tells you that the operating system lacks support.
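If you do read CR4 in a kernel debugger, the value can be decoded by hand or with a helper like the one sketched below. The helper and its names (kCr4Osfxsr, OsEnabledFxsr, CanUseSimd) are hypothetical; it only interprets a value you supply and cannot read CR4 itself, because that still requires Ring 0.

```cpp
#include <cstdint>

// CR4 bit 9: set by an operating system that saves and restores the
// SIMD state with FXSAVE/FXRSTOR.
constexpr std::uint32_t kCr4Osfxsr = 1u << 9;

// True when the supplied CR4 value has the OSFXSR bit set.
constexpr bool OsEnabledFxsr(std::uint32_t cr4) {
    return (cr4 & kCr4Osfxsr) != 0;
}

// SIMD instructions are usable only when the chip has them and the
// operating system has set OSFXSR; otherwise they raise invalid opcode.
constexpr bool CanUseSimd(bool cpu_has_simd, std::uint32_t cr4) {
    return cpu_has_simd && OsEnabledFxsr(cr4);
}
```

For example, a CR4 value of 0x200 on a Pentium III means the SIMD instructions should execute, while a value of 0 means they would fault.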

The subject of exceptions brings me to the last operating system support issue. Just like floating point instructions, the SIMD instructions can generate exceptions for cases such as divide by zero, inexact result, overflow, and so on. I recommend disabling SIMD exceptions within your shipping code, since the SIMD unit will provide reasonable values in situations where an exception would occur. However, in development code, it is useful to enable exceptions to see where (or if) they are happening and if they are important.
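To make enabling and disabling concrete: each SIMD exception has a mask bit in the SIMD control register (MXCSR), and an exception is enabled by clearing its mask bit. The sketch below computes new control values without touching the hardware. The bit positions and the power-on default of 0x1F80 (all exceptions masked) are taken from Intel's architecture manuals rather than from this article, and the identifiers are made up for illustration.

```cpp
#include <cstdint>

// Exception mask bits in the SIMD control register (MXCSR).
// A set bit masks (disables) the corresponding exception.
constexpr std::uint32_t kMaskInvalid    = 1u << 7;
constexpr std::uint32_t kMaskDenormal   = 1u << 8;
constexpr std::uint32_t kMaskDivideZero = 1u << 9;
constexpr std::uint32_t kMaskOverflow   = 1u << 10;
constexpr std::uint32_t kMaskUnderflow  = 1u << 11;
constexpr std::uint32_t kMaskInexact    = 1u << 12;

// Power-on value: every exception masked, no status flags set.
constexpr std::uint32_t kMxcsrDefault = 0x1F80;

// Returns the control value with the given exception enabled (mask cleared).
constexpr std::uint32_t EnableException(std::uint32_t mxcsr, std::uint32_t mask) {
    return mxcsr & ~mask;
}

// Returns the control value with the given exception disabled (mask set).
constexpr std::uint32_t DisableException(std::uint32_t mxcsr, std::uint32_t mask) {
    return mxcsr | mask;
}

// True when the given exception is enabled in the supplied control value.
constexpr bool IsEnabled(std::uint32_t mxcsr, std::uint32_t mask) {
    return (mxcsr & mask) == 0;
}
```

A development build might load EnableException(kMxcsrDefault, kMaskDivideZero) with LDMXCSR, while shipping code would leave the default value in place.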

How do you detect the new exceptions? Unfortunately, SIMD exceptions are not easily detected in current versions of Windows. The processor can take one of two paths when a SIMD instruction generates an exception: either it raises a new exception (protected mode interrupt 19 decimal) or it signals an invalid opcode. Which path it takes depends on the state of the OSXMMEXCPT bit (bit 10) in control register 4. An operating system that supports the new exception should set this bit; otherwise, it must leave the bit clear. No version of Windows except Windows NT 5.0 beta 2 can handle this exception, so SIMD exceptions will appear as invalid opcodes. Perhaps this will change with the upcoming second edition of Windows 98.

If the operating system supports SIMD exceptions, the abstract exception passed through to Win32 applications is STATUS_FLOAT_MULTIPLE_FAULTS. Regardless of the exception generated, it is not trivial for an application to determine which of the floating-point values within a SIMD register caused it.

The listing below returns a bool indicating whether or not the operating system supports the SIMD instructions (it requires the Visual C++ macros described in this article).

bool DetectOSSupport()
{
    bool support = true;

    __try
    {
        _asm
        {
            // Execute a Streaming SIMD instruction
            // and see if an exception occurs.
            ADDPS(_XMM0,_XMM1)
        }
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        // We should really check the reason for the
        // exception in case it is not an illegal
        // instruction but any other exception is
        // very unlikely.
        support = false;
    }

    return support;
}
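The two paths the processor can take, described before the listing, can be summarized as a small decision function. This is only a model of that behavior, not code that touches the processor; the enum and function names are hypothetical, and the interrupt number 6 for invalid opcode comes from Intel's documentation rather than from this article.

```cpp
#include <cstdint>

// What the operating system sees when a SIMD instruction executes.
enum class SimdFault {
    None,           // no fault: masked conditions get a default result
    SimdException,  // protected mode interrupt 19
    InvalidOpcode   // protected mode interrupt 6
};

constexpr std::uint32_t kOsfxsrBit     = 1u << 9;   // CR4: OS uses FXSAVE/FXRSTOR
constexpr std::uint32_t kOsxmmexcptBit = 1u << 10;  // CR4: OS handles interrupt 19

// cr4:              control register 4 as set up by the operating system
// condition_occurs: the instruction hits a case such as divide by zero
// unmasked:         that exception is enabled in the SIMD control register
constexpr SimdFault DeliveredFault(std::uint32_t cr4, bool condition_occurs, bool unmasked) {
    return (cr4 & kOsfxsrBit) == 0        ? SimdFault::InvalidOpcode  // no OS support at all
         : !(condition_occurs && unmasked) ? SimdFault::None
         : (cr4 & kOsxmmexcptBit) != 0     ? SimdFault::SimdException
                                           : SimdFault::InvalidOpcode;
}
```

On most current versions of Windows the second bit is clear, so an enabled divide by zero takes the InvalidOpcode path.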

Detecting support for exceptions is more difficult because it requires changing the SIMD control register. The listing below returns a bool indicating whether the operating system supports SIMD exceptions. Again, this function requires the Visual C++ macros, which are described shortly.

bool DetectExceptionSupport()
{
    bool exception_support = true;
    float test_val[4] = {1.0f, 1.0f, 1.0f, 1.0f};
    DWORD control;

    __try
    {
        _asm
        {
            // Enable divide by zero exceptions by
            // clearing bit 9 in the SIMD control
            // register.
            push ebp
            lea ebp,control
            STMXCSR
            and DWORD PTR [ebp], 0fffffdffh
            LDMXCSR
            pop ebp

            lea eax,test_val

            // clear XMM0, all bits being 0 is 0.0 in
            // floating point
            XORPS (_XMM0,_XMM0)
            MOVUPS (_XMM1,_EAX)
            DIVPS (_XMM1,_XMM0)
        }
    }
    __except(EXCEPTION_EXECUTE_HANDLER)
    {
        // The divide by zero above has caused an
        // illegal instruction exception so the
        // OS must not support SIMD exceptions.
        if (_exception_code() == STATUS_ILLEGAL_INSTRUCTION)
        {
            exception_support = false;
        }
    }

    _asm
    {
        // disable the divide by zero exception
        // again
        or control, 0x200
        push ebp
        lea ebp,control
        LDMXCSR
        pop ebp
    }

    return exception_support;
}

More robust versions of the above functions and the DetectSIMD() function from the previous question have been put into an easy-to-use C++ class for your convenience, and are listed in Detect.cpp and Detect.h. An example using this class is provided in DetectExample.cpp. A prebuilt version of this test application is available as DETECT.EXE.






Copyright © 2000 CMP Media Inc. All rights reserved.