




By Rob Wyatt
Gamasutra
July 9, 1999
Vol. 3: Issue 27

Wyatt's World

Processor Detection and a Pentium III Update
To an application there is no difference between an internal and an
external FPU. The algorithm for detecting the early FPUs relies on being
able to read the contents of the floating-point status and control
registers. If valid data cannot be read from these registers, no FPU is
present; if they return valid data, an FPU is present. The code in this
article only detects the presence of an FPU, not the FPU family.
Generally the family of the FPU is the same as that of the processor,
with the exception that the 386 could use either the 286's external
floating-point processor, the 80287, or its own, the 80387. I doubt this
is going to affect any modern application or game, but if you need to
determine whether a 386 is coupled with an 80287 or an 80387, you need
to compare infinities: the 80287 treats -inf and +inf as equal, whereas
the 80387 does not. The code below detects the presence of an FPU:
fninit                  // reset FP status word
mov [status], 5a5ah     // initialise temp word to non-zero value
fnstsw [status]         // save FP status word
mov ax,[status]         // check FP status word
cmp al,0                // see if correct status was written
jne NO_FPU_PRESENT
The above code is the minimum
that is required. It checks to see if you can read the floating point
status register. To make the detection more reliable, the value obtained
from the status register should be validated. The code below does that
and follows on from the above.
fnstcw [status]         // save FP control word
mov ax,[status]         // check FP control word
and ax,103fh            // see if selected parts look OK
cmp ax,3fh              // check that 1s & 0s correctly read
jne NO_FPU_PRESENT
..
// Floating point is present, go on to determine the family if it is important
..
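The two tests above are plain bit checks on the stored words, so they can be restated in C++. This is a minimal sketch, assuming the status and control words have already been captured by the assembly; the function names are mine and are not part of the article's code. The mask 0x103f selects the six exception-mask bits, which FNINIT sets to 1, and bit 12, which FNINIT clears.

```cpp
#include <cassert>
#include <cstdint>

// After FNINIT the low byte of the status word (the exception flags)
// must read back as zero; the 0x5a5a sentinel would leave 0x5a there.
bool status_word_ok(std::uint16_t status) {
    return (status & 0x00ffu) == 0;
}

// After FNINIT bits 0-5 of the control word (exception masks) are 1
// and bit 12 is 0, so the masked value must be exactly 0x003f.
bool control_word_ok(std::uint16_t control) {
    return (control & 0x103fu) == 0x003fu;
}

// Hypothetical helper combining both checks.
bool looks_like_fpu(std::uint16_t status, std::uint16_t control) {
    return status_word_ok(status) && control_word_ok(control);
}
```

For example, a status word of 0x0000 with a control word of 0x037f (the value a 387 or later holds after FNINIT) passes, while a buffer still holding the 0x5a5a sentinel fails both checks.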
With the advent of the CPUID
instruction, detecting the processor and feature set became a whole lot
easier. Detecting the presence of an FPU became as simple as checking
a single bit. It also became possible for other manufacturers such as
AMD and Cyrix to uniquely identify their processors instead of being associated
with their equivalent Intel parts. Whether or not this is important to
a given application largely depends on the level and type of optimizations
it uses, and the game’s utilization of specific features or instructions.
If you are writing system code that is going to use the Model-Specific
Registers (MSRs), then it is essential that you know the exact make and
model of the processor on which the code is running. MSRs can change between
steppings of the same model of processor.
To use the CPUID instruction, you pass it a function number in the EAX
register. The results are returned in the EAX, EBX, ECX and EDX
registers. The first piece of information that must be determined is the
range of functions that CPUID supports. Fortunately, there is a
documented method of doing this, so no assumptions have to be made about
make and model. You call CPUID with EAX=0, and after the instruction
executes, EAX contains the maximum supported CPUID function. CPUID
function 0 is also used to determine the manufacturer of the processor:
after execution, EBX, EDX and ECX (note the order) contain a 12-byte
string that is unique to each manufacturer, called the vendor string.
The currently known ones are listed below:
| Manufacturer | Vendor String |
|---|---|
| Intel | GenuineIntel |
| AMD | AuthenticAMD |
| Cyrix | CyrixInstead |
| IDT | CentaurHauls |
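The register order matters when rebuilding the vendor string, so here is a minimal sketch of the assembly step. It assumes the three register values from CPUID function 0 have already been read; the helper names are mine, not part of the article's class. Each register contributes four characters, low byte first.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Append the four bytes of a register to the string, low byte first.
void append_register(std::string& out, std::uint32_t reg) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<char>((reg >> shift) & 0xffu));
    }
}

// Build the 12-byte vendor string from the function 0 results.
// Note the parameter order: EBX, then EDX, then ECX.
std::string vendor_string(std::uint32_t ebx, std::uint32_t edx,
                          std::uint32_t ecx) {
    std::string vendor;
    append_register(vendor, ebx);
    append_register(vendor, edx);
    append_register(vendor, ecx);
    return vendor;
}
```

With EBX=0x756e6547, EDX=0x49656e69 and ECX=0x6c65746e, the function yields "GenuineIntel".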
CPUID function 1 provides the signature and feature set and is supported
by all processors that have a CPUID instruction. The remaining functions
are only supported by Intel, although this may change in the future.
These Intel-only functions provide information on cache sizes, and on
the Pentium III they provide the serial number. I am not going to
document the whole CPUID instruction here because it is fully documented
by each manufacturer, and links to their respective web sites are given
at the end of this article. The table below should serve as a summary of
the available CPUID functions:
| Function | Description | Supported by | EAX | EBX | ECX | EDX |
|---|---|---|---|---|---|---|
| 0 | Signature/Vendor String | ALL processors with CPUID | Highest CPUID | Vendor 0-3 | Vendor 8-11 | Vendor 4-7 |
| 1 | Features | ALL processors with CPUID | Signature | Reserved | Reserved | Feature Set |
| 2 | Configuration | Intel Family 6 | Config Data | Config Data | Config Data | Config Data |
| 3 | Serial Number | Intel Pentium III | Reserved | Reserved | Lower 32 bits | Middle 32 bits |
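To make the function 1 results concrete, the sketch below splits the signature in EAX into its fields and tests individual bits of the feature set in EDX. The bit positions follow the manufacturers' published CPUID documentation; the struct, enum and function names are my own illustration and are not the interface of the class discussed in this article.

```cpp
#include <cassert>
#include <cstdint>

// Fields of the processor signature returned in EAX by function 1.
struct Signature {
    unsigned type;      // bits 12-13
    unsigned family;    // bits 8-11
    unsigned model;     // bits 4-7
    unsigned stepping;  // bits 0-3
};

Signature decode_signature(std::uint32_t eax) {
    Signature s;
    s.stepping = eax & 0xfu;
    s.model    = (eax >> 4) & 0xfu;
    s.family   = (eax >> 8) & 0xfu;
    s.type     = (eax >> 12) & 0x3u;
    return s;
}

// A few of the feature bits returned in EDX by function 1.
enum FeatureBit : std::uint32_t {
    FEATURE_FPU  = 1u << 0,   // on-chip floating-point unit
    FEATURE_TSC  = 1u << 4,   // RDTSC time stamp counter
    FEATURE_MSR  = 1u << 5,   // RDMSR/WRMSR model-specific registers
    FEATURE_CMOV = 1u << 15,  // conditional move instructions
    FEATURE_MMX  = 1u << 23,  // MMX instructions
    FEATURE_SSE  = 1u << 25   // Pentium III streaming SIMD instructions
};

bool has_feature(std::uint32_t edx, std::uint32_t bit) {
    return (edx & bit) != 0;
}
```

A signature of 0x0673, for instance, decodes to family 6, model 7, stepping 3, and checking for an FPU reduces to has_feature(edx, FEATURE_FPU).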
AMD has supported the CPUID instruction since the Am486 DX4, and its
initial support provided just the first two functions: Vendor String and
Features. AMD modified the CPUID instruction with the introduction of
the K5 processor by adding extended functions to provide information
beyond what the standard CPUID functions supply. AMD would have had
trouble adding functions with numbers that might clash with Intel’s
plans, so the company added new functions starting at function code
0x80000000. The extended functions work just the same way as the
standard functions, and like the standard functions you call CPUID
function 0x80000000 to obtain the maximum supported extended function.
These functions are very important because they provide the only way of
detecting 3DNow! support. The table below gives a summary of all the AMD
extended functions:
| Function | Description | Supported By | EAX | EBX | ECX | EDX |
|---|---|---|---|---|---|---|
| 80000000 | Extended CPUID Range | ALL | Highest extended CPUID | Reserved | Reserved | Reserved |
| 80000001 | Extended Features | ALL | Extended Signature | Reserved | Reserved | Extended Features |
| 80000002 | Name 0 | ALL | Name | Name | Name | Name |
| 80000003 | Name 1 | ALL | Name | Name | Name | Name |
| 80000004 | Name 2 | ALL | Name | Name | Name | Name |
| 80000005 | Level 1 Cache | ALL | Reserved | TLB Info | L1 D Cache | L1 I Cache |
| 80000006 | Level 2 Cache | K6III, K7 | Reserved | Reserved | L2 Cache | Reserved |
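The three Name functions each return 16 characters of the processor name, giving up to 48 bytes in total. The sketch below shows one way to stitch them together, assuming the twelve register values have already been collected in EAX, EBX, ECX, EDX order for functions 0x80000002 through 0x80000004, and assuming the name is terminated by a zero byte. The function name is hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// regs holds EAX, EBX, ECX, EDX from function 0x80000002, then the same
// four registers from 0x80000003 and 0x80000004 (12 values in total).
std::string processor_name(const std::array<std::uint32_t, 12>& regs) {
    std::string name;
    for (std::uint32_t reg : regs) {
        // Each register carries four characters, low byte first.
        for (int shift = 0; shift < 32; shift += 8) {
            char c = static_cast<char>((reg >> shift) & 0xffu);
            if (c == '\0') {
                return name;  // stop at the terminating zero byte
            }
            name.push_back(c);
        }
    }
    return name;
}
```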
Extended functions may be provided by any processor manufacturer and
they may all be different. At the time of this writing, any processor
that supports AMD’s 3DNow! also supports the AMD extended features
listed above. Intel does not support extended functions, and
unfortunately there is no documented way of detecting the presence of
these extensions short of knowing which makes and models provide them.
On Intel processors, the CPUID function number is clamped to the maximum
standard function range, although this fact is not documented. If the
maximum supported CPUID function is function 2, then the results of
CPUID function 0x80000000 will be the same as those of function 2.
Another way to detect the extended
functions is to call CPUID
function 0x80000000, and
EAX should indicate the
maximum supported extended function. This result should be greater than
0x80000000 -- not equal to it. If it is equal to 0x80000000,
check to see if any other registers have changed. Finally, put the whole
check in a try/except block in case an invalid opcode exception is thrown.
With all of the tests listed here, you can be pretty certain that the
extended functions are present.
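The checks described above can be sketched as a pure function over register values that have already been read. This is only an illustration under stated assumptions: the struct and function names are mine, the clamping test simply compares all four registers against the results of the highest standard function, and the try/except wrapper around the actual CPUID calls is left out.

```cpp
#include <cassert>
#include <cstdint>

struct CpuidRegs {
    std::uint32_t eax, ebx, ecx, edx;
};

bool same_regs(const CpuidRegs& a, const CpuidRegs& b) {
    return a.eax == b.eax && a.ebx == b.ebx &&
           a.ecx == b.ecx && a.edx == b.edx;
}

// ext_base:    results of CPUID function 0x80000000
// highest_std: results of the highest supported standard function
bool extended_functions_present(const CpuidRegs& ext_base,
                                const CpuidRegs& highest_std) {
    // A processor that clamps the function number returns the same
    // data it returns for its highest standard function.
    if (same_regs(ext_base, highest_std)) {
        return false;
    }
    // The reported maximum must lie strictly above the base number.
    return ext_base.eax > 0x80000000u;
}
```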
The CPUID
instruction has been in the Visual C++ assembler since version 5.0, and
has been in all versions of the Intel compiler. Visual C++ 4.x can emit
the byte opcodes (0f, a2) directly into the instruction stream with the
following macros for C/C++. A macro is also provided for older versions
of MASM.
C/C++ Macro:
#define CPUID _asm _emit 0x0f _asm _emit 0xa2
MASM Macro:
CPUID MACRO
db 0fh
db 0a2h
ENDM
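Compilers released after this article expose CPUID without hand-emitted opcodes. The sketch below is not part of the original code: it assumes either the __cpuid intrinsic from Microsoft's <intrin.h> or the __get_cpuid helper from the <cpuid.h> header shipped with GCC and Clang, and it reports failure on anything that is not an x86 build. The CpuidRegs struct, the query_cpuid name and the HAVE_CPUID_INTRINSIC macro are my own.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define HAVE_CPUID_INTRINSIC 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_CPUID_INTRINSIC 2
#else
#define HAVE_CPUID_INTRINSIC 0
#endif

struct CpuidRegs {
    std::uint32_t eax, ebx, ecx, edx;
};

// Returns false when CPUID cannot be queried for this function.
bool query_cpuid(std::uint32_t function, CpuidRegs& out) {
#if HAVE_CPUID_INTRINSIC == 1
    // The Microsoft intrinsic performs no range check of its own.
    int r[4];
    __cpuid(r, static_cast<int>(function));
    out.eax = static_cast<std::uint32_t>(r[0]);
    out.ebx = static_cast<std::uint32_t>(r[1]);
    out.ecx = static_cast<std::uint32_t>(r[2]);
    out.edx = static_cast<std::uint32_t>(r[3]);
    return true;
#elif HAVE_CPUID_INTRINSIC == 2
    // __get_cpuid returns 0 if the function is above the supported range.
    unsigned int a = 0, b = 0, c = 0, d = 0;
    if (!__get_cpuid(function, &a, &b, &c, &d)) {
        return false;
    }
    out.eax = a;
    out.ebx = b;
    out.ecx = c;
    out.edx = d;
    return true;
#else
    (void)function;
    (void)out;
    return false;
#endif
}
```

Calling query_cpuid(0, regs) then gives the maximum standard function in regs.eax and the vendor string in regs.ebx, regs.edx and regs.ecx.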
Most applications will only require the features of a processor, not the
make and model. But these features have to be properly identified. It is
very bad practice to assume anything about the feature set of a
processor from its make and model. For example, a lot of early MMX games
failed to run on AMD K6 processors because the code only looked for the
MMX instructions if an Intel processor was detected. When detecting MMX
you probably are not concerned with whether the processor is a Pentium,
Pentium II, Pentium III or even an AMD. As such, you should only prevent
applications from running if they are going to crash due to invalid
instructions. I recommend future-proofing your game by allowing it to
run if a given processor has the features required for executing the
application, even though the application is not specifically optimized
for that processor. This will become more and more important as
manufacturers add their own technology to their processors and then
license that technology to other manufacturers.
For example, will AMD or Cyrix implement the instructions that were
added to the Pentium Pro, such as cmov, fcmov and fcomip? Even more
important, are any other manufacturers going to implement the Pentium
III streaming SIMD instructions or the 3DNow! instructions? In any case,
your game should not fail to run on a new processor just because the
game was released before the new processor was available.
To help minimize detection problems, the class provided has a set of
generic flags for specific instruction set extensions. By using these
flags, you can trivially check for 3DNow! or Pentium III SIMD
instructions without knowing or caring about the make and model. The
extensions are returned in the upper bits of the get_features() class
member, and the following flags are defined: CPU_PENTIUM,
CPU_PENTIUMPRO, CPU_3DNOW and CPU_SIMD. To use these flags, simply check
for the instruction set that the code base uses. The function below is
an example of the sort of check that should be performed when running a
game that uses 3DNow! instructions. This function returns true on any
processor that is capable of executing 3DNow! instructions, indicating
that execution should continue. Otherwise, it returns false and the
application should report a friendly message to the user. Note that the
code does not check for an AMD processor; if it did, it would fail on an
IDT WinChip2.
bool will_execute_3DNow()
{
    ProcessorDetect proc;
    Uint32_t features;
    if (!proc.get_features(0, &features))
    {
        // an error occurred so assume NO just to be safe
        return false;
    }
    // return the status of the 3DNow! flag
    return (features & CPU_3DNOW) != 0;
}
To be really paranoid, the
above function should check that the required instructions are available
on each and every processor in the machine.
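To show how make-independent flags of this kind could be derived, here is a minimal sketch that maps raw CPUID feature words onto two generic flags. The names CPU_3DNOW and CPU_SIMD come from the text above, but the numeric values, the function names and the mapping itself are my assumptions and are not taken from the class's source. The bit positions are the documented ones: bit 25 of the standard feature set for the Pentium III SIMD instructions, and bit 31 of the extended feature set for 3DNow!.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag values; the real class may define them differently.
enum : std::uint32_t {
    CPU_3DNOW = 1u << 30,
    CPU_SIMD  = 1u << 31
};

// std_edx:      EDX from CPUID function 1
// ext_edx:      EDX from CPUID function 0x80000001
// has_extended: whether the extended functions were detected at all
std::uint32_t extension_flags(std::uint32_t std_edx, std::uint32_t ext_edx,
                              bool has_extended) {
    std::uint32_t flags = 0;
    if (std_edx & (1u << 25)) {
        flags |= CPU_SIMD;
    }
    // Only trust the extended feature word if it really exists.
    if (has_extended && (ext_edx & (1u << 31))) {
        flags |= CPU_3DNOW;
    }
    return flags;
}

// True when every required extension is available.
bool can_run(std::uint32_t available, std::uint32_t required) {
    return (available & required) == required;
}
```

Because the decision depends only on feature bits, a processor from any manufacturer that reports the bit passes the check.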
Below is a list of suggestions that, if followed, should make your code
run on the widest variety of platforms, future and present. Some of the
items in this list are discussed in other parts of this article.
- Do not rely on invalid opcode exceptions to detect processor features.
Undocumented instructions may be present in one model of processor, and
the same opcodes may be used for a completely different purpose in a
later model. All features in new processors have an accepted method of
detection; generally this will be the CPUID feature flags.
- Do not use undocumented or testing features of any processor and do
not rely on the contents of undocumented registers or flags. These
features are likely to be removed or changed in later models, which may
not necessarily result in a change to the processor signature.
- Do not assume anything about the feature set. For example, do not
assume that a Pentium has an FPU. Also do not assume that a later model
such as a Pentium II has all the features of an earlier model, such as a
Pentium Pro. These are obviously different because a Pentium Pro does
not support MMX.
- Do not assume anything about the clock speed of a processor from its
make and model, and do not write speed-dependent timing loops. Instead,
use the RDTSC timer to calibrate and time critical code sections.
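As an illustration of the last point, calibration comes down to dividing a time stamp counter delta by the wall-clock time that elapsed between the two readings. The sketch below covers only that arithmetic; it assumes the two RDTSC readings and the elapsed time were obtained elsewhere, and the function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Estimate the counter frequency in MHz from two RDTSC readings taken
// elapsed_seconds apart (measured with an independent system timer).
double estimate_mhz(std::uint64_t tsc_start, std::uint64_t tsc_end,
                    double elapsed_seconds) {
    if (elapsed_seconds <= 0.0) {
        return 0.0;  // no usable interval
    }
    // Unsigned subtraction stays correct if the counter wrapped once.
    std::uint64_t ticks = tsc_end - tsc_start;
    return static_cast<double>(ticks) / elapsed_seconds / 1.0e6;
}
```

A delta of 225,000,000 ticks over half a second, for example, corresponds to roughly 450 MHz.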