For a few years now, many of us have been using Microsoft's OpenGL under Windows 95 and Windows NT. Sadly, the performance of this implementation has occasionally lagged behind Direct3D and, worse yet, hardware drivers for Windows 95 have been absent or delayed because of confusion about Microsoft's delivery of mini client drivers (MCDs) for Windows 95. The MCD is Microsoft's solution for Windows NT drivers and provides an easy-to-use (albeit low-performance) solution for device driver development. At the 1997 CGDC, it became clear that plans for a Windows 95 MCD architecture were indefinitely delayed. This was disappointing to many game developers, to IHVs already developing MCDs and, I understand, to Silicon Graphics (SGI) as well.
This article will provide an update on the current state of OpenGL, with specific emphasis on SGI's OpenGL. I will also examine the new SGI OpenGL DDK and the resulting hardware driver development process, and discuss how they affect the performance of your application code.
In the spring of 1995, SGI began an independent development process to produce its own OpenGL implementation for Windows. A key SGI Windows product, Cosmo Player, was suffering from slow performance, which made VRML viewing somewhat sluggish. Because of this, SGI's original Windows implementation of OpenGL was called "Cosmo OpenGL" in the hope that it would significantly improve the performance of the Cosmo Player. SGI also wanted OpenGL to be a viable API under Windows, which motivated the company to provide a high-performance sample implementation for licensees. Tremendous energy went into developing this optimized version of OpenGL: run-time code generation (backed by massive amounts of assembly code), proposals to the OpenGL Architecture Review Board (ARB) for high-performance API extensions, and the inclusion of tricks previously reserved for high-end SGI workstations.
Over time, as the code was being developed, interest in the games market grew within SGI. This interest was amplified by Chris Hecker's open letter to Microsoft (see the April/May and June 1997 issues of Game Developer for more information), public commentary regarding OpenGL, and the ensuing Direct3D vs. OpenGL debate.
In February of 1997, SGI released a software-only renderer, the beta version of Cosmo OpenGL. This software-only renderer used an optimized rasterizer (more about this later) and outperformed Microsoft's OpenGL for triangle rendering in common rendering contexts. In July, SGI released MR1, which included MMX acceleration.
Understanding that software performance is never enough, in October SGI released a Windows 95/NT device driver kit (DDK) for independent hardware vendors (IHVs) that lets 3D hardware vendors quickly develop OpenGL drivers for their accelerators. This DDK was designed to be easy to use, to be readily available at little or no cost, and to provide all the code necessary to produce a high-performance OpenGL driver. Most major graphics card vendors are expected to begin providing OpenGL drivers for Windows 95 now that this DDK is available. Currently SGI is developing a DirectDraw-compatible OpenGL, which will not only allow game developers to access DirectDraw simultaneously with OpenGL, but will also make it easier to test transition code from Direct3D.
Microsoft uses an OPENGL32.DLL file for run-time execution. In a somewhat confusing move, SGI delivers an OPENGL.DLL file for the same purpose. In order to use SGI's OpenGL for Windows, a game developer must explicitly link with SGI's libraries and use its header files. There are two reasons for this: the SGI OpenGL contains a set of performance extensions which don't exist in Microsoft's implementation, and the executable header contains the name of the .DLL required at run time.
A little trick comes out of this arrangement. Sometimes it is helpful to test the SGI OpenGL without relinking. This is especially beneficial when you do not have access to linkable code. Frankly, it is also helpful when you just don't want to take the time to relink for tasks like benchmarking. The OPENGL.DLL may simply be renamed OPENGL32.DLL and used with an application linked with the Microsoft OpenGL libraries. If the game uses any SGI-specific extensions, the code should fall back gracefully to another method when the SGI .DLL is not detected by the application.
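One common way to implement that kind of fallback is to look for an extension name in the space-separated string returned by glGetString(GL_EXTENSIONS) before using the extension. The following is a minimal sketch, not SGI's own detection code: the helper name has_extension is hypothetical, and any extension name you pass to it must come from the documentation of the implementation you are targeting. The helper matches whole tokens only, so a name that is merely a prefix of a longer extension name is not reported as present.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `name` appears as a complete,
 * space-delimited token in `ext_list`, otherwise 0.
 *
 * `ext_list` is expected to be the string obtained from
 * glGetString(GL_EXTENSIONS) while a context is current. A NULL list
 * (for example, when no context is current) is treated as "not found".
 */
int has_extension(const char *ext_list, const char *name)
{
    size_t len;
    const char *p;

    if (ext_list == NULL || name == NULL || *name == '\0')
        return 0;
    if (strchr(name, ' ') != NULL)      /* a valid name holds no spaces */
        return 0;

    len = strlen(name);
    p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the hit only if it is bounded by spaces or string ends. */
        int starts_token = (p == ext_list) || (p[-1] == ' ');
        int ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return 1;
        p += len;                       /* substring hit only; keep looking */
    }
    return 0;
}
```

In an application, the result would typically be cached once after the context is bound and then used to choose between the extension path and the plain OpenGL path.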
From this observation, it becomes apparent that SGI could have simply installed OpenGL for Windows right on top of Microsoft's OpenGL. However, SGI took a more Microsoft-friendly approach. The other benefit of this approach is that the game developer can be assured that the SGI-specific extensions are available whenever OPENGL.DLL is available. As an aside, your game can determine which OpenGL it is rendering through by calling glGetString with each of the arguments GL_VENDOR, GL_RENDERER and GL_VERSION after you successfully bind a context. If you check these values before a context is bound, you may get incorrect results back.
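The glGetString check can be wrapped so that missing strings are handled explicitly. This is a minimal sketch under stated assumptions: the helper name describe_gl and the output format are hypothetical, and the helper takes plain C strings so that it can be compiled and exercised without OpenGL headers. It treats a NULL string as a sign that the query was made without a bound context.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes "vendor | renderer | version" into buf.
 * Returns 1 on success. Returns 0 (leaving buf empty or truncated) if
 * any input string is NULL or the buffer is too small.
 *
 * Intended use, after a context has been bound successfully:
 *
 *   char line[256];
 *   describe_gl((const char *)glGetString(GL_VENDOR),
 *               (const char *)glGetString(GL_RENDERER),
 *               (const char *)glGetString(GL_VERSION),
 *               line, sizeof line);
 */
int describe_gl(const char *vendor, const char *renderer,
                const char *version, char *buf, size_t size)
{
    int n;

    if (buf == NULL || size == 0)
        return 0;
    buf[0] = '\0';

    /* NULL here usually means the query ran without a current context. */
    if (vendor == NULL || renderer == NULL || version == NULL)
        return 0;

    n = snprintf(buf, size, "%s | %s | %s", vendor, renderer, version);
    return n >= 0 && (size_t)n < size;
}
```

Logging this line at startup makes it easy to confirm which implementation a benchmark run actually used, which matters when the .DLL has been renamed as described above.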
ICD (Installable Client Driver): contains the entire rendering pipeline of OpenGL. This solution, while providing the highest possible performance, was also daunting to IHVs, because the Microsoft ICD kit required considerable effort to turn into a driver. SGI's DDK is also an ICD, but it comes with a sample driver (for the ViRGE/GX) as an example.
Co-Residence of OpenGL Implementations
The Microsoft and SGI OpenGL implementations may co-reside on a single PC. Which OpenGL rendering pipeline is used depends on how the application was linked and what (if any) 3D hardware is installed.
Let's briefly examine the OpenGL rendering pipeline architecture, which you can see in Figure 1 (below). OpenGL may be present in up to three different rasterizing configurations: software only, installable client driver (ICD) and MCD. Further, two vendors (Microsoft and SGI) now offer ICD kits for IHVs as well as software-only rasterizers. As a result, your rendering may traverse up to five different paths (or up to four paths on any given machine due to MCD under Windows NT). For Windows 95, we need concern ourselves with three common paths: Microsoft software-only rendering, SGI software-only rendering, and ICD. IHVs may choose to offer alternate paths, such as an ICD-less path direct to hardware (path D in Figure 1), but I doubt it will be commonplace.
Figure 1. Various OpenGL Rasterization Paths. (*Note that the window and context management contained in the wgl functions are independent of the rendering pipeline and remain so, regardless of the specific configuration.)
The software rasterizer is present in all configurations and is shown as path A in Figure 1. This is required, even when 3D hardware is present, in order to handle pixel formats and states which are not supported by the 3D hardware. For example, the 3D hardware may not support per-pixel fog. In this case, rendering would revert to software rasterization, even though 3D hardware is present.
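The fallback rule described above amounts to a subset test: the hardware path is usable only when every state the application requires is something the hardware supports. The sketch below illustrates that idea with a bitmask. It is an assumption-laden illustration, not how either Microsoft's or SGI's implementation actually makes the decision; the feature bits, the raster_path type and the choose_path function are all hypothetical names.

```c
/* Hypothetical feature bits describing rendering states in use. */
enum {
    FEAT_TEXTURE   = 1 << 0,
    FEAT_DEPTH     = 1 << 1,
    FEAT_BLEND     = 1 << 2,
    FEAT_PIXEL_FOG = 1 << 3
};

/* Hypothetical result type: which rasterizer handles the drawing. */
typedef enum { PATH_SOFTWARE, PATH_HARDWARE } raster_path;

/*
 * Chooses the hardware path only when hardware is present and every
 * required feature bit is also set in hw_supported. Any unsupported
 * requirement sends rendering back to the software rasterizer.
 */
raster_path choose_path(unsigned required, unsigned hw_supported,
                        int hw_present)
{
    if (!hw_present)
        return PATH_SOFTWARE;

    /* Bits that are required but not supported must be empty. */
    if ((required & ~hw_supported) != 0)
        return PATH_SOFTWARE;

    return PATH_HARDWARE;
}
```

With these definitions, an application that enables per-pixel fog on hardware reporting only FEAT_TEXTURE | FEAT_DEPTH would be routed to PATH_SOFTWARE, which mirrors the fog example in the text.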
The software rasterizer in OpenGL for Windows has been aggressively tuned for maximum triangle rendering performance. The actual rasterization code is produced at run time using a code generator and takes advantage of special resources (like MMX) when they are present.
The ICD is the standard solution for 3D hardware support using OpenGL (path B in Figure 1). While it is possible to produce a hardware driver without ICD support, it is not recommended for shipping products. During driver development, it is natural to write the code for context creation and triangle drawing first and to add ICD loading last. Completing the driver with ICD loading allows applications linked with the Microsoft OpenGL library to execute properly on your hardware.
The MCD operates only in the Windows NT environment (path C in Figure 1). MCD was designed as an abstraction of the rasterization layer. A couple of dozen functions provide support for pixel format management, texture handling and the drawing of primitives. A special structure, the MCD command buffer, is used to pass data across the I/O layer from user mode to kernel mode. Most of the actual MCD code operates in kernel mode, presumably to make the best use of the limited data bandwidth across the I/O layer into kernel mode.