Linux in Game Development

by:

Bernd Kreimeier

Senior Programmer
Loki Software, Inc.

Sam Lantinga

Lead Programmer
Loki Software, Inc.

Keith Packard

XFree86 Core Team Engineer
SuSE, Inc.

Daryll Strauss

Practice Lead for Contract Engineering
VA Linux Systems

Michael Vance

Software Engineer
Treyarch, LLC

Table of Contents
Introduction
Linux as a Development Platform
Outline of the Tutorial
The Linux Toolchain
Games and XFree86
Simple DirectMedia Layer
OpenGL for Linux
Linux audio with OpenAL
Tutorial Schedule
The Linux Toolchain
Filesystems
Multi-user environment
Run-time linking
Inline assembly
MASM / Inline MSVC
NASM assembly modules
GAS / Inline GCC
Microsoft C/C++ vs. GNU C/C++
Games and XFree86
Introduction
XFree86 2D APIs
X Core Protocol
Shared Memory Extension
Xv Extension
X Rendering Extension
XFree86 Device Input
XFree86 Mouse Support
XFree86 Keyboard Support
The DGA Extension
DGA 1.0
DGA 2.0
Conclusions
Simple DirectMedia Layer
Description
What is it?
What can it do?
What platforms does it run on?
OpenGL for Linux
Sources of OpenGL for Linux
OpenGL Implementations: Mesa
Mesa
Mesa Software Rendering
Mesa Software Rendering for X
Mesa Hardware Acceleration
Mesa + Glide
OpenGL Implementations: Utah GLX
Mesa + Utah GLX
Utah-GLX Hardware Support
Utah-GLX
OpenGL Implementations: DRI
Direct Rendering Infrastructure (DRI)
Indirect versus Direct Rendering
Goals of the DRI
Myths about the DRI
Architecture of a DRI Driver
DRI Hardware Support in XFree86 4.0 (and later)
Voodoo3 / 16-bit RGB, 16-bit Z buffer
Matrox G200, G400
ATI Rage 128
ATI Radeon
Intel i810
3Dlabs Oxygen
Future DRI Work
Integrating X: GLX, Qt, Gtk+, Toolkits
Basic Steps for OpenGL Widgets
GLX
OpenGL with Qt
OpenGL with GTK+
Toolkits
Benefits of Open Source
Conclusion
References
Spatialized Audio with OpenAL
Overview
What is the OpenAL Audio System?
Programmer's View of OpenAL
Implementor's View of OpenAL
Our View
A Quick Tour
Listener
Sources
Buffers
Comparison to DS3D and I3DL2
Implementations and Drivers
Status and Outlook
References
Online Version of Tutorial
Credits

Introduction

Linux as a Development Platform

For years, Linux has proven itself as the backbone of Internet gaming, providing thousands of servers for games such as the Quake titles or Half-Life. Competitive hardware drivers combined with its robust and versatile operating system make the Linux desktop a professional target platform, covering the range from SGI's Visual Workstations to PCs, settop boxes, and embedded systems. Participants can expect a thorough review of development on and for Linux, distilled from the hands-on experience of the Loki coders who ported titles such as Myth 2, Heretic 2, Heavy Gear 2, and Civilization: Call to Power to Linux. Topics include the Linux state of the art, cross-platform development, Linux as a server platform, OpenGL graphics for Linux, streaming media and 2D graphics, digital and spatialized audio, input devices, portable data formats, and cooperative development.

Outline of the Tutorial

The presentations in this tutorial describe Linux as a development platform, starting with an introduction to the most relevant parts of the operating system and environment, and proceeding to a more detailed explanation of the architecture and its implications for portability, performance, and maintenance. The tutorial has five major parts plus a brief introduction.

This tutorial does not discuss the merits of Linux as a target platform and market. Instead, the presentations aim to establish the utility and viability of Linux as a development platform. While this might primarily be of interest for backend service solutions (dedicated and master servers), the costs of creating and maintaining a full implementation of a game on the Linux platform are often more than offset by the increased robustness and resulting portability of the codebase, especially for projects that include optional ports to platforms like PSX2 or MacOS X. In this respect, Linux as the second (or even primary) development platform is particularly attractive due to its unparalleled transparency to the developer, and the price point.

The Linux Toolchain

Michael Vance describes the Linux toolchain for software development: compiler, linker, system libraries, support for C, C++, assembly. In particular, this presentation will cover common pitfalls of single platform (i.e. primarily Win32-only) development that increase the costs of post-shipment ports to other platforms. The list of example cases is relevant not only for game projects that require dedicated servers or tools (CGI, preprocessing) under Linux/UNIX, but also for console ports that use derivatives of the GCC compiler (PSX2). This presentation is based on the experience of nearly two years of commercial porting at Loki (from Win32 and Mac, to Linux and other UNIX platforms). It also takes into account insights from ongoing cooperative development projects (where the Linux port is done in parallel to the Win32 development).

Games and XFree86

Keith Packard describes the XFree86 implementation of the X11R6 X Window System for Linux i386 and other platforms, including standard X extensions and XFree86-specific extensions. This presentation provides the basics of window creation and management, retrieval and handling of input events, and drawing of 2D graphics. XFree86 is also the foundation with which hardware accelerated 3D graphics is integrated.

Simple DirectMedia Layer

Sam Lantinga introduces the platform abstraction provided by the open source SDL library. His presentation shows sample code to interact with XFree86, taken from the SDL implementation for Linux. In comparing the UNIX implementation with the SDL implementation for Win32 (or other supported platforms), developers will be able to determine which platform abstraction is needed for their project, and how idioms from DirectX map to their XFree86/UNIX equivalents. SDL can also be used as a toolkit for fast prototyping, or as an alternative to GLUT. With few exceptions, all games ported by Loki use SDL.

OpenGL for Linux

Daryll Strauss' presentation gives an overview of the availability and state of OpenGL implementations for Linux. This includes software-only solutions based on the Mesa open source project, which are not only a valuable reference, but might also offer benefits for debugging. His presentation will focus on the DRI implementation of GLX for XFree86, which is also available as open source and covers a wide range of hardware accelerators.

Linux audio with OpenAL

Bernd Kreimeier presents the progress of a joint effort between Loki and Creative Labs to create a vendor neutral, cross-platform API for spatialized audio, OpenAL. The OpenAL reference implementation for Linux, maintained by Loki, is used in many Linux ports of games that Loki shipped in 2000, and work is under way to support more platforms including hardware support.

Tutorial Schedule

The schedule below is dictated by the official schedule for coffee and lunch breaks. We will most likely push back each coffee break by about 15 minutes, and (depending on the final version of the presentations) might have to change the order.

Table 1. Tutorial Schedule

Topic                        Time               Speaker
Introduction                 10:00am - 10:05am  Bernd Kreimeier
Linux Toolchain              10:05am - 11:00am  Michael Vance
AM Coffee Break              11:00am - 11:15am  -
XFree86 for Games            11:15am - 12:30pm  Keith Packard
Lunch Break                  12:30pm - 2:00pm   -
Platform abstraction in SDL  2:00pm - 3:00pm    Sam Lantinga
3D Graphics for Linux        3:00pm - 4:00pm    Daryll Strauss
PM Coffee Break              4:00pm - 4:15pm    -
OpenAL audio                 4:15pm - 5:15pm    Bernd Kreimeier
Panel/Questions              5:15pm - 6:00pm    All Speakers

The Linux Toolchain

In porting to Linux, mapping Win32 APIs to UNIX APIs (e.g. D3D to GL) is often less problematic than other tasks. The only way to attempt portable code is to actually develop, compile and link on multiple platforms.

This section describes the Linux environment (multi-user, protected filesystem) and toolchain (compiler, linker etc.), how they differ from their Win32 equivalents, and how these differences affect the portability of code.

Filesystems

  • FOOBAR.TXT vs. foobar.txt vs. foobar.TXT vs. fooBar.txt

  • Toolchain (i.e. #include), and file loading

  • Just downcase? Nope, bad if absolute paths

  • Relative paths as a solution? Nope, need $HOME to write

  • Beware explicit paths in CRC verified pack files (sv_pure)
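
One way to act on the points above is to lower-case only the game-relative part of a file name and leave the install prefix untouched. The following C sketch assumes that the data files are installed with lower-case names; the function name build_data_path and its parameters are made up for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Join an install directory and a game-relative file name.
 * The install directory is copied verbatim (it may legitimately contain
 * upper-case characters); only the relative part is lower-cased and has
 * its backslashes turned into forward slashes.
 * Returns 0 on success, -1 if the result does not fit into 'out'.
 */
int build_data_path(char *out, size_t out_size,
                    const char *base_dir, const char *rel_name)
{
    size_t base_len, rel_len, i;

    assert(out != NULL && base_dir != NULL && rel_name != NULL);

    base_len = strlen(base_dir);
    rel_len = strlen(rel_name);
    /* base + '/' + relative name + terminating NUL */
    if (base_len + 1 + rel_len + 1 > out_size)
        return -1;

    memcpy(out, base_dir, base_len);
    out[base_len] = '/';
    for (i = 0; i < rel_len; i++) {
        unsigned char c = (unsigned char)rel_name[i];
        out[base_len + 1 + i] = (c == '\\') ? '/' : (char)tolower(c);
    }
    out[base_len + 1 + rel_len] = '\0';
    return 0;
}
```

With base_dir "/opt/MyGame" and rel_name "Maps\E1M1.BSP" this produces "/opt/MyGame/maps/e1m1.bsp". Files that the game writes would go through a separate, per-user path.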

Multi-user environment

  • File permissions and per-user settings

  • You can't piddle all over the drive

  • Most Win32 and MacOS apps are not designed with per-user specifics in mind

  • Linux doesn't provide a globally writable registry (passwords)

  • Hardware accessible through devices (audio permissions)
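
Per-user settings and savegames are usually kept below the user's home directory. A minimal C sketch, assuming a dot-directory layout of the form $HOME/.<app>/<file>; the function names and the layout are illustrative assumptions, not a fixed convention.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Build "<home>/.<app>/<file>" into 'out'.
 * 'home' is normally getenv("HOME"); it is a parameter so that callers
 * can decide what to do when HOME is not set.
 * Returns 0 on success, -1 on failure.
 */
int user_file_path(char *out, size_t out_size,
                   const char *home, const char *app, const char *file)
{
    assert(out != NULL && app != NULL && file != NULL);

    if (home == NULL || home[0] == '\0')
        return -1;                      /* no home directory known */

    /* "/." plus "/" plus terminating NUL = 4 extra bytes */
    if (strlen(home) + strlen(app) + strlen(file) + 4 > out_size)
        return -1;                      /* would not fit */

    snprintf(out, out_size, "%s/.%s/%s", home, app, file);
    return 0;
}

/* Convenience wrapper that reads HOME from the environment. */
int user_file_path_env(char *out, size_t out_size,
                       const char *app, const char *file)
{
    return user_file_path(out, out_size, getenv("HOME"), app, file);
}
```

The directory itself still has to be created before the first write, and read-only game data stays in the install directory.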

Run-time linking

  • ELF format

  • Win32 run-time linker vs ld.so

  • "Hierarchical" linking under Win32

  • GCC/ld missing some duplicate symbols

  • MSVC forgives all: extern and static for the same symbol

  • ld.so -- repeated loading/unloading ref count might fail

  • ld.so - thread-safe?

  • Beware: glibc ABI breakage

Inline assembly

Inline assembly is a porting and maintenance problem. Some projects have had more than 20,000 lines of it. Inline MASM can be converted to inline GAS. Assembly code that is maintained separately can be handled by NASM.

MASM / Inline MSVC

As a simple example: fast ftol() from GPL Quake1. More complicated examples include the pipelined FDIV span blitting code in the Q1 software rasterizer (8-bit palette color) and its 16-bit equivalent in Heretic2 (both ported to Linux). This example does not illustrate the difficulties of assembly interleaved with C instructions, which cannot be handled by NASM unless the C is converted to assembly.

Example 1. MASM ftol

    #pragma warning (disable:4035)
    __declspec( naked ) long Q_ftol( float f )
    {
            static int tmp;
            __asm fld dword ptr [esp+4]
            __asm fistp tmp
            __asm mov eax, tmp
            __asm ret
    }       
      

NASM assembly modules

Fairly trivial porting of code in separate modules. It uses Intel-style syntax like MASM.

Example 2. NASM ftol


    segment .data
    temp dd 0.0
    segment .text

    global Q_ftol
    Q_ftol:
            fld dword [esp+4]
            fistp dword [temp]
            mov eax, [temp]
            ret
       
       

GAS / Inline GCC

AT&T style syntax, with heavy sugar. If assembly instructions are interleaved with C code, the syntax conversion from MSVC inline assembly to GCC inline assembly will force duplication of source code and time consuming translation.

Example 3. GAS ftol

    inline long Q_ftol( float f )
    {
        static long temp;

        __asm__ __volatile__( "flds %1    \n\t"
                              "fistpl %0  \n\t"
         : "=m" (temp)
         : "m" (f)
        );
        return temp;
    }
       

Microsoft C/C++ vs. GNU C/C++

MSVC vs. ANSI C/C++. Aside from differences between the two compilers (MSVC,GCC) that are legitimate with respect to the ANSI specifications, MSVC also extends/contradicts ANSI in some simple and some very subtle cases.

  • Anonymous structs/unions

  • #pragma

  • Sequence points

  • Unnamed temporaries

  • Illegal Promotion

  • STL vs. STLport

  • FPU Controlword

  • typecast operator in vararg

  • Misc. Evil

Anonymous struct/unions

Anonymous structures are not defined for either C or C++. Anonymous unions are not defined for C. MSVC allows both in either language. GCC now has an extension to partially support these, but for portability reasons these MSVC extensions should be avoided. A typical use is to access the same memory area in different symbolic ways, which might fail on non-Intel hardware.

Example 4. Anonymous Struct/Union

    typedef struct {
        union {
            unsigned int col;
            struct {
                unsigned char r, g, b, a;
            };
        };
    } color_t;       
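
A portable alternative is to keep the components in a named structure and compute the packed value with shifts. This is a sketch only; the names color_pack and color_unpack are illustrative, and the 0xAABBGGRR layout is an assumption chosen to match what the union above yields on little-endian x86.

```c
#include <stdint.h>

/* Named struct instead of an anonymous struct/union overlay. */
typedef struct {
    uint8_t r, g, b, a;
} color_t;

/*
 * Pack into 0xAABBGGRR. Because the value is computed with shifts and
 * not read through a union, the result is the same on every byte order.
 */
uint32_t color_pack(color_t c)
{
    return (uint32_t)c.r
         | ((uint32_t)c.g << 8)
         | ((uint32_t)c.b << 16)
         | ((uint32_t)c.a << 24);
}

/* Inverse of color_pack(). */
color_t color_unpack(uint32_t col)
{
    color_t c;
    c.r = (uint8_t)(col & 0xFF);
    c.g = (uint8_t)((col >> 8) & 0xFF);
    c.b = (uint8_t)((col >> 16) & 0xFF);
    c.a = (uint8_t)((col >> 24) & 0xFF);
    return c;
}
```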
       

#pragma

  • Compiler specific, inherently non-portable

  • #pragma once is biggest offender

  • #pragma pack(push, 1) for structure alignment, then fread/fwrite the structures to disk. With GCC, use __attribute__ ((packed)) instead

  • Patch compiler to accept MSVC pragma

  • Also used to turn off common warnings
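
Instead of relying on packed structures and fread/fwrite, each field can be written byte by byte in a fixed order, which needs neither a packing pragma nor a particular byte order. The record type entry_t and its 6-byte little-endian disk layout are hypothetical, chosen only to illustrate the idea.

```c
#include <stdint.h>

/* Hypothetical record: 2-byte id followed by a 4-byte offset. */
typedef struct {
    uint16_t id;
    uint32_t offset;
} entry_t;

/* Size on disk; sizeof(entry_t) is usually larger because of padding. */
enum { ENTRY_DISK_SIZE = 6 };

/* Serialize one record as little-endian bytes. */
void entry_write(unsigned char out[ENTRY_DISK_SIZE], const entry_t *e)
{
    out[0] = (unsigned char)(e->id & 0xFF);
    out[1] = (unsigned char)((e->id >> 8) & 0xFF);
    out[2] = (unsigned char)(e->offset & 0xFF);
    out[3] = (unsigned char)((e->offset >> 8) & 0xFF);
    out[4] = (unsigned char)((e->offset >> 16) & 0xFF);
    out[5] = (unsigned char)((e->offset >> 24) & 0xFF);
}

/* Inverse of entry_write(). */
void entry_read(entry_t *e, const unsigned char in[ENTRY_DISK_SIZE])
{
    e->id = (uint16_t)(in[0] | (in[1] << 8));
    e->offset = (uint32_t)in[2]
              | ((uint32_t)in[3] << 8)
              | ((uint32_t)in[4] << 16)
              | ((uint32_t)in[5] << 24);
}
```

The resulting buffer can be passed to fwrite as plain bytes, so the file format no longer depends on compiler-specific structure layout.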

Sequence points, order of evaluation

The ANSI specification leaves certain details of how a given C expression is evaluated undefined or unspecified, so code that happens to work with one compiler may behave differently with another.

  • Sequence points: ++x * ++x

  • Function calls: foo( nextfoo( ), nextfoo( ) )

  • Undefined order of operation
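
When the order of side effects matters, it can be fixed by giving each call its own statement. A small sketch using the nextfoo/foo names from the list above; their bodies here are invented purely for demonstration.

```c
static int counter = 0;

/* Returns 1, 2, 3, ... on successive calls (side effect on 'counter'). */
int nextfoo(void)
{
    return ++counter;
}

/* Combines its arguments so that their order is visible in the result. */
int foo(int first, int second)
{
    return first * 10 + second;
}

/*
 * foo(nextfoo(), nextfoo()) may evaluate either call first.
 * Separate statements end in sequence points, so the order is fixed.
 */
int call_in_order(void)
{
    int a = nextfoo();   /* evaluated first  */
    int b = nextfoo();   /* evaluated second */
    return foo(a, b);
}
```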

Unnamed temporaries

Unnamed temporaries may only be bound to const references. MSVC breaks this in non-strict mode.

Example 5. Unnamed temporaries


    class A { };

    #ifdef __GNUC__
    void foo( const A& a ) { }
    #else
    void foo( A& a ) { }
    #endif

    int main( int argc, char* argv[] )
    {
            foo( A( ) );
    }
       
       

Illegal promotion

MSVC sometimes allows illegal promotions:

Example 6. Illegal promotion


    class R {
            public: operator std::string() const;
    };

    class I {
            public: I( const std::string& s );
    };

    void foo( void ) {
           R r;
    #ifdef __GNUC__
           I i( r );
    #else
           I i = r;
    #endif
    }
       
       

STL vs. STLport

MSVC's STL differs from the STL standard. The alternative, STLPort, is considered more correct and better implemented. STLPort exists for multiple platforms, incl. Win32 and Linux. URL: http://www.stlport.org/.

As an example: list<>.size() is O(1) in MSVC STL, O(n) in others. Code that removes from a list using list<>.size() in the termination clause will be slow with other STL implementations. Use list<>.empty() instead. Interestingly enough, Alexander Stepanov states: "size() used to be linear time in case of STL lists. It was the right decision since if a user wants to keep a count it is easy to do, but usually you do not need it and it makes splice linear time (it used to be constant). But the standard committee insisted that I change it to constant time. I had no choice. We will have to change this requirement eventually."

FPU Controlword

NaN (not a number) is unordered, so any comparison involving NaN is false, except for the not-equals comparison (!=), which is true. NaN will therefore slip through range checks like

        if ( maybe_nan < min )
          return;
        if ( maybe_nan > max )
          return;
        int ptr_offset = (int)(maybe_nan*SCALE);
     

To catch divide by zero, range overflow/underflow etc., unmask the respective FPU exceptions so that they raise SIGFPE. This is straightforward under Linux, but DirectX/Direct3D seemingly implement persistent changes of the FPU control word (FPUCW). Even if the respective flags are used (cooperative level), Win32 interferes with SIGFPEs. This is one example where simultaneous development and testing under Linux can save significant amounts of time.

Miscellaneous Evil(tm)

  • Win32 memory manager is very forgiving--double frees, etc.

  • Win32 memory is initialized

  • Win32 applications serialize (savegames) writing function pointers to disk (eek)

  • Wide strings (Unicode? Ha!)

  • SMP/threading issues

Games and XFree86

Introduction

Developing interactive graphics applications using the XFree86 window system benefits from detailed architectural knowledge of how the XFree86 code works inside. This paper presents an overview of what's available in the current implementation, along with information about how those pieces are constructed and how they interact with the rest of the window system.

XFree86 2D APIs

XFree86 includes several mechanisms for presenting information on the screen.

X Core Protocol

The original X protocol, developed in 1987 at MIT, uses a traditional rasterop imaging model along with PostScript-inspired geometric primitives. The result is something which serves simple office applications well enough, but which falls far short of the requirements for many 2D games. The basic rasterop fill/blt model contains many useful operations but lacks some significant ones.

One obvious omission from the core protocol is transparent blts; there's no way to draw a sprite on the screen without using two blt operations, one to clear the area out and a second to paint the object. This ends up slower than the usual pixel-keyed blt operation and can also generate screen artifacts.
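
For reference, a pixel-keyed blt is simple to express in software. The following C sketch copies 8-bit pixels and skips the key color; it operates on plain memory buffers and is not part of any X API, and the function name is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy a w x h block of 8-bit pixels from src to dst, skipping every
 * source pixel equal to 'key'. Pitches are given in bytes per row.
 */
void blit_keyed8(uint8_t *dst, size_t dst_pitch,
                 const uint8_t *src, size_t src_pitch,
                 size_t w, size_t h, uint8_t key)
{
    size_t x, y;

    assert(dst != NULL && src != NULL);

    for (y = 0; y < h; y++) {
        for (x = 0; x < w; x++) {
            uint8_t p = src[y * src_pitch + x];
            if (p != key)               /* key pixels stay transparent */
                dst[y * dst_pitch + x] = p;
        }
    }
}
```

A single pass over the sprite is enough, which is why the two-blt workaround in the core protocol compares poorly.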

Another problem is that image transmission from the application to the screen must pass through a network connection; even a local socket takes a significant amount of time to transmit large images. This is exacerbated when transmitting images from the server back to the application by the round trip latency.

However, for operations which X does support, performance is uniformly excellent, largely due to the XFree86 4.0.2 XAA architecture which makes it easy to accelerate a wide variety of common operations with a few short device-specific functions. Because of this commonality, operations which are fast on one card are usually fast on all cards.

XFree86 4.0.2 provides a common video memory allocation scheme which places "reasonable" pixmaps in off-screen memory. This allows rendering using the accelerator as well as dramatically improving the performance when moving data between pixmaps and the screen.

Where it works, the core protocol is fast and efficient. Additionally, if your application runs efficiently using only the core protocol, it's likely to perform well over a network.

Shared Memory Extension

To ameliorate the performance impact of the network connection, the MIT-SHM extension allows the bulk image data to be transmitted between the application and the window system in shared memory. As this requires no user-to-kernel-to-user copy, performance is dramatically increased.

The shared memory extension can also be used to share pixmap data between the client and server. As the data reside in regular system memory, there's no performance benefit when doing bulk image transfers. The only advantage is that core rendering requests can be used and the results are directly observable by the application in the shared buffer.

All X servers support this extension so applications can count on its presence. However, when a network separates the application from the window system, the application must fall back to regular images. As the semantics of shared memory data differ from the network transport, there's no easy way to hide the failure from the application and paper over it within the library.

Xv Extension

The Xv extension was designed way back in 1991 by Dave Carver in response to the overly complicated VEX extension. It assumes that video data will be digital and provides a simple PutImage-style model for moving that data from the application to the screen.

Xv supports RGB and YUV image formats. Most modern graphics chips can directly accelerate the scaling and display of these formats on the screen, usually by way of an "overlay" where the hardware stores the image in YUV format and automatically substitutes portions of the image using a color key value. Some hardware can use YUV texturing support and write an RGB representation of the image to the regular frame buffer. This allows for multiple video applications to run at the same time, but is somewhat slower.

When the video data lies in a separate overlay, the video chip converts the data as it is displayed on the screen so no RGB representation is available to the application.

Lots of XFree86 drivers support Xv today:

  • Alliance Promotion

  • ATI R128 and Radeon

  • Chips and Technologies

  • GLINT Permedia 2, 3

  • Matrox G200/G400

  • S3 ViRGE

  • 3dfx

  • Trident (as of CYBER9397 or above)

  • Silicon Motion

  • S3 Savage

X Rendering Extension

The X rendering extension provides an additional rendering model based on image composition using alpha values. The basic operation is:

dst = (src IN mask) OP dst

OP can be any of the Porter/Duff binary image operators like 'OVER' along with a few extra ones gleaned from OpenGL.
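
To make the formula concrete, here is a software sketch of dst = (src IN mask) OVER dst for a single pixel. It assumes 32-bit ARGB pixels with premultiplied alpha and a mask that carries only an 8-bit alpha value; the function names are illustrative and this is not code from the Render implementation.

```c
#include <stdint.h>

/* Multiply two 8-bit fractions (0..255 standing for 0.0..1.0), rounded. */
static uint8_t mul8(uint8_t a, uint8_t b)
{
    unsigned t = (unsigned)a * b + 128;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* dst = (src IN mask) OVER dst for one premultiplied ARGB32 pixel. */
uint32_t composite_over(uint32_t src, uint8_t mask_alpha, uint32_t dst)
{
    uint32_t out = 0;
    /* Alpha of (src IN mask); it controls how much of dst shows through. */
    uint8_t src_a = mul8((uint8_t)(src >> 24), mask_alpha);
    int shift;

    for (shift = 0; shift <= 24; shift += 8) {
        /* IN: scale each source channel by the mask alpha. */
        uint8_t s = mul8((uint8_t)(src >> shift), mask_alpha);
        uint8_t d = (uint8_t)(dst >> shift);
        /* OVER: source plus destination attenuated by (1 - source alpha). */
        unsigned v = s + mul8(d, (uint8_t)(255 - src_a));
        if (v > 255)
            v = 255;
        out |= (uint32_t)v << shift;
    }
    return out;
}
```

With a fully opaque source and a mask alpha of 255 the result is the source pixel; with a mask alpha of 0 the destination is left unchanged.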

Because the basic imaging model supports an 'alpha' channel, applications can present images that include transparency or even translucency. This may prove useful in some game application design.

To ease application development, Render requires that all X servers support a 32bpp ARGB image format; the commonality allows applications to use a single image model instead of needing to support every possible image format.

Shared memory pixmaps may well prove useful in this environment; Render doesn't currently perform any image transfer operations on its own, relying instead on the core routines.

While much of the render extension has stabilized and is useful today, the implementation currently lacks support for polygons and affine image transforms.

XFree86 Device Input

The X server mediates all access to the keyboard and mouse whenever it is active. This means that even OpenGL and DGA applications must understand how the X input mechanisms affect input available to them.

For XFree86 4, a new input device architecture was written, allowing alternate input devices like joysticks, touchpads, and tablets. This new architecture also includes support for the mouse, but not for the keyboard. The current keyboard code is quite a mess, having been hacked on for about a decade. This means that input support for the mouse and for the keyboard differs significantly in behavior and performance.

XFree86 Mouse Support

Input from mice and other non-keyboard devices is immediately read by the X server using SIGIO. As soon as the current X request is finished, the pending input is delivered to waiting applications. This minimizes the latency between mouse motion and application receipt as much as possible in a single-threaded X server. Typical latencies are well below 1ms; the only variable under application control is the maximum amount of time spent executing a single X request, as the X server always finishes any executing request before delivering events.

XFree86 Keyboard Support

The keyboard driver is completely different; input is only noticed when the X server hits 'select' and notices that the keyboard file descriptor has pending data available. The input is then read and delivered immediately. When the X server is idle, the input is read almost immediately. However, when the X server is busy executing requests, it only checks for available keyboard data every 20ms. While this is significantly better than XFree86 3.3, it still represents a significant latency problem.

The bulk of the alternate input device support is for graphics tablets. The existing code to support Linux joysticks is quite broken. If your application needs access to such devices, it must use the kernel interfaces directly. This has the additional benefit of reducing latency in many cases.
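
A minimal sketch of reading a joystick through the Linux kernel joystick interface (struct js_event from <linux/joystick.h>). The device path is passed in by the caller because it varies between systems; the helper names js_is_button_press and js_dump are made up for this example.

```c
#include <fcntl.h>
#include <linux/joystick.h>
#include <stdio.h>
#include <unistd.h>

/* Returns 1 for a real (not synthetic start-up) button press event. */
int js_is_button_press(const struct js_event *e)
{
    return (e->type & JS_EVENT_INIT) == 0
        && (e->type & JS_EVENT_BUTTON) != 0
        && e->value != 0;
}

/*
 * Print events from a joystick device until read() fails or the device
 * goes away. Returns -1 if the device cannot be opened, 0 otherwise.
 */
int js_dump(const char *device)
{
    struct js_event e;
    int fd = open(device, O_RDONLY);

    if (fd < 0)
        return -1;

    /* Each blocking read() returns exactly one event. */
    while (read(fd, &e, sizeof e) == (ssize_t)sizeof e) {
        if (js_is_button_press(&e))
            printf("button %u pressed\n", (unsigned)e.number);
        else if ((e.type & ~JS_EVENT_INIT) == JS_EVENT_AXIS)
            printf("axis %u = %d\n", (unsigned)e.number, (int)e.value);
    }
    close(fd);
    return 0;
}
```

A game would normally open the device in non-blocking mode or poll it from its main loop instead of blocking in read().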

The DGA Extension

The Direct Graphics Access extension allows applications to get the window system out of the way and completely control a full-screen display.

DGA 1.0

Version 1.0 of the DGA extension provides

  • Direct framebuffer access

  • Relative mouse motion events

  • Screen viewport control for panning and double-buffering

  • Full-screen access only

DGA is useful for applications that must both take over the entire screen and bypass the X rendering model to go straight to the frame buffer. This can provide significant acceleration for many applications, but the price is a lack of window support and, with DGA 1.0, a lack of hardware acceleration.

DGA provides controls for setting the portion of the hardware frame buffer which is displayed on the screen. This allows hardware panning and also double buffering. It also provides a bypass mechanism for the hardware colormap. Normally the X window manager is in charge of getting the right colormap installed, but in full-screen mode, the window manager cannot perform this task.

DGA 2.0

Version 1.0 was developed outside of XFree86 and incorporated into several 3.3 based releases. With XFree86 4, the new driver architecture required significant changes in how DGA operates and a new implementation along with an enhanced extension were developed. In addition to the above list, DGA 2.0 provides:

  • Hardware acceleration

  • Xlib rendering

  • New events

  • Mode selection

DGA 2.0 provides three rendering primitives that are accelerated by the driver: video-memory copies, solid fills, and pixel-keyed transparent copies. They operate only in full-screen mode, so windowed applications can't take advantage of the transparent copy.

DGA 2.0 can also provide rendering using any Xlib (or extension) drawing commands. It does this by exporting the entire frame buffer as one big pixmap that can be passed to regular Xlib rendering functions. This allows acceleration of lots of operations. While this is an optional feature, it's supported in all current DGA drivers. However, it's not available when Xinerama is enabled and so applications should check whether it can be used.

While DGA 1.0 smashed the core X mouse events replacing absolute positions with relative ones, DGA 2.0 takes the more reasonable approach of creating new event types to pass relative mouse motion events. Internally, this simplifies things greatly while also avoiding application interaction problems when sticking DGA code alongside non-DGA code.

DGA 2.0 also supports screen depth and size switching. This allows applications to switch to a supported mode instead of requiring the X desktop be restarted using the parameters necessary for that application. Xlib rendering is disabled when the depth doesn't match the X depth.

Conclusions

X provides a stable standard base for building and porting games for Linux. There are lots of useful features in the core protocol, but applications should take advantage of the pervasive XFree86 extensions to improve both performance and appearance.

Simple DirectMedia Layer

Description

What is it?

SDL is a free cross-platform multimedia development API.

  • Used for games

  • Used for game SDKs

  • Used for emulators

  • Used for demos

  • Used for multimedia applications

URL: http://www.libsdl.org/.

What can it do?

  • Video

  • Events

  • Threads

  • Timers

  • Endian independence

  • Audio

Video

  • Set a video mode at any depth (8-bpp or greater) with optional conversion, if the video mode is not supported by the hardware.

  • Write directly to a linear graphics framebuffer.

  • Create surfaces with colorkey or alpha blending attributes.

  • Surface blits are automatically converted to the target format using optimized blitters and are hardware accelerated, when possible.

  • MMX optimized blits are available for the x86.

  • Hardware accelerated blit and fill operations are used if supported by the hardware.

Events

Events provided for:

  • Application visibility changes

  • Keyboard input

  • Mouse input

  • User-requested quit

  • Each event can be enabled or disabled with SDL_EventState().

  • Events are passed through a user-specified filter function before being posted to the internal event queue.

  • Thread-safe event queue.

Threads

  • Simple thread creation API

  • Simple binary semaphores for synchronization

Timers

  • Get the number of milliseconds elapsed

  • Wait a specified number of milliseconds

  • Set a single periodic timer with 10ms resolution

Endian independence

  • Detect the endianness of the current system

  • Routines for fast swapping of data values

  • Read and write data of a specified endianness

Audio

  • Set audio playback of 8-bit and 16-bit audio, mono or stereo, with optional conversion if the format is not supported by the hardware.

  • Audio runs independently in a separate thread, filled via a user callback mechanism.

  • Designed for custom software audio mixers, but the example archive contains a complete audio/music output library.

Also: a complete CD-ROM audio control API.

What platforms does it run on?

  • Linux

  • Win32

  • BeOS

  • MacOS, MacOS X

There are also ports to Solaris, IRIX, FreeBSD, QNX, and OSF/Tru64, in varying stages of completion.

Linux

  • Uses X11 for video display, taking advantage of XFree86 DGA extensions and new MTRR acceleration for fullscreen display.

  • Uses the OSS API for sound.

  • Threads are implemented using either the clone() system call and SysV IPC, or glibc-2.1 pthreads.

Win32

  • Two versions, one safe for all systems based on Win32 APIs, and one with higher performance, based on DirectX APIs.

  • Safe version uses GDI for video display. High performance version uses DirectDraw for video display, taking advantage of hardware acceleration if available.

  • Safe version uses waveOut APIs for sound. High performance version uses DirectSound for audio playback.

BeOS

  • BDirectWindow is used for video display.

  • BSoundPlayer API is used for sound.

MacOS, MacOS X

  • Carbon and DrawSprockets are used for video display.

  • SoundManager API is used for sound.

  • Native pre-emptive thread support on MacOS X

OpenGL for Linux

Sources of OpenGL for Linux

  • Vendors (nVidia, E&S, 3DLabs, FireGL, HP)

  • Commercial (Xi Graphics, Metrolink)

  • Open Source (Mesa, Utah GLX, DRI)

OpenGL Implementations: Mesa

Mesa

  • An open/free implementation of the OpenGL API

  • Available for five years

  • Well established as the "OpenGL solution" for systems with no official OpenGL support otherwise

  • Modular and portable

  • Good conformance

  • Good performance

  • Originally only a software rendering library. Now being used for hardware acceleration

Mesa Software Rendering

  • X Window System

  • GGI (Linux)

  • SVGAlib (Linux)

  • BeOS

  • MGL

  • OpenStep

  • MacOS

  • MS Windows

  • OS-independent off-screen rendering

Mesa Software Rendering for X

  • The original Linux OpenGL solution

  • Entirely client-side; built on Xlib

  • Allows rendering in almost all display modes (monochrome to truecolor)

  • Remote display to any X server (doesn't need GLX)

  • Full featured, many OpenGL extensions

  • Slow rasterization

Mesa Hardware Acceleration

  • Rasterization Interface

  • Mesa + Glide

  • Utah GLX

  • DRI

Mesa + Glide

  • Original hardware acceleration for Linux

  • Mesa uses Glide for fast rasterization

  • Designed for single-context, full-screen apps

  • Simplistic hack allows for rendering into an X window (pixel copy)

OpenGL Implementations: Utah GLX

Mesa + Utah GLX

  • Based upon an open-source implementation of the GLX protocol

  • Original by Steven Parker of the U of Utah

  • Uses XFree86 3.3.6 and Mesa 3.2

  • Fairly broad hardware support

  • Goal was/is to merge drivers into DRI/XFree86 4.0

  • Open-source

  • URL: http://utah-glx.sourceforge.net

Utah-GLX Hardware Support

  • Matrox G200 and G400

  • ATI Rage Pro

  • Intel i810

  • NVIDIA Riva, TNT, GeForce

  • SiS 6326

  • S3 ViRGE

Utah-GLX

Pros:

  • Don't need to recompile X server

  • Simple setup (glx.so X server extension, libGL.so library)

  • Simple driver development environment

  • Useful performance level and feature set

Cons:

  • Very limited direct rendering support

  • Limited to XFree86 3.3.x and Mesa 3.2

  • No official release at this time

OpenGL Implementations: DRI

Direct Rendering Infrastructure (DRI)

  • An architecture for direct 3D graphics hardware support with XFree86 4.0 and Linux

  • Open source

  • Included in XFree86

  • URL: http://dri.sourceforge.net

Indirect versus Direct Rendering

Indirect Rendering:

  • Packets are sent to the X server for processing

  • X server accesses graphics hardware

Direct Rendering:

  • Client applications access graphics hardware

  • Must cooperate among multiple applications

Goals of the DRI

  • High performance - maximize potential of hardware

  • Flexibility - for a variety of hardware designs

  • Window multiplexing - multiple 3D windows

  • Portability - to other OSes and architectures

  • Secure - prevent malicious misuse

  • Robustness - don't crash or deadlock the system

  • Open-source - obvious benefits

Myths about the DRI

  • Drivers have to use Mesa: FALSE! Any OpenGL implementation could be used

  • Drivers have to be open-source: FALSE! It has a loadable module system

  • DRI development is closed: FALSE! Everything is done on the net

  • DRI doesn't support XXX: FALSE! DRI is an architecture that can be expanded

Architecture of a DRI Driver

  • 2D Driver

  • 3D Driver

  • Kernel Module

  • libGL

  • XFree86-DRI extension

  • GLX extension

DRI Hardware Support in XFree86 4.0 (and later)

  • 3dfx Voodoo3, Voodoo4, and Voodoo5

  • Matrox G200 and G400

  • ATI Rage 128, Radeon

  • Intel i810 and i815

  • 3Dlabs Oxygen

  • Sun Creator/Creator3D

Voodoo3 / 16-bit RGB, 16-bit Z buffer

  • Multitexture, paletted texture

  • Texture LOD bias extension

  • Voodoo5 - 32-bit RGBA, 8-bit stencil, 24-bit Z buffer

  • Hardware stencil operations

  • 2Kx2K textures, texture compression

  • Very high fillrate

Matrox G200, G400

  • G200 - single texture unit

  • G400 - multitexture

  • 16-bit RGB, 16-bit Z buffer

  • Software stencil and accumulation

ATI Rage 128

  • 16 and 24-bit RGB

  • 16, 24, 32-bit Z

  • Multitexture

  • Software stencil and accumulation

ATI Radeon

  • 16 and 24-bit RGB

  • 16 and 24-bit Z

  • Multitexture

  • TCL support under development

Intel i810

  • Inexpensive graphics integrated into motherboard chipset

  • 16-bit RGB and 16-bit Z

  • Software stencil and accumulation

  • Single texture unit

3Dlabs Oxygen

  • An early, experimental driver

  • Hardware transform and lighting

  • 32-bit RGB, 24-bit Z buffer

  • Single texture unit

  • Poor fill rate

Future DRI Work

  • The DRI is continuing to be enhanced

  • TCL Support in Mesa 3.5 and drivers

  • Better X/3D Memory Management

  • Multihead

  • OpenGL Conformance

Integrating X: GLX, Qt, Gtk+, Toolkits

Basic Steps for OpenGL Widgets

  • Instantiate the drawing area

  • Set up callbacks for initialization, resize, redraw, etc.

GLX

  • Choosing visuals - glXChooseVisual

  • Create a Window - XCreateWindow

  • Create the Context - glXCreateContext

  • Make it Active - glXMakeCurrent

  • Handling Events (Resize, Expose)

OpenGL with Qt

  • QGLFormat - like a GLX Visual, describes the framebuffer

  • QGLContext - GL context which may be bound to QGLWidgets

  • QGLWidget - the drawing area

Qt: Simple usage

  • Create a new class derived from the QGLWidget class

  • Implement the initializeGL(), resizeGL(), paintGL() methods

  • Instantiate the new class within your UI

Qt: Advanced usage

  • Use QGLFormat class to specify frame buffer attributes

  • Create one or more QGLContexts

  • Create one or more QGLWidgets

  • Explicitly manage binding of contexts to widgets yourself

OpenGL with GTK+

  • GtkGLArea - C bindings

  • GtkGLArea-- - C++ bindings (wraps GtkGLArea)

Example GtkGLArea usage

  • 1. Set up the attribute list. This is very much like the GLX interface:

      /* Non-const array: gtk_gl_area_new() takes a plain int pointer */
      int attribs[] = { GDK_GL_RGBA,
                        GDK_GL_RED_SIZE, 1,
                        GDK_GL_GREEN_SIZE, 1,
                        GDK_GL_BLUE_SIZE, 1,
                        GDK_GL_DOUBLEBUFFER,
                        GDK_GL_DEPTH_SIZE, 1,
                        GDK_GL_NONE };
           

  • 2. Check if GL supported:

      if (!gdk_gl_query()) printf("GL not supported\n");       
           

  • 3. Create the GL widget

      GtkWidget *glwidget = GTK_WIDGET(gtk_gl_area_new(attribs));    
           

  • 4. Set up the widget's signal handlers (callbacks for Redraw, Resize etc.)

  • 5. Example redraw function:

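      /* Expose handler: redraws the scene whenever the widget needs repainting */
      gint redraw(GtkWidget *w, GdkEventExpose *event)
      {
        /* Bind this widget's GL context; skip drawing if that fails */
        if (gtk_gl_area_make_current(GTK_GL_AREA(w))) {
          glBegin(GL_TRIANGLES);
          ...
          glEnd();
          /* Present the finished back buffer */
          gtk_gl_area_swapbuffers(GTK_GL_AREA(w));
        }
        return TRUE;
      }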
           

Toolkits

  • GLUT, PLIB, SDL

  • OpenInventor

  • OpenRM

  • Performer

  • OpenGVS

  • CrystalSpace (engine)

Benefits of Open Source

  • Broad testing coverage

  • Quick bug identification and repair (in many cases)

  • Anyone can develop new drivers or new OS support

  • Close user/developer community

  • Shared driver codebase: bug fixes, optimizations, new features are of benefit to all

Conclusion

  • DRI facilitates 3D hardware on Linux

  • Good hardware support and will just get better

  • Open source improves quality and acceptance

  • Hardware vendors decided to bet on Linux

  • IHVs and ISVs funded the development

  • Let IHVs know that you use their hardware on Linux

  • Let IHVs know you'll buy new products if they support Linux

References

Web Sites:

  • http://www.mesa3d.org - The home of Mesa

  • http://utah-glx.sourceforge.net - The development center for Utah-GLX

  • http://dri.sourceforge.net - The development center for the DRI

  • http://www.linux3d.org - My website with 3D references

Spatialized Audio with OpenAL

Overview

What is the OpenAL Audio System?

OpenAL (for "Open Audio Library") is a software interface to audio hardware. The interface consists of a number of functions that allow a programmer to specify the objects and operations in producing high-quality audio output, specifically multichannel output of 3D arrangements of sound sources around a listener.

The OpenAL API is designed to be cross-platform and easy to use. It resembles the OpenGL API in coding style and conventions, and uses OpenGL-like syntax where applicable.

OpenAL is foremost a means to generate audio in a simulated three-dimensional space. Consequently, legacy audio concepts such as panning and left/right channels are not directly supported. OpenAL does include extensions compatible with the IA-SIG 3D Level 1 and Level 2 rendering guidelines to handle sound-source directivity, distance-related attenuation and Doppler effects, as well as environmental effects such as reflection, obstruction, transmission, and reverberation.

Like OpenGL, the OpenAL core API has no notion of an explicit rendering context, and operates on an implied current OpenAL Context. Unlike the OpenGL specification, the OpenAL specification includes both the core API (the actual OpenAL API) and the operating system bindings of the ALC API (the "Audio Library Context"). Unlike OpenGL's GLX, WGL and other OS-specific bindings, the ALC API is portable across platforms as well.

Programmer's View of OpenAL

To the programmer, OpenAL is a set of commands that allow the specification of sound sources and a listener in three dimensions, combined with commands that control how these sound sources are rendered into the output buffer. The effect of OpenAL commands is not guaranteed to be immediate, as there are latencies depending on the implementation, but ideally such latency should not be noticeable to the user.

A typical program that uses OpenAL begins with calls to open a sound device which is used to process output and play it on attached hardware (e.g. speakers or headphones). Then, calls are made to allocate an AL context and associate it with the device. Once an AL context is allocated, the programmer is free to issue AL commands. Some calls are used to render Sources (point and directional Sources, looping or not), while others affect the rendering of these Sources including how they are attenuated by distance and relative orientation.

Implementor's View of OpenAL

To the implementor, OpenAL is a set of commands that affect the operation of CPU and sound hardware. If the hardware consists only of an addressable output buffer, then OpenAL must be implemented almost entirely on the host CPU. In some cases audio hardware provides DSP-based and other acceleration to varying degrees. The OpenAL implementor's task is to provide the CPU software interface while dividing the work for each AL command between the CPU and the audio hardware. This division should be tailored to the available audio hardware to obtain optimum performance in carrying out AL calls.

OpenAL maintains a considerable amount of state information. This state controls how the Sources are rendered into the output buffer. Some of this state is directly available to the user: he or she can make calls to obtain its value. Some of it, however, is visible only by the effect it has on what is rendered. One of the main goals of the OpenAL specification is to make OpenAL state information explicit, to elucidate how it changes, and to indicate what its effects are.

Our View

We view OpenAL as a state machine that controls a multichannel processing system to synthesize a digital stream, passing sample data through a chain of parametrized digital audio signal processing operations. This model should engender a specification that satisfies the needs of both programmers and implementors. It does not, however, necessarily provide a model for implementation. Any conformant implementation must produce results conforming to those produced by the specified methods, but there may be ways to carry out a particular computation that are more efficient than the one specified.

A Quick Tour

OpenAL provides two separate APIs: the ALC API, which is used for management of devices and for creation, destruction, and manipulation of AL contexts, and the core AL API, which operates implicitly on the current AL context only.
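
The split between explicit context management and an implied current context can be illustrated with a toy model. This is only a sketch: `toy_context`, `toy_make_context_current`, and `toy_listener_gain` are hypothetical names invented for this illustration, not OpenAL entry points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for this sketch; the real ALC and AL calls differ. */
typedef struct {
    float listener_gain;
} toy_context;

/* The implied "current" context that the core layer operates on. */
static toy_context *current_context = NULL;

/* Context-management layer: the context is named explicitly. */
static void toy_make_context_current(toy_context *ctx)
{
    current_context = ctx;
}

/* Core layer: no context argument; acts on whichever context is current. */
static void toy_listener_gain(float gain)
{
    assert(current_context != NULL);
    current_context->listener_gain = gain;
}
```

With two contexts `a` and `b`, calling `toy_make_context_current(&a)` followed by `toy_listener_gain(0.5f)` changes only `a`. The same core call issued after `toy_make_context_current(&b)` changes only `b`.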

OpenAL, like OpenGL, is not an explicitly object-oriented API and does not expose classes or other structured data types. However, like OpenGL (textures, display lists, vertex arrays), OpenAL has several classes of clearly defined objects that encapsulate some of the OpenAL state. OpenAL objects are the Listener, Sources, and Buffers.

Listener

In many ways, the listener is a proxy for the AL context. While contexts are controlled using ALC, the context-specific AL state is managed through the core API. Some of that state (state that is potentially expensive to change, and unlikely to change frequently) is set globally, but other parameters are set with respect to the listener (e.g. position, orientation, volume/gain). The listener's placement and orientation within the world coordinate system determine where sources are located relative to the listener.
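
How listener placement and orientation determine a source's relative location can be shown with a little vector arithmetic. The sketch below assumes the orientation is given as two perpendicular unit vectors, "at" (forward) and "up". The `vec3` type and all function names are hypothetical helpers for this illustration and are not part of the OpenAL API.

```c
#include <assert.h>

/* Hypothetical helper type for this sketch. */
typedef struct { float x, y, z; } vec3;

static vec3 vec3_sub(vec3 a, vec3 b)
{
    vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z };
    return r;
}

static float vec3_dot(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

static vec3 vec3_cross(vec3 a, vec3 b)
{
    vec3 r = { a.y * b.z - a.z * b.y,
               a.z * b.x - a.x * b.z,
               a.x * b.y - a.y * b.x };
    return r;
}

/*
 * Express a world-space source position in the listener's own frame.
 * Assumes "at" and "up" are unit length and perpendicular.
 * Result: x = to the listener's right, y = above, z = in front.
 */
static vec3 source_relative_to_listener(vec3 listener_pos, vec3 at, vec3 up,
                                        vec3 source_pos)
{
    float d = vec3_dot(at, up);
    vec3 right, offset, rel;

    assert(d < 1e-4f && d > -1e-4f);      /* at and up must be perpendicular */

    right  = vec3_cross(at, up);          /* listener's right-hand direction */
    offset = vec3_sub(source_pos, listener_pos);

    rel.x = vec3_dot(offset, right);
    rel.y = vec3_dot(offset, up);
    rel.z = vec3_dot(offset, at);
    return rel;
}
```

For a listener at the origin looking down the negative Z axis with Y up, a source at world position (1, 0, -2) comes out one unit to the right and two units in front.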

Sources

Sources in OpenAL are directional (but not oriented) cone emitters with a given position and velocity. Active (PLAYING) sources require an output channel (either as part of software mixing, or by allocating hardware resources). Sources bundle attributes that are applied during the processing, causing the provided sound samples to be modified accordingly before being sent to output devices. Sources are located in world space, and are meant to be used for generating spatialized sound. Sources that are meant to be omnidirectional have to have their attributes adjusted accordingly. In this as in other cases, the driver/implementation is expected to optimize.
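
A cone emitter can be pictured as an inner cone with full gain, an outer cone beyond which a reduced gain applies, and a transition zone in between. The following is a minimal sketch under stated assumptions: the linear blend in the transition zone and the parameter names are choices made for this illustration, and an actual implementation may shape the transition differently.

```c
#include <assert.h>

/*
 * Gain of a cone emitter toward a listener that sits off_axis_deg degrees
 * away from the source's direction. Cone angles are full opening angles.
 * The linear blend between the cones is an assumption of this sketch.
 */
static float cone_gain(float off_axis_deg, float inner_angle_deg,
                       float outer_angle_deg, float outer_gain)
{
    float half_inner = 0.5f * inner_angle_deg;
    float half_outer = 0.5f * outer_angle_deg;
    float t;

    assert(inner_angle_deg <= outer_angle_deg);

    if (off_axis_deg <= half_inner)
        return 1.0f;                      /* inside the inner cone */
    if (off_axis_deg >= half_outer)
        return outer_gain;                /* outside the outer cone */

    /* Transition zone: blend from 1.0 down to outer_gain. */
    t = (off_axis_deg - half_inner) / (half_outer - half_inner);
    return 1.0f + t * (outer_gain - 1.0f);
}
```

Setting both cone angles to 360 degrees makes the source omnidirectional in this model, which is the kind of attribute adjustment described above.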

Buffers

Buffers are objects used to store sample data, and attributes related to that sample data. Buffers require memory resources and possibly (depending on the format provided vs. the data format used internally) conversion and decoding. Some compressed formats are supported. Buffers for spatialized sound (i.e. rendered by playing them using a Source) are treated as mono sound (conversion from multichannel data will occur if necessary). It is recommended to provide mono data.

While any given buffer is not directly accessible to the application, and no guarantees are made with respect to preserving bit resolution or sample frequency (all of which depend on the driver's implementation and the configuration of the device/context), buffers can be arranged in queues for each source. This allows streaming without the use of callbacks, as well as a limited amount of runtime sequencing, conditional branching of sequences, or loop points.
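
Streaming without callbacks amounts to polling: the application periodically asks how many queued buffers have already been played, removes them from the front of the queue, refills them, and appends them again. The toy model below sketches that loop in plain C. Every name in it (`toy_queue`, `toy_stream_poll`, and so on) is hypothetical and stands in for whatever the real API provides.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4

/* Toy model of a per-source buffer queue. All names are hypothetical. */
typedef struct {
    int    ids[QUEUE_CAPACITY]; /* buffer ids, stored as a ring */
    size_t head;                /* index of the oldest queued buffer */
    size_t count;               /* buffers currently queued */
    size_t processed;           /* buffers at the front already played */
} toy_queue;

/* Append a filled buffer; returns 0 on success, -1 if the queue is full. */
static int toy_queue_push(toy_queue *q, int id)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY)
        return -1;
    q->ids[(q->head + q->count) % QUEUE_CAPACITY] = id;
    q->count++;
    return 0;
}

/* Stand-in for the mixer finishing the buffer it is currently playing. */
static void toy_queue_mixer_finished_one(toy_queue *q)
{
    assert(q != NULL);
    if (q->processed < q->count)
        q->processed++;
}

/* Remove one already-played buffer so the application can refill it.
 * Returns the buffer id, or -1 if nothing has been processed yet. */
static int toy_queue_pop_processed(toy_queue *q)
{
    int id;
    assert(q != NULL);
    if (q->processed == 0)
        return -1;
    id = q->ids[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    q->processed--;
    return id;
}

/* One polling step of a streaming loop: recycle every processed buffer.
 * refill is a hypothetical hook that loads the next chunk of sample data. */
static int toy_stream_poll(toy_queue *q, void (*refill)(int id))
{
    int recycled = 0;
    int id;
    while ((id = toy_queue_pop_processed(q)) >= 0) {
        if (refill != NULL)
            refill(id);
        toy_queue_push(q, id);
        recycled++;
    }
    return recycled;
}
```

A game would call something like `toy_stream_poll` once per frame. As long as the queue never runs dry between polls, playback continues without the audio layer ever calling back into the application.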

In future revisions, buffers will also be used to provide read access to the internal mixing buffer, and to implement capture (microphones). Buffers can be shared among contexts, but their lifetime is connected to that of the context in which they were created. It is possible that multichannel (stereo, MP3, Dolby) data will be handled using a different class of buffers, and that these multichannel buffers will not be accessible to sources.

Comparison to DS3D and I3DL2

OpenAL in the proposed version 1.0 implements most of the DirectSound3D functionality (distance attenuation, Doppler effect), but deviates in some details (no MUTE or CLAMP MAX_DISTANCE, a reference distance instead of MIN_DISTANCE, Doppler effects scaled by specifying the reference velocity and a pitch scaling factor). Reverberation and other extensions defined in I3DL2 (or as implemented in EAX) are not yet part of the specification, but are supposed to be supported by extensions. No geometry-based processing is included or planned at this time (as opposed to Aureal's A3D).
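
To make the role of a reference distance concrete, here is a small sketch of an inverse-distance rolloff. The formula and the `rolloff_factor` parameter are assumptions of this illustration (the text above does not state the attenuation formula), and treating closer distances as the reference distance is a simplification chosen here to keep the gain at or below 1.0.

```c
#include <assert.h>

/*
 * Linear gain of a source at the given distance from the listener.
 * reference_distance is the distance at which the gain is exactly 1.0.
 * The formula is an assumed inverse-distance model, not quoted from a spec.
 */
static float inverse_distance_gain(float distance, float reference_distance,
                                   float rolloff_factor)
{
    assert(reference_distance > 0.0f);
    assert(rolloff_factor >= 0.0f);

    /* Simplification of this sketch: no boost closer than the reference. */
    if (distance < reference_distance)
        distance = reference_distance;

    return reference_distance /
           (reference_distance +
            rolloff_factor * (distance - reference_distance));
}
```

With a reference distance of 1 and a rolloff factor of 1, the gain halves at distance 2 and drops to a quarter at distance 4. A rolloff factor of 0 disables distance attenuation altogether.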

Implementations and Drivers

OpenAL is a cooperative effort between Loki Software, Inc., and Creative Technology, Ltd. (Creative Labs). Loki is providing and maintaining a software reference implementation as open source, primarily aimed at Linux, while Creative has been working on drivers that employ the SoundBlaster line of products.

The draft of the proposed 1.0 specification (finalized in October 2000), additional documentation and references, source and FTP archives, and further information are available at the official OpenAL website, www.openal.org.

Status and Outlook

OpenAL is still in its infancy. Certain areas of the API (most notably handling of multichannel data) have to be reworked, and other areas (e.g. reverberation) have to be added. Release of feature-complete, robust drivers supported by common hardware is the next milestone for the project.

References

Online Version of Tutorial

The full set of tutorial slides and the final version of the supplementing documentation will be available online. While the authors will eventually maintain separate locations for their presentations, the LinuxGames web site has volunteered to offer a single mirror location for the duration of GDC and afterwards. The electronic materials will be available at http://www.linuxgames.com/gdc2001/ beginning March 20th.

Credits

The material presented here and in the final version of the presentations has been collected in collaboration with a large number of individuals, both within our respective companies, and outside of them. In particular, we acknowledge the tradition of knowledge sharing and free exchange of information in the free software and open source communities, without which Linux as a development platform would be much less transparent, and would in fact not even exist.