Book Excerpt: Audio For Games: Planning, Process, and Production - Integration DPM
May 11, 2005
I explained the power of a development process map (DPM) in Chapter 1. In this chapter I approach the DPM from the perspective of integration: the nitty-gritty of how sound, music, and voice are hooked into the game. Integration is important because actually writing music means nothing if the music cuts off when the player enters a conversation or if the music doesn’t loop properly. Integration can also make the music play adaptively, that is, change according to player actions. Let’s explore the fundamentals. Take a look at the integration DPM (Figure 8.1).
Your first decision must be how to organize all the files in the game. Each file represents an asset, whether a sound effect, music, or a voice-over line. The most important consideration in organizing files is the platform on which you will be developing, because different platforms handle files in different ways. First, however, let’s look at the ground rules that apply to all platforms:
- Each sound file is like an audio CD. You typically store CDs somewhere different from where you play them. Sound files are stored with the rest of the game’s data—usually on a hard drive, a CD, or a DVD.
- For the sound files to play, they must be moved out of storage to a location that is able to play them—the equivalent of the audio device on which you play your CDs.
- The fastest way to play sound files is usually to access them from memory. Random access memory (RAM) can have sounds placed in it and removed; read-only memory (ROM) cannot be changed once things are placed in it. The entire file is copied from the storage area into RAM, then activated.
- A slower way to play files is streaming. Streaming takes data from storage and copies small chunks of a file into RAM one chunk at a time. When the game code triggers a chunk of data to play, that data is removed from RAM and the next chunk is lined up behind it to play immediately afterward. Understandably, this process is slower. Imagine if the first 30 seconds of a song were on one CD and you had to swap it out for a second disc when you wanted to hear the next 30 seconds. The process of playing a streamed file isn’t this cumbersome, but it does take longer to initially load the file for playback.
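The difference between the two playback paths above can be sketched in a few lines of Python. This is only an illustration, not how any console's sound engine is written: the names `load_to_memory`, `stream_chunks`, and `CHUNK_SIZE` are hypothetical, and an in-memory byte buffer stands in for the disc.

```python
import io

CHUNK_SIZE = 4096  # hypothetical chunk size in bytes


def load_to_memory(storage):
    """Copy the entire file from storage into RAM before playback."""
    return storage.read()


def stream_chunks(storage, chunk_size=CHUNK_SIZE):
    """Yield the file one small chunk at a time, as streaming does."""
    while True:
        chunk = storage.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Stand-in for a sound file in storage: 10,000 bytes of silence.
fake_file = bytes(10_000)

# Memory playback: the whole file is resident at once.
resident = load_to_memory(io.BytesIO(fake_file))

# Streaming: the file arrives in pieces no larger than CHUNK_SIZE.
chunks = list(stream_chunks(io.BytesIO(fake_file)))
largest_chunk = max(len(c) for c in chunks)
```

The memory path holds all 10,000 bytes at once, while the streaming path never needs more than one chunk in RAM at a time. The price is that every chunk has to be fetched from storage before it can play.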
|Figure 8.1 This integration DPM shows all aspects of how sound, music, and voice are hooked into the game.|
The two most effective ways of organizing sounds based on platform are sound banks and metafiles.
As of this writing, the platform that has been around the longest and is still in current development is the PlayStation 2. Compared with the Xbox and GameCube consoles, it has the smallest amount of RAM and the least processing power that can be devoted to audio. As I explain in the next section, it is necessary to use sound banks with this platform, as well as with platforms such as the Nintendo GameCube and portable devices such as the Game Boy Advance.
Why continue to develop on the PS2 if it lags so far behind? There are more than 60 million PS2 units in the United States, and the platform has a large market share. Also, just because a platform isn’t on the cutting edge of technology doesn’t mean great games can’t be developed for it. Besides, the next revision of gaming hardware is always around the corner—the PlayStation 3 isn’t far off.
Sound Bank Methodology
The way you organize a game’s files depends primarily on the platform on which you’re developing. For the PlayStation 2, a sound designer or composer saves the files in sound banks before the sound engine code loads them into the game’s memory. Sound banks are groups of sounds that you can organize however you want. I’ll get into the details of sound objects, or “containers,” in the next section, but for now think of a sound bank as a box in which you organize sounds. Why use banks in the first place? Why not allow access to all the files at once? Because all the files together would take up too much space to be stored in the PlayStation 2’s memory. To make things easier, banks are used to load into RAM only the files that need to make sounds at the point in the game the player is actually playing. Sounds that don’t occur until later in the game are kept in storage (on a hard drive for the Xbox, a CD for a PC, and a DVD for all platforms except the GameCube).
Before you bank sounds, you should know the requirements of your platform. On the PlayStation 2, sound data is limited to about 2 MB of RAM. This means that no matter what, no more than 2 MB of sound can be played back at the same time. This is fairly constricting, especially for the parts of the game where you want a lot of sounds to play at once.
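A quick budget check makes the 2 MB constraint concrete. The sketch below is an assumption-laden illustration: the bank names and sizes are invented, and `headroom_kb` is a hypothetical helper, not part of any PlayStation 2 toolchain.

```python
SOUND_RAM_KB = 2 * 1024  # about 2 MB of sound RAM, per the text


def headroom_kb(resident_kb, budget_kb=SOUND_RAM_KB):
    """Return how many KB remain after the resident banks are loaded."""
    return budget_kb - sum(resident_kb.values())


# Hypothetical banks that must be resident at the same moment (sizes in KB).
resident = {"weapons": 512, "ambience": 768, "voice": 640}

remaining = headroom_kb(resident)  # 2048 - 1920 = 128
fits = remaining >= 0
```

A negative result means the scene asks for more sound than can be resident at once, so something has to be cut, compressed further, or streamed.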
Once you have an idea of what sounds will be required, you can establish how many sounds will play back at once by reading the game’s design document. This document explains what sounds are needed and when, and you use it to fill out your sound asset spreadsheet. You can also use the asset list to specify which sounds go into which banks. For example, suppose that when a player in the game uses a particular weapon, the game engine code swaps the sounds the weapon makes with those of another weapon (each weapon sound set being represented in a bank). The audio engineer can glean this from the design doc when he reads “Player 1 can switch between weapons.” You’ll use this kind of information as a basis for organizing banks.
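The weapon-switching example can be pictured as a slot that holds one bank at a time. The following is a minimal sketch under that assumption. `BankSlots` and the bank names are hypothetical, and a real engine would also load and free the underlying sound data.

```python
class BankSlots:
    """Tracks which bank currently occupies each named slot."""

    def __init__(self):
        self.slots = {}

    def swap(self, slot, bank_name):
        """Put bank_name in the slot and return the bank it replaced."""
        previous = self.slots.get(slot)
        self.slots[slot] = bank_name
        return previous


banks = BankSlots()
first = banks.swap("weapon", "pistol_bank")     # nothing was loaded before
replaced = banks.swap("weapon", "shotgun_bank")  # pistol sounds are swapped out
current = banks.slots["weapon"]
```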
Next, ask the lead programmer what bank size would work best for the game. This is an important task that can be included in the “Define platforms” section of the audio DPM, under “Buffer size/priority scheme.” A buffer is the amount of memory needed to load a bank of sound effects. This will vary based on what the sound engine is doing, such as streaming or prioritizing (both of which I’ll cover shortly). Most of the time programmers are comfortable with a bank size much smaller than 2 MB. This is because a bank represents sounds loaded according to priority. For example, if a player is using one weapon at a time in an action game and all the weapon sounds are in one bank, the unused weapon sounds are taking up memory that could be used for other things such as environmental sound effects or voice. Using smaller banks gives greater control because it is a more efficient use of RAM, and it lets you prioritize groups of sounds at different points in the game according to the importance of playback. For example, sounds like voice are more important than distant explosions, and a bank system can reflect this priority. Then, you might ask, why not just have one bank for each sound and prioritize sounds individually? The answer to this is CPU processing time: It would be too costly for the CPU to prioritize each sound in and out of memory. Each file takes a bit of processing power to load; when many are loading at once, processing power is eaten up quickly, leaving little room for other components such as graphics.
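One way to picture bank-level prioritizing is a loader that fills the buffer with the most important banks first and skips any bank that no longer fits. This is a simplified sketch, not the scheme any particular engine uses. The `Bank` class, the sizes, and the convention that a lower number means higher priority are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Bank:
    name: str
    size_kb: int
    priority: int  # assumption: lower number = more important


def choose_banks(banks, budget_kb):
    """Load banks in priority order, skipping any that would exceed the budget."""
    loaded, used = [], 0
    for bank in sorted(banks, key=lambda b: b.priority):
        if used + bank.size_kb <= budget_kb:
            loaded.append(bank.name)
            used += bank.size_kb
    return loaded, used


# Hypothetical banks for one section of an action game.
candidates = [
    Bank("distant_explosions", 800, 4),
    Bank("voice", 600, 1),
    Bank("ambience", 700, 3),
    Bank("weapon_pistol", 300, 2),
]

loaded, used_kb = choose_banks(candidates, budget_kb=2048)
```

With these numbers, voice, the pistol bank, and ambience are loaded (1,600 KB in total), and the distant explosions are left out because they would push the total past 2,048 KB. Only four priority decisions are made here, rather than one for every individual sound file.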
Here's an example of a sound bank structure for a PlayStation 2 football game (Figure 8.2). Note that the compressed files are about one-fourth the size of the uncompressed files.
|Figure 8.2 This sound bank structure for a PlayStation 2 football game shows how banks are constructed and what the relative file sizes are.|
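The one-fourth relationship between compressed and uncompressed sizes makes it easy to estimate how much a bank shrinks. The sound names and sizes below are invented for illustration, and the fixed 4:1 ratio is taken from the approximate figure in the text, so actual results will vary.

```python
COMPRESSION_RATIO = 4  # "about one-fourth the size", per the text (approximate)


def compressed_kb(uncompressed_kb, ratio=COMPRESSION_RATIO):
    """Estimate the compressed size of a sound from its uncompressed size."""
    return uncompressed_kb / ratio


# Hypothetical football-game bank, uncompressed sizes in KB.
bank = {"whistle": 120, "crowd_cheer": 800, "tackle": 200}

total_raw = sum(bank.values())
total_packed = sum(compressed_kb(kb) for kb in bank.values())
```

Under this assumption, a 1,120 KB bank drops to roughly 280 KB, which leaves far more of the sound memory free for other banks.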
In the case of a PlayStation 2’s sound memory structure, sound data is usually compressed to one-quarter of its original size using a format called VAG. You can use Sony’s proprietary program for developers, VagEdit, to convert your WAV files into VAG files. This is a great way to combat that 2 MB limitation for memory space on the PlayStation 2. Let’s take this a step further and add to the bank structure that list of priorities we talked about earlier (Figure 8.3).
|Figure 8.3 This sound bank structure shows bank priorities added.|
As you can see, the “Bank priority” column tells the programmer that the banks can be loaded at certain points in the game. Having the priority listed in the bank structure list (which can double as an asset list) ensures that the sound team and programming team are aware of what sounds will be played and when.
Now we have our sound asset list, our bank structure, and our priorities. What are we missing? Remember how all platforms provide the option to stream as well as to use memory? Adding streaming as an option is the final step in prioritizing our files. Take a look at the finishing touch, the Streamed check box, in our bank management spreadsheet (Figure 8.4). Note that if streaming isn’t chosen, the sound must be loaded into memory. When choosing between the two methods, keep the following in mind: It takes longer to load files that are streamed from something like a DVD than it does to load files that are in RAM. The length of time it takes to load a file to play is called latency (as you may remember from the DPM). Music files, for example, don’t usually need to be triggered as quickly as a weapon sound, which must be linked to a visual event like the firing of a gun. To keep latency down where it matters, weapon sounds are prioritized to RAM, and music is prioritized to streaming.
|Figure 8.4 The sound bank structure has a streaming option added as a check box.|
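The spreadsheet's Streamed column can be thought of as a flag on each row that tells the engine which playback path to use. The rows and field names below are hypothetical and only mirror the idea of the check box. They are not the contents of Figure 8.4.

```python
from dataclasses import dataclass


@dataclass
class SheetRow:
    bank: str
    priority: int
    streamed: bool  # mirrors the "Streamed" check box


# Hypothetical rows: latency-sensitive banks stay in RAM, music is streamed.
rows = [
    SheetRow("weapons", 1, False),
    SheetRow("footsteps", 2, False),
    SheetRow("music", 3, True),
]


def split_by_playback(rows):
    """Separate banks loaded into RAM from banks streamed from storage."""
    ram = [r.bank for r in rows if not r.streamed]
    streamed = [r.bank for r in rows if r.streamed]
    return ram, streamed


ram_banks, streamed_banks = split_by_playback(rows)
```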
Now we're ready for integration.
--Excerpted from pp. 167-172 of Audio for Games: Planning, Process, and
Production (ISBN 0-7357-1413-4) by Alexander Brandon. Copyright © 2005. Used
with the permission of Pearson Education, Inc. and New Riders.