With Music In Mind: A Guide to Adaptive Audio for Game Designers
By Guy Whitmore
May 29, 2003
In this article, I'm going to explain ways that game designers can work more closely with composers to achieve a more integrated soundtrack for games. This is important because music is currently underutilized in most games, leaving plenty of room for design innovation that can translate into more game sales while delivering a more meaningful player experience. The following sections will show you, the game designer, how to incorporate music into your game design from the outset of a project, and how to follow through from initial concept to final implementation, leading to a highly immersive game score.
"Adaptive audio" is a term used to describe audio and music that reacts appropriately to - and even anticipates - gameplay. The term "interactive audio" is also used in this way, but it is used more broadly, and it has been misused and overused.
An analogy I like to use when describing the benefit of adaptive music relates to game graphics. In a sense, linear music is to pre-rendered animation as adaptive music is to real-time 3D graphics. What did games gain from game-rendered art assets? The ability to view objects from any side or distance, and the flexibility to create a truly interactive game environment. These graphical advances give gamers a more immersive and controllable environment, and adaptive music offers similar benefits. Currently most game music is "pre-rendered" - mixed in fairly large sections prior to being put in a game. In contrast, adaptive music is "game-rendered" - musical components are assembled by the game as it is played. This flexibility allows adaptive music to sync up with the game engine and become more integral to the action on screen.
The Spectrum of Adaptability
Many degrees of adaptability exist in music. On one end of the spectrum is linear pre-rendered music, and on the other is music that is completely game-rendered. Where on the spectrum a score lies depends on the game at hand, and aesthetic decisions made by the composer and game designer. There are many options for combining small pre-rendered assets (such as wave files) with assets that tend to be more flexible (such as MIDI files). Even within a particular game, different degrees of adaptability are called for, such as linear cut-scenes versus the ever-changing game environment.
No longer are there excuses for using detached, linear game scores. Even licensed music can be arranged to function adaptively. Orchestras can be recorded in a manner that permits adaptive arrangements. If a composer simply supplies long, linear musical pieces for a game, that composer is not "scoring" the game; they are just providing music that is in the correct genre and style. Imagine if a film composer did the same thing - created music that had nothing to do with the images and action on screen. That composer would be fired! Scoring to picture is half the art in film composing, and the same applies to game scores. The difference is that scoring games is an emerging art and is a new skill for most composers, and in many ways is a more complex task due to its nonlinear nature.
Integrating Music Into Your Game Design
The time for a game designer to begin thinking about game music is the moment the initial design process begins. As you begin envisioning the style of gameplay and the environments in which that game will take place, think about the aural aspects of the game, too.
Music dramatically affects the tempo and pacing of gameplay. Game designers who are aware of this can use music to their advantage in moving the story and heightening the emotional impact. Music that is not directly connected to gameplay elements can distract the player and literally pull them out of the environment. This is due to a sense of "disconnect" that the player feels when the music is inappropriate to the scene. This is true of all genres of games, whether they are action/adventure games, real-time strategy games, puzzle games or another genre. Even when music is not a core aspect of the gameplay, it needs to be tied to that core and support it.
When a game designer starts to think about the look of a game, ideas about style, color, and lighting arise. The game designer imagines what the game world might look like, and the moods that the visual landscapes will inspire. The same attention should be given to the audio landscape of the game. Audio affects the mood of any game in both profound and subtle ways, and it is often the subtle nuances that make or break a game score's effectiveness. The more clarity of intention a designer brings to bear on the sonic nature of the game, the more likely the score will support the game overall. A theater director once told me to treat the music as if it were another character on stage, adding its own personality to the performance, interacting with the other performers. This director understood the power of aural landscapes, and used them to propel his performances.
Another important aspect of understanding how audio affects an audience is to know that it often is working on a subconscious level. The visual aspect is often foremost in our mind. This does not mean that audio is not enhancing the player experience. On the contrary, audio has the opportunity to affect the experience in a stealth-like manner, and to affect how a user interprets the gaming experience. That is a powerful tool, and it is why focus groups inadequately measure the importance of game audio. Just as good audio can enhance a player's overall perception of the game, poor audio will drag that perception down.
An effective aural landscape enhances gameplay by reinforcing the overall mood and ambience of the game and accenting important gameplay events. Music drives the pace and momentum of gameplay and supports the visual aspect of the game by complementing the art direction. Subtle mood shifts in the score flow with the storyline as dramatic shifts in intensity follow the action of the gameplay. Finally, seamless music transitions connect the various moods and intensities, thus supporting the game's continuity while keeping the player "in the game".
The Music Design Document
Creative collaboration between the game designer and game composer is too often neglected. A good composer brings more to the table than composition chops. That individual brings ideas about how the score can best support the game, stylistic ideas, and dramatic techniques, and is aware of how, when, and where music is effective, and why. The game designer often has a broad vision for the music, and the composer can focus that vision and find specific solutions to creative and technical issues that arise. The collaboration between the designer and composer can inspire both people, and often results in a creative feedback loop. The collaboration also ensures that the game score is relevant to the game as a whole.
Creating a "music design document" is one way of turning the creative vision into a technical solution. A music design document begins by imagining how you would like the music to behave in the context of your game, and asking yourself questions like these:
These are the types of questions that will lead to an effective music design.
The music design document can be part of the game design document, and like the game design document, it should evolve as your vision of the game solidifies. The music design document should guide the process, and codify the ideas of your creative vision. Collaborating with a composer on the music design is as important as collaborating with an art director on the visual direction of the game. The major sections of a music design document should include these headings:
Adaptive Audio Technologies
Because adaptive music is at a relatively early stage in its evolution, there aren't many ready-made solutions for creating an immersive game score. Thus far, many companies are blazing their own trails by building proprietary interactive music engines and defining functionality based on the needs of specific games. Even off-the-shelf solutions such as DirectMusic need careful integration in order to function well. Therefore it's important to begin thinking about a technical solution to your music needs at the outset of your project.
There are many approaches to adaptive audio, and many more that haven't been thought of yet. Any of these ideas and approaches can be used individually or combined to suit the needs of the project at hand. Before a specific music system and engine can be chosen or created for your game, the needs of that technology must be decided upon. Here are some adaptive audio solutions that are already available:
Wave files. The choice between wave and MIDI files used to be an either/or affair, but now games can use the best of both these worlds. Pre-mixed wave files, despite their linear nature, offer high production values because any compositional technique, tool, or instrument can be captured and mixed into a wave file using state-of-the-art equipment. Additionally, wave files can be arranged into small flexible wavelets, boosting their adaptive potential.
TRON 2.0 (PC-Monolith/Disney) and No One Lives Forever 2 (PC-Monolith/Fox-Sierra) used wave technology in a highly adaptive manner. Each music state was divided into individual wave files of 1 to 4 measures in length, allowing seamless transitions between music states. The beginning of each wave file represents a transition point where the music can move to the next music state. The short waves overlap each other, or dovetail, creating a natural release and reverb decay during the state transitions.
MIDI files. MIDI files offer even more flexibility than wave files because the data is more granular. Each musical note is a piece of data that can be manipulated or shaped to suit the score. The tempo, harmonic content, and orchestration can be altered quickly, and transitions between sections are easier to create than with wave files. This places MIDI further down the spectrum of adaptability than wave files. In years past, it was difficult to create high-quality MIDI scores. Since then, we've learned that creating custom instrument banks tailored to the music at hand is the key. Each platform has a format for custom instrument banks (e.g., DLS for PC and Xbox, VAB files for PlayStation, and so on). Creating and utilizing custom instruments is a specialized skill, and it has been done poorly more often than skillfully. This is partly why MIDI scores get a bad rap in the game industry. All the interactivity in the world is meaningless if the instrument quality is sub-par. When done well, however, MIDI files with custom sounds offer the most dynamic music experience available with current technology.
As mentioned earlier, the choice doesn't have to be either waves on one hand or MIDI on the other. Wave and MIDI files can be combined to get the best of both worlds. One approach could be to use wave files as the main bed of music, and use MIDI files as musical accents and/or themes that layer over the wave files. This would let you take advantage of the quick response and harmonic flexibility of MIDI.
No One Lives Forever (PC-Monolith/Fox) and The Mark of Kri (PS2-Sony) are examples of highly adaptive scores using custom instrument banks and MIDI files.
Timing and synchronizing music cues. If your music design calls for music cues to be layered or sequential, the timing of the performances may be critical. Having the ability to start a cue on a specific rhythmic boundary of the underlying music will help the score sound musically coherent. Various cues can then be made part of a single piece of music rather than having disparate cues randomly collide with each other.
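As a rough illustration of starting a cue on a rhythmic boundary, the following Python sketch computes when the next boundary falls. It assumes a constant tempo and music that began at time zero; the function name `next_boundary` and its parameters are hypothetical and do not come from any particular music engine.

```python
import math


def next_boundary(now_s, tempo_bpm, beats_per_boundary=4):
    """Return the time in seconds of the first boundary at or after now_s.

    Assumes a constant tempo and that the underlying music started at 0.
    beats_per_boundary=4 means "wait for the next 4/4 measure";
    beats_per_boundary=1 means "wait for the next beat".
    """
    boundary_s = beats_per_boundary * 60.0 / tempo_bpm
    # Subtract a tiny epsilon so a request that lands exactly on a
    # boundary is scheduled for that boundary rather than the next one.
    boundaries_elapsed = math.ceil(now_s / boundary_s - 1e-9)
    return boundaries_elapsed * boundary_s
```

At 120 beats per minute a 4/4 measure lasts two seconds, so a cue requested five seconds into the music would be held until the six-second mark. A real score with tempo changes would need a tempo map rather than a single tempo value.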
Transitions between cues. Moving smoothly from one musical cue to another in a nonlinear game environment is tricky, but necessary to maintain the thread of gameplay. Transitions are the glue that add continuity to a game score, and hence to the game itself.
The timing and syncing issues just described apply here as well. Highly adaptive scores use a variety of different transition types depending on the specific scenario of the game. A transition could be silence between cues (one cue ends then the next begins), a cross-fade between cues, a direct splice between cues, a synced overlapping cue, or seamless transition music between cues. Games that are limited to the first two transition types will likely end up with a disjointed score, because the game is constantly fading music in and out; musical momentum is never maintained. A game score that uses exclusively linear music files with cross-fades is analogous to a modern game using Myst's (1993) visual technology! The result is a musical slide show. Cross-fades have their place, but other options are available to bridge the gap between primary music cues.
But creating seamless transitions between cues is no easy feat compositionally or technically, because the timing of a transition isn't known until run-time. So the music (and the music system) must be prepared to transition to the next cue at any moment, and it must do so in a musically satisfying manner. This is why transition boundaries and timing are so crucial.
Techniques for achieving seamless transitions include the following:
These approaches to music transitions are just some of the possibilities, and many more will be tried in years to come. Transition techniques can be combined and applied to MIDI and/or wave data. Musical style and direction will dictate which approach will work best for your game: layering may work well for a techno score, since that is often the approach DJs use to create their music, while a transition matrix works well for an orchestral score, allowing for sweeping, dramatic transitions.
This music clip moves through all six of the music states for the Ambush theme, demonstrating transitions called from a transition matrix.
This music clip demonstrates direct cue to cue transitions, some of which have the effect of instrument layering. Each cue increment represents a puzzle row being cleared by the player.
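One way to picture a transition matrix is as a lookup table keyed by the current and target music states, where each entry names a short piece of bridging music. The Python sketch below is only an illustration under that assumption; the state names, segment names, and the `plan_transition` helper are all hypothetical.

```python
# Hypothetical matrix: (current state, target state) -> bridging segment.
TRANSITION_MATRIX = {
    ("ambient", "suspense"): "ambient_to_suspense",
    ("suspense", "combat"): "suspense_to_combat",
    ("combat", "ambient"): "combat_to_ambient",
}


def plan_transition(current_state, target_state, matrix=TRANSITION_MATRIX):
    """Return the list of segments to queue, in playback order."""
    if current_state == target_state:
        return []  # already in the requested state, nothing to queue
    bridge = matrix.get((current_state, target_state))
    if bridge is None:
        # No bridging music was written for this pair, so fall back
        # to a direct splice into the target state.
        return [target_state]
    return [bridge, target_state]
```

The queued segments would still need to start on a musically sensible boundary, so a lookup like this works alongside the timing rules rather than replacing them.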
Harmonic systems. Altering the harmonic and chordal characteristics of a game score in real time via the game engine is largely uncharted territory. Systems exist to accomplish this, but few have attempted this with any depth. However, applying a few simple harmonic techniques can go a long way. One such technique is simply changing the key of a music cue to add variation or a change of mood. (This is one of the advantages of MIDI-type music files over waves.) Another is to track the harmonic changes of a music cue so that layered cues will play 'in harmony' with the primary cue.
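Both harmonic techniques can be sketched on plain MIDI note numbers (0 to 127, where 60 is middle C). The helpers below are hypothetical and deliberately simple: `transpose` changes the key of a cue, and `snap_to_chord` nudges a layered note onto the nearest tone of the chord the primary cue is currently playing. Resolving ties downward is an arbitrary choice made for this example.

```python
def transpose(notes, semitones):
    """Shift every MIDI note by the given number of semitones.

    Notes that would fall outside the valid MIDI range are dropped.
    """
    shifted = (note + semitones for note in notes)
    return [note for note in shifted if 0 <= note <= 127]


def snap_to_chord(note, chord_pitch_classes):
    """Move a note to the nearest pitch whose pitch class is in the chord.

    chord_pitch_classes is a set of values 0-11 (0 = C, 4 = E, 7 = G).
    When two chord tones are equally close, the lower one is chosen.
    """
    for offset in range(0, 7):  # a chord tone is never more than 6 away
        for candidate in (note - offset, note + offset):
            if candidate % 12 in chord_pitch_classes:
                return candidate
    return note  # empty chord: leave the note unchanged


C_MAJOR = {0, 4, 7}
accent = [snap_to_chord(n, C_MAJOR) for n in [61, 62, 66]]
```

Here the accent notes C#, D, and F# are pulled to C, C, and G so that they agree with a C major chord in the primary cue.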
Run-time mixing and effects. How a piece of music is mixed dramatically affects the music's impact. The mood of pre-mixed music is set in stone, but highly adaptive music systems can alter the volume and panning of individual instruments, changing the character of a music cue. One example is to raise the volume of the percussion instruments to add punch to a mix as the drama on screen amps up. Run-time effects such as reverb, delay, chorus, etc. may also be controlled by the music system. Adding delay to a drum groove will change its rhythmic feel. Or imagine fading down a music cue while bringing up the reverb, creating a distant ambient effect. Even if run-time effects are not used adaptively, they can help blur the seams between musical cues. The visual analogy is run-time lighting or shadowing effects. Those types of detail make an environment seem more alive.
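The percussion example might look something like the following sketch, which maps a game-supplied intensity value to per-track gains. The 0-to-1 intensity scale, the 6 dB maximum boost, and the track names are assumptions made for illustration only.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear gain multiplier."""
    return 10.0 ** (db / 20.0)


def track_gains(intensity, tracks, max_boost_db=6.0):
    """Return a linear gain for each track.

    intensity: 0.0 (calm) to 1.0 (maximum drama), supplied by the game.
    tracks: mapping of track name -> True if the track is percussion.
    Percussion is boosted in proportion to intensity; other tracks
    stay at unity gain.
    """
    intensity = min(1.0, max(0.0, intensity))  # clamp out-of-range input
    return {
        name: db_to_gain(max_boost_db * intensity if is_percussion else 0.0)
        for name, is_percussion in tracks.items()
    }


gains = track_gains(0.5, {"drums": True, "strings": False})
```

In practice the gain would be ramped over time rather than applied instantly, so that changes in intensity do not produce audible jumps in level.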
Musical variation. Variation within a game score adds replay value to the music and the game. Many possibilities exist with regard to music variation. At a high level, when a music cue is called by the game engine, the music system could randomly pick between several wave files. This type of variability prevents specific pieces of music from repeating ad nauseam (e.g., "There's that battle music again!"). At a lower level, you could vary musical cues at an instrument level. Each instrument could have several possible parts, one of which is randomly chosen at run-time. This gives a piece of music an organic, non-looping feel, and allows the music to play for longer periods of time without feeling repetitive.
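Both levels of variation can be sketched with Python's standard `random` module. The function and file names below are hypothetical, and skipping the most recently played option is an extra assumption added here to make immediate repeats less likely; it is not something the techniques above require.

```python
import random


def pick_variation(options, last_played=None, rng=random):
    """Pick one option at random, avoiding an immediate repeat if possible."""
    candidates = [o for o in options if o != last_played] or list(options)
    return rng.choice(candidates)


def assemble_cue(parts_by_instrument, rng=random):
    """Choose one part per instrument to build a varied cue at run-time."""
    return {
        instrument: rng.choice(parts)
        for instrument, parts in parts_by_instrument.items()
    }


# High level: choose between whole wave files for a battle cue.
battle = pick_variation(["battle_a.wav", "battle_b.wav"], last_played="battle_a.wav")

# Lower level: choose a part for each instrument.
cue = assemble_cue({
    "bass": ["bass_1", "bass_2"],
    "drums": ["drums_1", "drums_2", "drums_3"],
})
```

Passing a seeded `random.Random` instance as `rng` makes the choices repeatable, which can help when debugging a particular combination.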
The Music System
Once the music functionality and type of technology are decided upon, it's time to make decisions about the music system and engine. The first decision is whether to use an available engine or to create one from the ground up (or perhaps you're lucky enough to work for a company that has a proprietary system). The creative needs of the game score and the limitations of the target platforms must be kept in mind when researching these options. This is also an area where collaboration with an experienced composer can help. An "adaptive music arranger" can help decide what type of technology will work best for a project.
There are few 'off-the-shelf' solutions for adaptive audio. Microsoft's DirectMusic supplies a great depth of technology and interactive features. But with that depth comes a steep learning curve for composers hoping to take advantage of these features. An upcoming book on DirectMusic (edited by Todd Fay, published by Wordware) will assist composers and programmers in learning and implementing the technology. Other technologies, such as the Miles Sound System, supply an audio playback foundation (MIDI, Wave, MP3 playback) that adaptive functionality could be built upon.
Proprietary technology exists at companies such as Sony, Electronic Arts, and LucasArts. The advantage of proprietary technology is that it can be tailored to specific genres and games of the company, and because the company owns it, the source code and experienced programmers are on hand to customize the technology according to the needs of the composers and designers. The obvious disadvantage is that the technology is not usually available to those outside the company.
Adaptive audio technology is still in the nascent stages of development, despite years of work by dedicated composers and programmers. Many techniques have been explored in a wide variety of games, ranging from early arcade games to modern consoles. Yet a coherent language describing the techniques and technology has yet to arise, causing developers to reinvent the wheel over and over again. This is beginning to change. The IXMF standard (see the article by Linda Law in Gamasutra's 2003 Audio Resource Guide) promises to create a common adaptive-audio language and an industry-wide technology standard available to anyone. Also, the Adaptive Audio Now working group of the IASIG (Interactive Audio Special Interest Group, www.iasig.org) will provide a forum and community for composers, designers, and programmers to share ideas, tips, articles, and opinions regarding adaptive audio.
Integrating the Music System with the Game Engine
Regardless of the music system developed or used, it is how that system integrates with the game engine that ultimately determines its effectiveness. The most advanced music system and adaptive score will fall flat if it is not communicating well with the game engine. Communication between game engine and music system is crucial, and that communication is a two-way street. Not only must the game engine send commands to the music system, but the music system must tell the game engine its status.
There are various ways that this system integration takes place. Often, more than one technique is combined -- several "virtual connectors" between the two are used. There are two sides to this system: 1) what aspects of the music will be altered or changed by the game, and 2) how will the game trigger these changes in the music. The most common technique for changing the musical score is the use of music cues, but there are other musical aspects that can be altered by the game such as the music's volume, instrumentation, harmony, audio effects, tempo, muting/unmuting instruments, and layering themes or accents.
On the other side, the game engine can use various triggers to change the music. These types of triggers include location-based triggers, game-state triggers, NPC AI triggers, player character triggers, and event triggers. Each has its own advantages and usefulness. In deciding what types of triggers to use, first consider the nature of your gameplay, and how you'd like to emphasize and support that gameplay with music. This inquiry will lead to logical choices for music integration. As with other aspects of the music system, these trigger types can be combined and customized for your game's needs. In fact, it will take some trial and error to home in on the best parameters for the triggers. Don't give up after the first test of music integration. Successful integration may take some iteration.
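The two-way communication between game engine and music system can be sketched as a small interface: the game sends triggers, and the music system reports its status back. Everything in this Python example is hypothetical, including the class, the trigger kinds, and the cue names; it shows the shape of the integration, not any existing engine's API.

```python
class MusicSystem:
    """Toy music system that switches cues only on musical boundaries."""

    def __init__(self):
        self.current_cue = None
        self.pending_cue = None

    def request_cue(self, cue):
        """Game -> music: ask for a cue; it starts at the next boundary."""
        self.pending_cue = cue

    def on_boundary(self):
        """Called by the audio clock when a musical boundary is reached."""
        if self.pending_cue is not None:
            self.current_cue = self.pending_cue
            self.pending_cue = None

    def status(self):
        """Music -> game: report what is playing and what is queued."""
        return {"current": self.current_cue, "pending": self.pending_cue}


# Hypothetical mapping of (trigger kind, value) to music cues.
TRIGGER_TABLE = {
    ("location", "cave"): "cave_ambient",
    ("game_state", "combat"): "combat_high",
    ("event", "boss_defeated"): "victory_stinger",
}


def handle_trigger(music, kind, value, table=TRIGGER_TABLE):
    """Translate a game trigger into a cue request, if one is mapped."""
    cue = table.get((kind, value))
    if cue is not None and cue != music.current_cue:
        music.request_cue(cue)
    return cue
```

Keeping the trigger table as data rather than code is one way to support the trial and error described above, because the mapping can be adjusted without touching the game logic.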
The Music Must Adapt, And So Must We
Creating an immersive game score takes a coordinated effort. You need:
But despite these seemingly large hurdles, creating an effective adaptive score is relatively inexpensive. And the payoff is a score that will actually be a part of the game, as opposed to one that players may mute.
Yes, this is largely uncharted territory. But isn't exploring that territory what innovative game design is all about? Just as the flexibility of 3D graphics swept the industry in the mid '90s, adaptive scoring techniques are beginning to find their way into more and more games. As this happens, standards will arise, techniques will evolve and mature, and new technology will be created. Those designing games now have a chance to impact the direction and role that music will play in games.
Music has the potential to be more than window dressing for games. It can directly support gameplay and heighten emotion. With a healthy collaboration between game designer and composer, music can be an effective game design tool, helping to establish and reinforce the core game design. Active aural landscapes working in tandem with 3D visual environments can set the stage for truly immersive gameplay.
Copyright © 2003 CMP Media Inc. All rights reserved.