
Creating an Interactive Audio Environment
Audio in today's interactive entertainment media has progressed far beyond the
bleeps of early video games. An object or an environment within a game
exhibits a number of complex relationships. A creature may be surprised
to see you. A robot's gears get stuck when it tries to move toward you.
A diabolical enemy is afraid of the dark. When encountering these elements
in a game environment, we expect them to communicate to us through audio
in subtle and different ways. Aspects of emotion such as surprise, frustration,
admiration, and fear could easily be conveyed through an enhanced and
well thought-out object vocabulary.
Our lives are full of an ever-present collage of audio cues that we take
for granted. For example, at this "quiet moment," I can hear the cascading
sound of a fountain in a pond, the intermittent quacking of ducks and
geese, a baby in the background, someone pouring a bucket of water outside,
and a plane flying overhead. All of these cues, though subtle and seemingly
unimportant, create the ambience of a particular scene, imbuing it with
identity and significance. Without these background sounds, or ambiences,
our lives would sonically resemble a lunar landscape. A collection of
sound cues such as this within a game environment reflects the noncausal
relationship of a player to the game. The sound space isn't triggered
by the player's direct action. Instead, the sound is affected by and reacts
to the environmental aspects of the scene that is being conveyed.
When we go to a movie, our emotional response is directly related to the
music. The music swells, our anticipation grows, and our adrenaline rushes.
The music ebbs, and we feel a calming sensation. This is very easy to
convey in a linear medium, where the ending and the progression of events
in a movie are predetermined; but how do we compose a soundtrack to a game
if it can follow many paths and endings? An adaptive soundtrack that responds
well to game events is one of the best ways to envelop the player in a
game experience.
Audio Object Vocabulary
An audio object vocabulary is a method by which game objects (not necessarily just
speaking ones) talk to each other and the player. The methods of communication
vary from object to object and from context to context. There are three
types of object interaction: direct, indirect, and environmental.
Direct Communication
An object communicates directly as a result of a direct action on its part.
When the ball hits the paddle in the old arcade game Pong, it makes
a bleep. This is direct object interaction. Unfortunately, most games
haven't explored far beyond this simplistic level of object interaction.
Direct communication is important when you want to convey specific audio
cues, such as a scream of pain when you shoot a monster, or the creak
of a wooden rocker when you push back a rocking chair. In Monolith Productions'
Claw, I found it important that every character have something different
to say when you interact with him or her (or it), even if it's in combat.
For example, a melodramatic character, while dying, would say "I'm dying...
I'm dying... I'm dying... I'm dead," with an animation to suit. A more
primitive character would emit a squawk, and a more substantial enemy
would yell out, "I curse you, Claw" as he falls to his death. When you
hit a lounge-lizard-turned-palace-guard merman, he would say, deadpan,
"Ouch that hurt quite a bit."
As always, a variety of audio cues is paramount in ensuring that a set
of quotes doesn't become repetitive. From a programming standpoint, that
may require a bit more intelligence in picking the quotes. A buffer with
an index to the most recently used quotes helps a lot because it shields
the player from experiencing the same "random" set of sounds in rapid
succession.
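The recently-used buffer described above can be sketched roughly as follows. This is a minimal illustration, not code from any of the games mentioned; the class name `QuotePicker`, the `history_size` parameter, and the quote strings in the usage note are all hypothetical.

```python
import random
from collections import deque


class QuotePicker:
    """Pick random quotes while skipping the most recently played ones."""

    def __init__(self, quotes, history_size=2, rng=None):
        if history_size >= len(quotes):
            raise ValueError("history_size must be smaller than the number of quotes")
        self.quotes = list(quotes)
        # Indices of recently used quotes; the oldest entry drops out automatically.
        self.recent = deque(maxlen=history_size)
        self.rng = rng or random.Random()

    def next_quote(self):
        # Only quotes that are not in the recent-history buffer are eligible.
        candidates = [i for i in range(len(self.quotes)) if i not in self.recent]
        index = self.rng.choice(candidates)
        self.recent.append(index)
        return self.quotes[index]
```

With four hit quotes and `history_size=2`, for instance, `QuotePicker(["squawk", "ouch", "curse", "groan"]).next_quote()` never returns a quote that was one of the previous two played, so the player doesn't hear the same line twice in quick succession.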
Indirect Communication
An object communicates indirectly when an event the player causes in the
game makes something else respond sonically. A
typical example of this is a "sighting" state for an enemy. When an enemy
sees you, and his or her AI changes, a sonic cue that signifies that change
may be appropriate. In Monolith's Blood, for example, cultists who spot
the player scream a series of epithets in a terrifying foreign language
(created for the game drama) (Figure 2). In Claw,
every enemy has something different to say in the "sighting" state. A
female boss taunts Claw in a mildly suggestive manner when they come into
contact. A goofy bear sailor exclaims "I don't like you" when he sees
a player.
Other sonic cues may convey indirect object interaction. Your character
may begin breathing heavily when he or she is tired (health is less than
some coefficient). Your metal body suit emits a rubbing, squeaky noise
that signifies rusting. In addition to sonic cues that help convey complex
visual phenomena, certain characters within the game display behaviors
that can be conveyed easily through sonic cues, even if they aren't represented
visually. Indirect cues can be based on a number of different motivating
factors, the rules of which can be determined at the game design stage.
For example, in Blizzard's Warcraft II, clicking on an ax-throwing
troll more than once causes it to respond with annoyance, even though
no animation is being shown. This is highly effective character enhancement.
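Rules like these are simple to express once they're decided at the design stage. The sketch below illustrates two of the cues mentioned above: heavy breathing below a health threshold, and an annoyed response to repeated clicks. The threshold values, cue names, and the `ClickResponder` class are assumptions made for illustration, not taken from any of the games discussed.

```python
LOW_HEALTH_THRESHOLD = 0.25  # assumed fraction of full health
ANNOYED_AFTER_CLICKS = 3     # assumed number of repeated clicks


def breathing_cue(health, max_health):
    """Return the cue to loop when the character is tired, else None."""
    if health / max_health < LOW_HEALTH_THRESHOLD:
        return "heavy_breathing"
    return None


class ClickResponder:
    """Escalate a unit's spoken response when it is clicked repeatedly."""

    def __init__(self):
        self.clicks = 0

    def on_click(self):
        self.clicks += 1
        if self.clicks >= ANNOYED_AFTER_CLICKS:
            return "annoyed"
        return "acknowledge"

    def on_deselect(self):
        # Selecting something else resets the unit's patience.
        self.clicks = 0
```

Neither rule needs an animation to go with it; the returned cue name is all the sound system needs to look up a sample.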
Environmental Communication
A character or object in the game may generate a system of audio cues
on its own, irrespective of its communication to the player. This is purely
a function of a character's existence in its environment. It may be busy
chatting to itself or other characters. It may generate a sound or a series
of sounds on its own. Our goofy bear sailor from Claw will comment
on how hungry he is or where his pet rat might be when he's in an idle
state (Figure 3). Depending on where he is in the game, Caleb (the character
you play in Blood) may pick from a variety of different show tunes
to sing while he's taking a break from the carnage. A thespian tiger from
Claw recites different Shakespearean passages as he muses on his
own omnipotence.
Environmental communication need not be comic, nor does it need to be
vocal. A swishing blade and a humming motor signify an industrial
fan in Blood, while a phone may ring intermittently. A character
may pass by an alien hive, with pods emitting a terrifying whine.
Environmental communication is paramount in reinforcing a character or
object's existence in the game environment. The character literally comes
alive as a personality or physical entity. But as with all different types
of object interaction, it's important to remember to keep a consistent
set of sounds from character to character. In Claw, I made a decision
to use three different idle cues (environmental communication), four different
sighting cues (indirect communication), and eight or nine sounds
(direct communication) to describe each character sonically. In the end,
most characters used more than that average and some used fewer. However, planning
the audio object vocabulary ahead of time helped to maximize the use of
memory allotted to sound in the game.
Character Development
The nature of a game object must be relayed in the character of its "voice." It's
very easy to screw up the integrity of a character by giving it mismatched
visual and aural personalities. However, giving it the right "voice" can
greatly enhance a character's personality. A weak character may be depicted
through the use of a humorous voice. A stronger character's dramatic persona
can be highlighted through the use of a deeper and more resonant voice,
as well as a script that conveys his or her authority without question.
Tips and Techniques
- Always use professional voice actors. Trained voice actors are professionals
who specialize in giving your character the voice it deserves. Whether
it is a cartoony or a deep resonant voice, a single talented actor may
help develop your ideas for multiple characters and realize them in
ways that you haven't conceived. When in doubt about how to find a good
voice actor, look to talent agencies and talent search services for
help. Moreover, making a trained voice work with the rest of your mix
is quite a bit easier than trying to amplify or equalize a weak voice.
All sound engineers can attest to this.
- Spend a little more time on sound design. Don't just pull
sound effects off of a CD. Create sound effects from your own sampled
sounds as much as possible. A portable DAT recorder and a good microphone
in the field will take you much further than a commercial CD sound library
ever could. Nothing kills a unique audio environment more quickly than
the phrase, "I've heard that somewhere else before...."
- Collaborate with professional scriptwriters. Writers would jump at the opportunity
to write a couple of hundred lines of dialogue for some game characters.
The results will definitely be worth the investment.
- Don't be afraid to inflate the vocabulary. Minimize silent time. If you have
the space for audio, use it. Set the limit with the programmers and
designers early as to your memory budget for audio, and use it wisely.
Ambient Sound
Ambient sound refers to the sound world that is generated from a player's location
in the game space. It is a system of indirect and environmental cues that
immerse a player in a particular setting. As in my real-world example,
we are surrounded by ambience all of our lives - a complex web of sound.
However, ambience is the most underdeveloped side of sound design in interactive
media. A game with little or no ambient sound presents little or no connection
to how we perceive the outside world with our ears. An ambient sound world
might be as simple as a single looping track of forest sounds or a system
of sound-producing objects all linked together by their location within
a given game environment.
The environment can communicate to the player information important for
the game-playing experience. For example, a raven flies by in a forest,
making a screeching sound that informs the player that he or she has ventured
too far. A swamp makes a menacing gurgling sound, informing the player
that he or she shouldn't go there. The sound of a portal opening and closing
in the distance informs the player that he or she is close to the level's
exit.
Environmental ambiences fully transport a player into the world presented
by the game. In Claw, each level has a distinct set of ambient
sounds based on the terrain that the main character is encountering. Within
a terrain, a single (environmental) looping sound is used (such as the
sound of a forest), along with a set of sounds (indirect cues) that are
triggered either by Claw's location on the map or by random chance. For
example, the sound of a character whistling in a window matches the animation
of the character shaving and the background ambience of village noise.
When Claw moves through another terrain, the looping ambient sounds would
cross-fade, and another set of ambient trigger sounds would be selected
that corresponds to the new terrain.
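The cross-fade between two looping terrain ambiences can be sketched as a pair of gain curves. The example below uses an equal-power curve, which is a common choice but an assumption here; the article doesn't say which curve Claw used, and the function name and parameters are hypothetical.

```python
import math


def crossfade_gains(elapsed_seconds, fade_seconds):
    """Return (outgoing_gain, incoming_gain) for two looping ambiences.

    Uses an equal-power curve so the combined loudness stays roughly
    constant while one terrain's loop fades out and the next fades in.
    """
    # Clamp so callers may keep asking after the fade has finished.
    progress = min(max(elapsed_seconds / fade_seconds, 0.0), 1.0)
    outgoing = math.cos(progress * math.pi / 2)
    incoming = math.sin(progress * math.pi / 2)
    return outgoing, incoming
```

Each audio update, the engine would multiply the old terrain's loop by the first gain and the new terrain's loop by the second; once the fade is over, the old loop can be stopped and the new terrain's set of trigger sounds takes over.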
In Blood, I used ambience to enhance the atmosphere, as well as
to connote physical environments. In a temple, distant chanting is heard
(though the source of the chant is never discovered) (Figure 4). In a
narrow hallway, whispers surround the player from all sides. The inclusion
of atmospheric elements adds to the spooky and scary nature of the game's
look and feel.
Tips and Techniques
- Try to use consistent reverb settings. All sounds within a given environment
should have a similar set of reverb settings that place the entire sound
world within a consistent acoustic space. Foreground and background
elements may stand out from the ambience, but not so much that the
player mistakes them for characters or objects that he or she must
encounter.
- Make your loops seamless. The looping ambiences in the game need to be smooth
and unnoticeable. Large variations in pitch or amplitude will make the
loop quite recognizable and annoying after a while. A rhythmic pattern
works well (like the sound of crickets), if it's cut perfectly. Also,
a longer sound sample will help mask the loop point.
- Avoid loops. Though seamless loops are not an impossibility, it's best to
use trigger ambiences whenever possible. Trigger ambiences help mask
the loop point, as well as provide overall variety in the ambience.
In Kesmai's Air Warrior, I used trigger ambiences to convey the
sound world of a World War II airfield. At any given time, an airplane
fly-by sound, a vehicle drive-by sound, and an airplane startup sound
would be selected and played from a set of 50 or so trigger ambiences.
Since these trigger ambiences were selected randomly and played at random
times, the sound world was always changing and seldom repetitive. Another
method of avoiding loops is to queue similar sounds one after another.
A set of three or four sounds that fit seamlessly end-to-end will work
well if they are selected to play on a single channel randomly. This
helps break up the pattern created by a single looping sound.
- Try to create fine gradations of ambiences. Say we're walking from a forest
into a mountain pass. We start out in a deep forest then walk through
a leafy forest then into a meadow before reaching the mountain pass.
If we have a single sound for the forest ambience, no matter how the
forest changes, the ambience will remain the same until we change scenery
drastically when we reach the mountain pass. However, if we subdivide
the forest into three gradations (deep, leafy, meadow), we'd be better
able to convey to the listener the transition of environments from forest
to mountain pass.
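The trigger ambiences described in the "Avoid loops" tip, randomly selected sounds played at random times, can be sketched as a small scheduler. This is only an illustration: the function name, the gap parameters, and the rule against repeating the same sound twice in a row are assumptions, not details of Air Warrior.

```python
import random


def schedule_triggers(sounds, total_seconds, min_gap, max_gap, rng=None):
    """Return a list of (start_time, sound) events for trigger ambiences.

    Sounds are picked at random, separated by random gaps, and the same
    sound is never scheduled twice in a row when there is an alternative.
    """
    rng = rng or random.Random()
    events = []
    previous = None
    t = rng.uniform(min_gap, max_gap)
    while t < total_seconds:
        choices = [s for s in sounds if s != previous] or list(sounds)
        sound = rng.choice(choices)
        events.append((t, sound))
        previous = sound
        t += rng.uniform(min_gap, max_gap)
    return events
```

Calling `schedule_triggers(["flyby", "driveby", "startup"], 600.0, 5.0, 20.0)` would lay out ten minutes of airfield ambience with a new sound every 5 to 20 seconds. A real engine would more likely pick the next sound on the fly than precompute a schedule, but the selection logic is the same.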
Adaptive Music
The nonlinear medium of computer gaming can lead a player down an enormous number of
pathways to an enormous number of resolutions. From the standpoint of
music composition, this means that a single piece may resolve in one of
an enormous number of ways. Event-driven music engines (or adaptive audio
engines) allow music to change along with game state changes. Event-driven
music isn't composed for linear playback; instead, it's written in such
a way as to allow a certain music sequence (ranging in size from one note
to several minutes of music) to transition into one or more other music
sequences at any point in time. An event-driven music engine must contain
two essential components:
- Control logic: a collection of commands and scripts that control the flow of
music depending on the game state.
- Segments: audio segments that can be arranged horizontally or vertically according
to the control logic.
In Kesmai's Multiplayer Battletech, control logic determined the selection
of segments within a game state and the selection of sets of segments
at game state changes. Thus, the control logic was able to construct melodies
and bass lines out of one- to two-measure segments following a musical
pattern. At game state changes, a transition segment was played, and a
whole different set of segments was selected. However, this transition
segment was played only after the current set of segments finished playing
so as not to interrupt the flow of the music. I selected game states and
also tracked game state changes based on the player's health relative
to the health of the opponent. Overall, I composed 220 one- to two-measure
segments that could all be arranged algorithmically by the control logic.
What resulted was a soundtrack that was closely coupled with the game-playing
experience.
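The deferred transition described above, where a state change waits until the current set of segments has finished, might look something like the following sketch. It is not the Multiplayer Battletech engine; the `MusicSequencer` class, its method names, and the state and segment names in the usage note are hypothetical.

```python
class MusicSequencer:
    """Play segment sets per game state, deferring state changes.

    A requested state change is only honored once the current set of
    segments has played through, so the music is never cut off midway.
    """

    def __init__(self, segment_sets, transitions, state):
        self.segment_sets = segment_sets  # state -> ordered list of segments
        self.transitions = transitions    # (old_state, new_state) -> segment
        self.state = state
        self.pending = None
        self.position = 0                 # index of the next segment in the set

    def request_state(self, new_state):
        # Record the request; the switch happens at the next set boundary.
        self.pending = new_state if new_state != self.state else None

    def next_segment(self):
        """Return the segment to play next; call when one finishes."""
        if self.pending is not None and self.position == 0:
            old_state, self.state = self.state, self.pending
            self.pending = None
            return self.transitions[(old_state, self.state)]
        segments = self.segment_sets[self.state]
        segment = segments[self.position]
        self.position = (self.position + 1) % len(segments)
        return segment
```

With sets `{"even": ["a1", "a2"], "winning": ["b1", "b2"]}` and a transition `("even", "winning") -> "t"`, a request that arrives while `a1` is playing yields the sequence `a2`, `t`, `b1`: the current set finishes, the transition plays, and only then does the new set begin.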
Tips and Techniques
- Music comes first. Remember that no matter how closely your music follows
the game play and how interactive it is, if it doesn't gel as a musical
composition, you're better off writing a linear score. Always explore
all possibilities of transitions from one game state to the next, and
see if the music reacts the way you meant it to react. Make sure that
you write transition sequences and that the engine is intelligent enough
not to change game states midmeasure or midphrase.
- Decouple segments horizontally and vertically. Compose your music so that different
segments may be combined end-to-end (horizontally), as well as on top
of each other (vertically). This way, you can combine different melody
lines with bass lines, use different ornamentation, and so on.
- Don't give away too much information. Sometimes a musical cue might say too
much, when it was meant just to highlight the game state change. For
example, in a certain game, an upward chord progression always signifies
to a player that a starship is on his or her tail. When working on game state
changes, make sure your event-driven music isn't used as an early warning
system for the game.
- Define a series of translation tables to track game state changes. For example,
in Multiplayer Battletech, a game state change from "winning"
to "advantage" implies a losing trend. The music reacts to this change
by selecting a different set of segments than it would if the change
occurred from "advantage" to "winning."
By composing in a nonlinear fashion, and by having the music react to the
player's actions directly and indirectly, we introduce a new level of
interactivity. Emotionally, the soundtrack carries the player seamlessly along
with the action in much the same way as in the static, linear medium of film.
In this fashion, music becomes the gateway to the player's emotional response
to the game.
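A translation table of the kind suggested in the tips above can be as simple as a lookup keyed on the old and new game states. The sketch below is an assumption about how such a table might be laid out; only the "winning" and "advantage" state names come from the article, and the segment-set names and fallback rule are hypothetical.

```python
# (old_state, new_state) -> name of the segment set to play next.
# The direction of the change matters, not just the state we end up in.
TRANSLATION_TABLE = {
    ("winning", "advantage"): "advantage_losing_trend",
    ("advantage", "winning"): "winning_rising_trend",
}


def segment_set_for(old_state, new_state, table=TRANSLATION_TABLE):
    """Pick a segment set from the direction of a game state change.

    Falls back to a set named after the new state when the table has
    no entry for this particular change (an assumed default).
    """
    return table.get((old_state, new_state), new_state)
```

Here `segment_set_for("winning", "advantage")` and `segment_set_for("advantage", "winning")` select different music even though both involve the same two states, which is the point of tracking the change instead of the state alone.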
Total Immersion through Sound
As game designers and audio producers, we should be constantly aware of
the impact that a well thought-out audio environment can have on the product.
It can make a graphically simple and uneventful scene become awe-inspiring.
Effective use of an audio object vocabulary can enhance the impact a character
may have on the game player. Ambient sounds, in all of their variety,
can transform a game scene from a virtual one to a believable one. Surreal
textures and atmospheric gestures can generate emotional responses in
a player as varied as the soundscapes themselves. As games become more
and more complex and graphically spectacular, we must not overlook the
role of audio in enhancing and completing that feeling of total immersion.