Audio in today's interactive entertainment media has progressed far beyond the bleeps of early video games. An object or an environment within a game exhibits a number of complex relationships. A creature may be surprised to see you. A robot's gears get stuck when it tries to move toward you. A diabolical enemy is afraid of the dark. When encountering these elements in a game environment, we expect them to communicate to us through audio in subtle and different ways. Aspects of emotion such as surprise, frustration, admiration, and fear could easily be conveyed through an enhanced and well thought-out object vocabulary.
Our lives are full of an ever-present collage of audio cues that we take for granted. For example, at this "quiet moment," I can hear the cascading sound of a fountain in a pond, the intermittent quacking of ducks and geese, a baby in the background, someone pouring a bucket of water outside, and a plane flying overhead. All of these cues, though subtle and seemingly unimportant, create the ambience of a particular scene, imbuing it with identity and significance. Without these background sounds, or ambiences, our lives would sonically resemble a lunar landscape. Within a game environment, a collection of sound cues such as this reflects the noncausal relationship of a player to the game: the sound space isn't triggered by the player's direct action. Instead, the sound is affected by and reacts to the environmental aspects of the scene that is being conveyed.
When we go to a movie, our emotional response is directly related to the music. The music swells, our anticipation grows, and our adrenaline rushes. The music ebbs, and we feel a calming sensation. This is very easy to convey in a linear medium, where the ending and the progression of events are predetermined; but how do we compose a soundtrack to a game if it can follow many paths and endings? An adaptive soundtrack that responds well to game events is one of the best ways to envelop the player in a game experience.
Audio Object Vocabulary
An audio object vocabulary is a method by which game objects (not necessarily just speaking ones) talk to each other and the player. The methods of communication vary from object to object and from context to context. There are three types of object interaction: direct, indirect, and environmental.
An object communicates directly as a result of direct action on its part. When the ball hits the paddle in the old arcade game Pong, it makes a bleep. This is direct object interaction. Unfortunately, most games haven't explored far beyond this simplistic level of object interaction. Direct communication is important when you want to convey specific audio cues, such as a scream of pain when you shoot a monster, or the creak of a wooden rocker when you push back a rocking chair. In Monolith Productions' Claw, I found it important that every character have something different to say when you interact with him or her (or it), even if it's in combat. For example, a melodramatic character, while dying, would say "I'm dying... I'm dying... I'm dying... I'm dead," with an animation to suit. A more primitive character would emit a squawk, and a more substantial enemy would yell out, "I curse you, Claw" as he falls to his death. When you hit a lounge-lizard-turned-palace-guard merman, he would say, deadpan, "Ouch that hurt quite a bit."
As always, variety in the audio cues is paramount in ensuring that a set of quotes doesn't become repetitive. From a programming standpoint, that may require a bit more intelligence in picking out the quotes. A buffer that records the most recently used quotes helps a lot, because the selection code can skip anything in it and so shield the player from hearing the same "random" set of sounds in rapid succession.
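The recently-used buffer described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in Claw: the class name QuotePicker, the history_size parameter, and the cue names are all hypothetical.

```python
import random
from collections import deque


class QuotePicker:
    """Picks a random cue while skipping the most recently played ones."""

    def __init__(self, cues, history_size=2, rng=None):
        if not cues:
            raise ValueError("need at least one cue")
        self.cues = list(cues)
        # Never remember so many cues that nothing is left to choose from.
        size = max(0, min(history_size, len(self.cues) - 1))
        self.recent = deque(maxlen=size)  # oldest entry drops off automatically
        self.rng = rng or random.Random()

    def pick(self):
        # Only cues that are not in the recent-history buffer are eligible.
        candidates = [c for c in self.cues if c not in self.recent]
        choice = self.rng.choice(candidates)
        self.recent.append(choice)
        return choice


# Hypothetical usage: four hit reactions, never repeating within two plays.
picker = QuotePicker(["ouch_1", "ouch_2", "ouch_3", "ouch_4"], history_size=2)
sequence = [picker.pick() for _ in range(8)]
```

A larger history gives less repetition but makes the sequence more predictable, so the buffer size is worth tuning per character.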
An object communicates indirectly when something the player causes to happen in the game makes something else respond sonically. A typical example of this is a "sighting" state for an enemy. When an enemy sees you and his or her AI state changes, a sonic cue that signifies that change may be appropriate. In Monolith's Blood, for example, cultists scream a series of epithets in a terrifying foreign language (created for the game drama) when they spot the player (Figure 2). In Claw, every enemy has something different to say in the "sighting" state. A female boss taunts Claw in a mildly suggestive manner when they come into contact. A goofy bear sailor exclaims "I don't like you" when he sees a player.
Other sonic cues may convey indirect object interaction. Your character may begin breathing heavily when he or she is tired (health falls below some threshold). Your metal body suit emits a rubbing, squeaky noise that signifies rusting. In addition to sonic cues that help convey complex visual phenomena, certain characters within the game display behaviors that can be conveyed easily through sonic cues, even if they aren't represented visually. Indirect cues can be based on a number of different motivating factors, the rules of which can be determined at the game design stage. For example, in Blizzard's Warcraft II, clicking on an ax-throwing troll more than once causes it to respond with annoyance, even though no animation is being shown. This is highly effective character enhancement.
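Rules like these are easy to write down once they have been decided at the design stage. The sketch below is only an illustration of the idea: the CharacterState fields, the cue names, and both threshold values are assumptions made for the example, not values taken from any of the games mentioned.

```python
from dataclasses import dataclass


@dataclass
class CharacterState:
    health: float
    max_health: float
    clicks_in_a_row: int = 0  # how many times the player has clicked this unit


# Assumed design-time rules; real values would come from the game design.
TIRED_FRACTION = 0.25      # "tired" when health drops below 25% of maximum
ANNOYED_AFTER_CLICKS = 2   # annoyed once clicked more than once


def indirect_cues(state):
    """Return the names of the indirect cues that the current state calls for."""
    cues = []
    if state.health < TIRED_FRACTION * state.max_health:
        cues.append("heavy_breathing")
    if state.clicks_in_a_row >= ANNOYED_AFTER_CLICKS:
        cues.append("annoyed_reply")
    return cues
```

Because the rules read only game state, a cue can fire even when no animation accompanies it.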
A character or object in the game may generate a system of audio cues on its own, irrespective of its communication to the player. This is purely a function of a character's existence in its environment. It may be busy chatting to itself or other characters. It may generate a sound or a series of sounds on its own. Our goofy bear sailor from Claw will comment on how hungry he is or where his pet rat might be when he's in an idle state (Figure 3). Depending on where he is in the game, Caleb (the character you play in Blood) may pick from a variety of different show tunes to sing while he's taking a break from the carnage. A thespian tiger from Claw recites different Shakespearean passages as he muses on his own omnipotence.
Environmental communication need not be comic, nor does it need to be vocal. In Blood, a swishing blade and a humming motor signify an industrial fan, while a phone may be ringing intermittently. A character may pass by an alien hive, with pods emitting a terrifying whine.
Environmental communication is paramount in reinforcing a character or object's existence in the game environment. The character literally comes alive as a personality or physical entity. But as with all different types of object interaction, it's important to remember to keep a consistent set of sounds from character to character. In Claw, I made a decision to use three different idle cues (environmental communication), four different sighting cues (indirect communication), and eight or nine sounds (direct communication) to describe each character sonically. In the end, most characters used more than that average and some used fewer. However, planning the audio object vocabulary ahead of time helped to maximize the use of memory allotted to sound in the game.
The nature of a game object must be relayed in the character of its "voice." It's very easy to screw up the integrity of a character by giving it mismatched visual and aural personalities. However, giving it the right "voice" can greatly enhance a character's personality. A weak character may be depicted through the use of a humorous voice. A stronger character's dramatic persona can be highlighted through the use of a deeper and more resonant voice, as well as a script that relates without question his or her authority.
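Planning the vocabulary ahead of time comes down to simple arithmetic. The sketch below takes the per-character plan quoted above (using the upper figure of nine direct cues) and multiplies it out. The function names are hypothetical, and the character count and average cue size are inputs you would supply yourself; nothing here reflects Claw's actual memory figures.

```python
# Planned cues per character, by type of object interaction.
PLANNED_CUES = {"environmental": 3, "indirect": 4, "direct": 9}


def cues_per_character(plan=PLANNED_CUES):
    """Total number of cues one character needs under the plan."""
    return sum(plan.values())


def sound_budget_bytes(num_characters, avg_cue_bytes, plan=PLANNED_CUES):
    """Rough memory estimate: characters x cues per character x average cue size."""
    return num_characters * cues_per_character(plan) * avg_cue_bytes
```

Under this plan each character needs 16 cues, so the estimate grows linearly with the cast size and the average length of a cue.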
Tips and Techniques
Ambient sound refers to the sound world that is generated from a player's location in the game space. It is a system of indirect and environmental cues that immerse a player in a particular setting. As in my real-world example, we are surrounded by ambience all of our lives - a complex web of sound. However, ambience is the most underdeveloped side of sound design in interactive media. A game with little or no ambient sound presents little or no connection to how we perceive the outside world with our ears. An ambient sound world might be as simple as a single looping track of forest sounds or a system of sound-producing objects all linked together by their location within a given game environment.
The environment can communicate to the player information important for the game-playing experience. For example, a raven flies by in a forest, making a screeching sound that informs the player that he or she has ventured too far. A swamp makes a menacing gurgling sound, informing the player that he or she shouldn't go there. The sound of a portal opening and closing in the distance informs the player that he or she is close to the level's exit.
Environmental ambiences fully transport a player into the world presented by the game. In Claw, each level has a distinct set of ambient sounds based on the terrain that the main character encounters. Within a terrain, a single (environmental) looping sound is used (such as the sound of a forest), along with a set of sounds (indirect cues) that are triggered either by Claw's location on the map or by random chance. For example, the sound of a character whistling in a window matches the animation of the character shaving and the background ambience of village noise. When Claw moves into another terrain, the looping ambient sounds cross-fade, and another set of ambient trigger sounds is selected that corresponds to the new terrain.
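A terrain-based ambience system of this kind might be sketched as follows. The Ambience class, the layout of the terrains table, and the sound names are hypothetical, and the linear cross-fade is an assumption; the text above does not say which fade curve Claw used.

```python
import random


def crossfade_gains(t, duration):
    """Linear cross-fade: (old_loop_gain, new_loop_gain) at time t."""
    if duration <= 0:
        return 0.0, 1.0
    x = min(max(t / duration, 0.0), 1.0)  # clamp progress to [0, 1]
    return 1.0 - x, x


class Ambience:
    """Tracks the current terrain's loop and its randomly triggered one-shots."""

    def __init__(self, terrains, rng=None):
        # terrains: name -> {"loop": str, "triggers": [str], "chance": float}
        self.terrains = terrains
        self.current = None
        self.rng = rng or random.Random()

    def enter(self, terrain):
        """Switch terrain; returns (loop_to_fade_out, loop_to_fade_in)."""
        if terrain == self.current:
            return None, None  # nothing to cross-fade
        old = self.terrains[self.current]["loop"] if self.current else None
        self.current = terrain
        return old, self.terrains[terrain]["loop"]

    def maybe_trigger(self):
        """Roll once per update for a one-shot from the current terrain's set."""
        if self.current is None:
            return None
        terrain = self.terrains[self.current]
        if terrain["triggers"] and self.rng.random() < terrain["chance"]:
            return self.rng.choice(terrain["triggers"])
        return None
```

Location-based triggers would work the same way, with the random roll replaced by a test of the character's map position.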
In Blood, I used ambience to enhance the atmosphere, as well as to connote physical environments. In a temple, distant chanting is heard (though the source of the chant is never discovered) (Figure 4). In a narrow hallway, whispers surround the player from all sides. The inclusion of atmospheric elements adds to the spooky and scary nature of the game's look and feel.
The nonlinear medium of computer gaming can lead a player down an enormous number of pathways to an enormous number of resolutions. From the standpoint of music composition, this means that a single piece may resolve in one of an enormous number of ways. Event-driven music engines (or adaptive audio engines) allow music to change along with game state changes. Event-driven music isn't composed for linear playback; instead, it's written in such a way as to allow a certain music sequence (ranging in size from one note to several minutes of music) to transition into one or more other music sequences at any point in time. An event-driven music engine must contain two essential components: control logic, which decides what to play as the game state changes, and segments, the units of music that the control logic selects and arranges.
In Kesmai's Multiplayer Battletech, control logic determined the selection of segments within a game state and the selection of sets of segments at game state changes. Thus, the control logic was able to construct melodies and bass lines out of one- to two-measure segments following a musical pattern. At game state changes, a transition segment was played, and a whole different set of segments was selected. However, this transition segment was played only after the current segment finished playing, so as not to interrupt the flow of the music. I selected game states and tracked game state changes based on the player's health relative to the health of the opponent. Overall, I composed 220 one- to two-measure segments that could all be arranged algorithmically by the control logic. What resulted was a soundtrack that was closely coupled with the game-playing experience.
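The control logic described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the Battletech engine: the three state names, the simple health comparison, the class and method names, and the random choice of the next segment (standing in for "following a musical pattern") are all hypothetical.

```python
import random


def game_state(player_health, opponent_health):
    """Map relative health to a music state (the three states are assumed)."""
    if player_health > opponent_health:
        return "winning"
    if player_health < opponent_health:
        return "losing"
    return "even"


class ControlLogic:
    def __init__(self, segments, transitions, start_state, rng=None):
        self.segments = segments        # state -> list of segment names
        self.transitions = transitions  # (old_state, new_state) -> segment name
        self.state = start_state
        self.pending = None             # state requested while a segment plays
        self.rng = rng or random.Random()

    def request_state(self, new_state):
        """Record a state change; it takes effect at the next segment boundary."""
        self.pending = new_state if new_state != self.state else None

    def next_segment(self):
        """Call when the current segment finishes; returns the next one to play."""
        if self.pending is not None:
            old, self.state, self.pending = self.state, self.pending, None
            transition = self.transitions.get((old, self.state))
            if transition is not None:
                return transition  # bridge into the new set of segments
        return self.rng.choice(self.segments[self.state])
```

Because a state change is only consumed inside next_segment, the music is never cut off mid-segment, which matches the behavior described above.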
Total Immersion through Sound
As game designers and audio producers, we should be constantly aware of the impact that a well thought-out audio environment can have on the product. It can make a graphically simple and uneventful scene become awe-inspiring. Effective use of an audio object vocabulary can enhance the impact a character may have on the game player. Ambient sounds, in all of their variety, can transform a game scene from a virtual one to a believable one. Surreal textures and atmospheric gestures can generate emotional responses in a player as varied as the soundscapes themselves. As games become more and more complex and graphically spectacular, we must not overlook the role of audio in enhancing and completing that feeling of total immersion.