[In the first part of a new analysis, high-end game audio veteran Rob Bridgett (Scarface, 50 Cent: Blood on the Sand) examines Skywalker Sound's mixing on the remastered Ghost In The Shell movie, then extrapolates to ask - is real-time mixing of sound effects, music and dialogue in games an important part of the future of AAA game audio?]
Over the last five or so years, interactive mixing has developed
considerably. From basic dialogue ducking systems to full-blown interactive
mixer snapshots and side-chain-style auto-ducking systems, there is currently a
wide variety of mixing technologies, techniques and applications within the
game development environment.
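The most basic of these systems, dialogue ducking, can be sketched in a few lines. The class names and the -6 dB attenuation value below are illustrative assumptions, not taken from any particular engine:

```python
# Minimal sketch of a dialogue ducking system: while any dialogue line
# is playing, competing buses are attenuated by a fixed amount, then
# restored when the last line finishes. The -6 dB figure is an
# assumed, tunable value.

DUCK_DB = -6.0

class Bus:
    def __init__(self, name, volume_db=0.0):
        self.name = name
        self.volume_db = volume_db

class DuckingMixer:
    def __init__(self, ducked_buses):
        self.ducked_buses = ducked_buses
        self.active_dialogue = 0  # number of lines currently playing

    def dialogue_started(self):
        self.active_dialogue += 1
        self._update()

    def dialogue_finished(self):
        self.active_dialogue = max(0, self.active_dialogue - 1)
        self._update()

    def _update(self):
        duck = DUCK_DB if self.active_dialogue > 0 else 0.0
        for bus in self.ducked_buses:
            bus.volume_db = duck

music = Bus('music')
mixer = DuckingMixer([music])
mixer.dialogue_started()
print(music.volume_db)  # -6.0
```

A side-chain auto-ducker works on the same principle, but drives the attenuation continuously from the measured level of the dialogue signal rather than from discrete start/stop events.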
The majority of these systems are in their infancy; however, some are
already building on several successful iterations and improving on usability
and quality. I have believed for some time now that mixing is the single remaining area of video game
audio that will allow game sound to articulate itself as fully as the sound
design in motion pictures.
As you'll see later in the article, I recently had the chance to sit in on a motion-picture final mix
at Skywalker Ranch with Randy Thom (FX Mixer) and Tom Myers (Music and Foley
Mixer) on a remix project for the original 'Ghost in the Shell' animated
feature for a new theatrical release. I was able to observe first-hand the
work-flow and established processes of film mixing, and attempted to
translate some of them over into the video game mixing process.
The comparison, and the possibilities it opens up for game audio, is especially relevant now that we are able to play back samples
at the same resolution as film and can have hundreds of sounds
playing simultaneously.
The current shortfall in terms of mixing technology and techniques
has led to games not being able to develop and mature in areas of artistic
application and expression. While examples in film audio give games a starting
point at which to aim, it certainly is only a starting point, as many film
mixing techniques can only translate so far once mapped onto the interactive
real-time world of games.
Interactive mixing offers up a completely different set of challenges
from those of mixing movies, allowing for the tuning of many more parameters
per sound, or group of sounds, than just volume and panning.
Not only this, but
due to the unpredictable interactive nature of the medium, it is entirely
possible to have many different mixer events (moments) all occurring at the
same time, fighting for priority and affecting the sound in unpredictable and
emergent ways.
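One simple way to arbitrate between simultaneous mix events is to let the highest-priority event win control of each bus it touches. The event names, priority values and volumes below are hypothetical:

```python
# Sketch of resolving simultaneous mixer events by priority. Each
# event requests volumes on a set of buses; where events overlap, the
# highest-priority request for each bus wins.

class MixEvent:
    def __init__(self, name, priority, bus_volumes):
        self.name = name
        self.priority = priority
        self.bus_volumes = bus_volumes  # {bus_name: volume_db}

def resolve(events):
    """Return the winning volume per bus across all active events."""
    result = {}
    # apply in ascending priority so higher-priority events overwrite
    for event in sorted(events, key=lambda e: e.priority):
        result.update(event.bus_volumes)
    return result

combat = MixEvent('combat', priority=1,
                  bus_volumes={'music': -3.0, 'sfx': 0.0})
cutscene = MixEvent('cutscene', priority=5,
                    bus_volumes={'music': 0.0, 'sfx': -12.0})
print(resolve([combat, cutscene]))  # cutscene wins on both buses
```

Real systems tend to be subtler, blending or summing competing requests rather than letting one win outright, but the priority-arbitration core is the same.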
There are a whole variety of techniques required to get around these
challenges, but in the end, and looking beyond all the technical differences in
the two media, it all comes back to being able to achieve the same artistic
finesse that is achieved in a motion picture sound mix.
Film Standard Features
To begin, I have chosen to highlight some of the techniques and
concepts used in linear film mixing so we can quickly see what is useful to
game sound mixing and how each area translates over to the interactive realm.
But first, I should clarify a few simple mixer concepts that often get
confused.
- Fader - A software or hardware slider that is used to control the volume of a sound.
- Channel - A single channel, representing the volume and parameters of a single sound, usually manifested by a fader; sometimes called a 'channel strip' when it includes other functionality above the fader such as auxiliary sends or EQ trim pots. (Not to be confused with a 'speaker channel', a term used to describe the source of a sound from a particular speaker: Left, Right, Centre, Left Surround, etc.)
- Bus / Group - A parent channel which has global control over the values of its child channels as a collective group. This is also usually represented by a fader in software and/or mirrored in hardware.
- Mixer Snapshots - A 'snapshot' of all the volume and parameter positions and settings of an entire set of channels and buses, like a photograph of the mixing board capturing all its settings at a particular moment. Blending from one snapshot to another results in automated changes in these parameters over time.
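As a concrete illustration, a snapshot can be modelled as a set of parameter values, and blending as a linear interpolation between two such sets. The bus names and dB values here are invented for the example:

```python
# A mixer snapshot as a dict of {bus: volume_db}; blending between two
# snapshots linearly interpolates every shared parameter. Driving t
# from 0.0 to 1.0 over a few seconds produces the automated transition
# described in the text.

def blend(snapshot_a, snapshot_b, t):
    """Interpolate from snapshot_a (t=0.0) to snapshot_b (t=1.0)."""
    return {bus: a + (snapshot_b[bus] - a) * t
            for bus, a in snapshot_a.items()}

exploration = {'music': 0.0, 'sfx': -6.0, 'dialogue': 0.0}
combat      = {'music': -4.0, 'sfx': 0.0, 'dialogue': -2.0}

halfway = blend(exploration, combat, 0.5)
print(halfway['music'])  # -2.0
```

In practice a snapshot carries many more parameters than volume (pitch, filter cutoffs, send levels), but each one blends in exactly this way.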
Film Standards: Grouping
Perhaps one of the most basic concepts in mixing, grouping is the
ability to assign individual sounds to larger controller groups. It's essential
in being able to efficiently and quickly mix a movie or a piece of music.
In music, a clear example is a drum kit: the hi-hats, the kick drum and so on
are all mic'd separately. All of these individual channels are then pre-mixed and
grouped into a single parent bus called 'drums' - when the 'drums' bus is made
quieter or louder, all of the child channels belonging to it are attenuated by
the same amount.
In film this can be as simple as the channels for each track being
routed through a master fader, or belonging to generic parent groups called 'dialogue',
'foley', or 'music'. In terms of video games, having a hierarchical structure
with parent and child buses is absolutely essential, and the depths at which
sounds can belong to buses often goes far deeper than in film mixes.
For example, a master bus would contain 'music', 'sfx' and 'dialogue'
buses, within the 'sfx' bus there would be 'weapons', 'foley', 'explosions', 'physics
sounds' etc. Within the 'weapons' bus there would be 'player weapons' and 'non
player weapons', within 'player weapons' would be 'handgun', 'pistol', 'machine
gun', within 'machine gun' would be 'AK47', 'Uzi', etc. Within 'AK47' would be
the channels 'shell casings', 'gun foley', 'dry fire', 'shot', 'tail' and so
on.
These bus levels need to go much deeper because at any given moment
in a mix you could need to control all of the 'sfx' parameters as a group, or
all of the weapons parameters, or just the AK47, or just a single element of
the AK47 such as the shell casings.
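That hierarchy can be sketched as a simple parent/child structure in which a bus's effective attenuation is the sum of the dB offsets of every bus above it. This is a simplified model (real engines track far more parameters than volume), with the bus names taken from the example above:

```python
# Parent/child bus hierarchy: a bus's effective attenuation is its own
# dB offset plus that of every ancestor, so turning down 'sfx' turns
# down every weapon, while turning down 'ak47' affects only that gun.

class Bus:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.volume_db = 0.0

    def effective_db(self):
        db = self.volume_db
        if self.parent is not None:
            db += self.parent.effective_db()
        return db

master  = Bus('master')
sfx     = Bus('sfx', parent=master)
weapons = Bus('weapons', parent=sfx)
ak47    = Bus('ak47', parent=weapons)

sfx.volume_db = -6.0   # duck all sound effects at once
ak47.volume_db = -3.0  # and trim just this one weapon
print(ak47.effective_db())  # -9.0
```

Summing in dB corresponds to multiplying linear gains, which is how most engines actually compose bus volumes down the tree.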
In film this is easier to do because you have frame by frame control
over the automation of linear 'tracks' on a time-line, so at any moment you can
decide the volume, EQ or pitch of any number of individual sounds on
simultaneous tracks. In games, however, we are dealing with many individually
triggered events that can occur at any frame of the game.
For a quick hands-on investigation into video game busing, third-party
engines such as Audiokinetic's Wwise feature a well-rounded and simple-to-use
implementation of this 'parent / child' bus structure, which can quickly be used to set
up complex routing, allowing a great deal of control over the mix of large-scale
or very detailed elements. FMOD also has the concept of Channel Groups,
which act in much the same way.
Bus grouping in Wwise