[In the first part of a new analysis, high-end game audio veteran Rob Bridgett (Scarface, 50 Cent: Blood In The Sand) examines Skywalker Sound's mixing on the remastered Ghost In The Shell movie, then extrapolates to ask - is real-time mixing of sound effects, music and dialog in games an important part of the future of AAA game audio?]
Over the last five or so years, interactive mixing has developed considerably. From basic dialogue ducking systems to full-blown interactive mixer snapshots and side-chain style auto-ducking systems, there is currently a wide variety of mixing technologies, techniques and applications within the game development environment.
The majority of these systems are in their infancy; however, some are already building on several successful iterations and improving on usability and quality. I have believed for some time now that mixing is the single remaining area in video game audio that will allow game sound to articulate itself as fully as the sound design in motion pictures.
As you'll see later in the article, I recently had the chance to sit in on a motion-picture final mix at Skywalker Ranch with Randy Thom (FX Mixer) and Tom Myers (Music and Foley Mixer) on a remix project for the original 'Ghost in the Shell' animated feature for a new theatrical release. I was able to observe first-hand the workflow and established processes of film mixing, and I attempted to translate some of them over into the video game mixing process.
The comparison, and the possibilities it opens up for game audio, are especially relevant now that we are able to play back samples at the same resolution as film and have hundreds of sounds playing simultaneously.
The current shortfall in terms of mixing technology and techniques has led to games not being able to develop and mature in areas of artistic application and expression. While examples in film audio give games a starting point at which to aim, it is certainly only a starting point, as many film mixing techniques can only translate so far once mapped onto the interactive real-time world of games.
Interactive mixing offers up a completely different set of challenges from those of mixing movies, allowing for the tuning of many more parameters per sound, or group of sounds, than just volume and panning.
Not only this, but due to the unpredictable interactive nature of the medium, it is entirely possible to have many different mixer events (moments) all occurring at the same time, fighting for priority and affecting the sound in unpredictable and emergent ways.
There are a whole variety of techniques required to get around these challenges, but in the end, and looking beyond all the technical differences in the two media, it all comes back to being able to achieve the same artistic finesse that is achieved in a motion picture sound mix.
To begin, I have chosen to highlight some of the techniques and concepts used in linear film mixing so we can quickly see what is useful to game sound mixing and how each area translates over to the interactive realm. But first, I should clarify a few simple mixer concepts that often get confused.
Film Standards: Grouping
Perhaps one of the most basic concepts in mixing, grouping is the ability to assign individual sounds to larger controller groups. It's essential in being able to efficiently and quickly mix a movie or a piece of music.
In music, a clear example is the drum kit: the hi-hats, the kick drum and so on are all mic'd separately. All of these individual channels are then pre-mixed and grouped into a single parent bus called 'drums'. When the 'drums' bus is made quieter or louder, all of the child channels belonging to it change by the same amount.
In film this can be as simple as the channels for each track being routed through a master fader, or belonging to generic parent groups called 'dialogue', 'foley', or 'music'. In terms of video games, having a hierarchical structure with parent and child buses is absolutely essential, and the bus hierarchy often goes far deeper than in film mixes.
For example, a master bus would contain 'music', 'sfx' and 'dialogue' buses. Within the 'sfx' bus there would be 'weapons', 'foley', 'explosions', 'physics sounds' etc. Within the 'weapons' bus there would be 'player weapons' and 'non player weapons'; within 'player weapons' would be 'handgun', 'pistol', 'machine gun'; within 'machine gun' would be 'AK47', 'Uzi', etc. Within 'AK47' would be the channels 'shell casings', 'gun foley', 'dry fire', 'shot', 'tail' and so on.
These bus levels need to go much deeper because at any given moment in a mix you could need to control all of the 'sfx' parameters as a group, or all of the weapons parameters, or just the AK47, or just a single element of the AK47 such as the shell casings.
In film this is easier to do because you have frame-by-frame control over the automation of linear 'tracks' on a timeline, so at any moment you can decide the volume, EQ or pitch of any number of individual sounds on simultaneous tracks. In games, however, we are talking about many individually triggered events that can occur at any frame in the game.
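To make the parent / child idea concrete, here is a minimal sketch of a bus tree using the example names from above. It is not how Wwise, FMOD or any particular engine implements busing; the `Bus` class, its `add` method and the decibel values are all hypothetical, and the only assumption is that a sound's final level is its own offset plus the offsets of every bus above it.

```python
from __future__ import annotations


class Bus:
    """A node in a hypothetical mixer hierarchy; volume is an offset in dB."""

    def __init__(self, name: str, parent: Bus | None = None,
                 volume_db: float = 0.0):
        self.name = name
        self.parent = parent
        self.volume_db = volume_db
        self.children: dict[str, Bus] = {}
        if parent is not None:
            parent.children[name] = self

    def add(self, name: str, volume_db: float = 0.0) -> Bus:
        """Create a child bus (or channel) routed into this bus."""
        return Bus(name, parent=self, volume_db=volume_db)

    def effective_db(self) -> float:
        """Sum this bus's offset with the offset of every ancestor."""
        total = 0.0
        bus = self
        while bus is not None:
            total += bus.volume_db
            bus = bus.parent
        return total

    def effective_gain(self) -> float:
        """Convert the summed dB offset to a linear amplitude multiplier."""
        return 10.0 ** (self.effective_db() / 20.0)


# Build part of the hierarchy described in the article.
master = Bus("master")
music = master.add("music")
sfx = master.add("sfx")
weapons = sfx.add("weapons")
player_weapons = weapons.add("player weapons")
machine_gun = player_weapons.add("machine gun")
ak47 = machine_gun.add("AK47")
shell_casings = ak47.add("shell casings")
shot = ak47.add("shot")

# Pull the whole 'sfx' group down, then trim a single AK47 element.
sfx.volume_db = -6.0
shell_casings.volume_db = -3.0
```

With these illustrative values, everything under 'sfx' inherits the 6 dB cut, the shell casings sit a further 3 dB below that, and 'music' is untouched. The same walk up the tree would apply to any other per-bus parameter a mixer chose to expose.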
For a quick hands-on investigation into video game busing, third-party engines such as Audiokinetic's Wwise offer a well-rounded and simple-to-use implementation of this 'parent / child' bus structure, which can be used to quickly set up complex routing that allows a great deal of control over the mix of large-scale or very detailed elements. FMOD also has the concept of Channel Groups, which act in much the same way.
Bus grouping in Wwise