The Future Of Game Audio - Is Interactive Mixing The Key?
 
 
by Rob Bridgett [Audio]
May 14, 2009
 

[In the first part of a new analysis, high-end game audio veteran Rob Bridgett (Scarface, 50 Cent: Blood In The Sand) examines Skywalker Sound's mixing on the remastered Ghost In The Shell movie, then extrapolates to ask - is real-time mixing of sound effects, music and dialog in games an important part of the future of AAA game audio?]

Over the last five or so years, interactive mixing has developed considerably. From basic dialogue ducking systems to full-blown interactive mixer snapshots and side-chain style auto-ducking systems, there is currently a wide variety of mixing technologies, techniques and applications within the game development environment.


The majority of these systems are in their infancy; however, some are already building on several successful iterations and improving on usability and quality. I have believed for some time now that mixing is the single remaining area in video game audio that will allow game sound to articulate itself as fully as the sound design in motion pictures.

As you'll see later in the article, I recently had the chance to sit in on a motion-picture final mix at Skywalker Ranch with Randy Thom (FX Mixer) and Tom Myers (Music and Foley Mixer) on a remix project for the original 'Ghost in the Shell' animated feature for a new theatrical release. I was able to observe first-hand the workflow and established processes of film mixing, and I attempted to translate some of them into the video game mixing process.

The comparison with film, and the possibilities it opens up for game audio, are especially relevant now that we are able to play back samples at the same resolution as film and to have hundreds of sounds playing simultaneously.

The current shortfall in mixing technology and techniques has led to games being unable to develop and mature in areas of artistic application and expression. While examples in film audio give games a target at which to aim, they are certainly only a starting point, as many film mixing techniques can only translate so far once mapped onto the interactive real-time world of games.

Interactive mixing offers up a completely different set of challenges from those of mixing movies, allowing for the tuning of many more parameters per sound, or group of sounds, than just volume and panning.

Not only this, but due to the unpredictable interactive nature of the medium, it is entirely possible to have many different mixer events (moments) all occurring at the same time, fighting for priority and affecting the sound in unpredictable and emergent ways.

There is a whole variety of techniques required to get around these challenges, but in the end, and looking beyond all the technical differences in the two media, it all comes back to being able to achieve the same artistic finesse that is achieved in a motion picture sound mix.

Film Standard Features

To begin, I have chosen to highlight some of the techniques and concepts used in linear film mixing so we can quickly see what is useful to game sound mixing and how each area translates over to the interactive realm. But first, I should clarify a few simple mixer concepts that often get confused.

  • Fader - A software or hardware slider that is used to control the volume of a sound.
  • Channel - A single channel, representing the volume and parameters of a single sound, usually manifested by a fader, sometimes called a 'channel strip' when including other functionality above the fader such as auxiliary sends or EQ trim pots. (Not to be confused with a 'speaker channel', a term used to describe the source of a sound from a particular speaker: Left, Right, Centre, Left Surround, etc.)
  • Bus / Group - This is the concept of a parent channel which has global control over the values of other child channels as a collective group. This is also usually represented by a fader in software, mirrored in hardware, or both.
  • Mixer Snapshots - A 'snapshot' of all the volume and parameter positions and settings of an entire set of channels and buses, like a photograph of the mixing board capturing all its settings at any particular moment. Blending from one snapshot to another results in automated changes in these parameters over time.
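
To make the snapshot idea concrete, here is a minimal sketch in Python of blending between two snapshots. It is an illustration only, not the method of any particular engine: the names `Snapshot` and `blend`, the example fader values, and the choice to interpolate linearly in decibels are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Stored fader positions (in dB), keyed by channel or bus name."""
    name: str
    volumes_db: dict[str, float] = field(default_factory=dict)


def blend(a: Snapshot, b: Snapshot, t: float) -> dict[str, float]:
    """Linearly interpolate every fader from snapshot a to snapshot b.

    t = 0.0 gives a's settings and t = 1.0 gives b's. A fader missing
    from one snapshot is assumed to sit at 0 dB (unity gain).
    """
    t = max(0.0, min(1.0, t))  # clamp so callers cannot overshoot
    names = sorted(set(a.volumes_db) | set(b.volumes_db))
    return {
        n: (1.0 - t) * a.volumes_db.get(n, 0.0) + t * b.volumes_db.get(n, 0.0)
        for n in names
    }


# Hypothetical snapshots for two game states.
gameplay = Snapshot("gameplay", {"music": -6.0, "sfx": 0.0, "dialogue": 0.0})
cutscene = Snapshot("cutscene", {"music": -12.0, "sfx": -9.0, "dialogue": 0.0})

# Halfway through the transition from gameplay to cutscene.
halfway = blend(gameplay, cutscene, 0.5)
```

Calling `blend` once per frame with `t` rising from 0.0 to 1.0 over the transition time would produce the automated parameter changes described above.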

Film Standards: Grouping

Perhaps one of the most basic concepts in mixing, grouping is the ability to assign individual sounds to larger controller groups. It's essential in being able to efficiently and quickly mix a movie or a piece of music.

In music, a clear example is the drum kit: the hi-hats, the kick drum and so on are all mic'd separately. All of these individual channels are then pre-mixed and grouped into a single parent bus called 'drums'. When the 'drums' bus is made quieter or louder, all of the child channels belonging to it are attenuated or boosted by the same amount.

In film this can be as simple as the channels for each track being routed through a master fader, or belonging to generic parent groups called 'dialogue', 'foley', or 'music'. In terms of video games, having a hierarchical structure with parent and child buses is absolutely essential, and the depth at which sounds can belong to buses often goes far deeper than in film mixes.

For example, a master bus would contain 'music', 'sfx' and 'dialogue' buses. Within the 'sfx' bus there would be 'weapons', 'foley', 'explosions', 'physics sounds', etc. Within the 'weapons' bus there would be 'player weapons' and 'non player weapons'; within 'player weapons' would be 'handgun', 'pistol', 'machine gun'; within 'machine gun' would be 'AK47', 'Uzi', etc. Within 'AK47' would be the channels 'shell casings', 'gun foley', 'dry fire', 'shot', 'tail' and so on.

These bus levels need to go much deeper because at any given moment in a mix you could need to control all of the 'sfx' parameters as a group, or all of the weapons parameters, or just the AK47, or just a single element of the AK47 such as the shell casings.
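
As a rough sketch of why this depth is useful, the Python below models part of the hierarchy from the example and sums the fader values along the parent chain. The `Bus` class, its `effective_gain_db` method and the dB values are hypothetical and written for this illustration; they are not taken from Wwise, FMOD or any other engine.

```python
class Bus:
    """A node in the mix hierarchy; gain_db is this node's own fader."""

    def __init__(self, name, parent=None, gain_db=0.0):
        self.name = name
        self.parent = parent
        self.gain_db = gain_db

    def effective_gain_db(self):
        """Add this node's fader to every ancestor's (gains in dB sum)."""
        total = 0.0
        node = self
        while node is not None:
            total += node.gain_db
            node = node.parent
        return total


# Part of the hierarchy described in the text.
master = Bus("master")
sfx = Bus("sfx", master)
weapons = Bus("weapons", sfx)
player_weapons = Bus("player weapons", weapons)
machine_gun = Bus("machine gun", player_weapons)
ak47 = Bus("AK47", machine_gun)
shell_casings = Bus("shell casings", ak47)
shot = Bus("shot", ak47)

sfx.gain_db = -6.0            # pull down all sound effects as a group
ak47.gain_db = -3.0           # then just the AK47
shell_casings.gain_db = -2.0  # then a single element of the AK47
```

With these settings the shell casings end up 11 dB down, the AK47 shot 9 dB down, and every other weapon 6 dB down, each level of control having been applied at a different depth of the tree.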

In film this is easier to do because you have frame-by-frame control over the automation of linear 'tracks' on a timeline, so at any moment you can decide the volume, EQ or pitch of any number of individual sounds on simultaneous tracks. In games, however, we are talking about many individually triggered events that can occur at any frame in the game.

For a quick hands-on investigation into video game busing, third-party engines such as Audiokinetic's Wwise feature a well-rounded and simple-to-use example of this 'parent / child' bus structure. It can be used to set up complex routing quickly, allowing a great deal of control over the mix of large-scale or very detailed elements. FMOD also has the concept of Channel Groups, which act in much the same way.


Bus grouping in Wwise

 
 
Comments

Tom Newman
Great article! What is not mentioned, however (and this could be getting too in-depth), is that in any mixing situation you are often dealing with audio from many different sources that were mastered at different compression levels. Audio compression probably deserves an article of its own (or a book), but in a nutshell, dB is the scientific measurement of sound volume, but the human ear is much different. An uncompressed audio signal can sound very quiet to the human ear, but measure very loud on a dB meter. A signal that has too much compression can be an even bigger mess, as it will sound much louder than anything else in the mix despite what the dB meters say, and it is much, much easier to compress than it is to decompress without access to the source clip. This really is a wrench in the system when it comes down to automated mixing, as a computer can only read the meter.

Overall a great read, and I very much look forward to what the future has in store for real-time audio mixing handled in-game.

Roger Hågensen
Nah, perceived levels are not an issue.

Just do a RMS calculation and store that with the audio files.
I advise using a -20 dBFS RMS reference (a.k.a. SMPTE 83 dB SPL, a.k.a. K-System K-20; ReplayGain is close to these standards).
At the same time, also note the max peak of the audio file.

At mixing time the mixer could use the stored max peak to avoid clipping, and the stored RMS gain value to "volume match" the perceived loudness. Dialog drowning should no longer be an issue for example.
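
A minimal Python sketch of that idea follows, assuming floating-point samples in the range -1.0 to 1.0. The function names `analyse` and `match_gain` are hypothetical, and plain RMS is only a rough stand-in for perceived loudness; this is not a ReplayGain or K-System implementation.

```python
import math


def analyse(samples):
    """Return (rms_dbfs, peak) for float samples in the range -1.0..1.0."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    rms_dbfs = 20.0 * math.log10(rms) if rms > 0.0 else float("-inf")
    return rms_dbfs, peak


def match_gain(rms_dbfs, peak, target_dbfs=-20.0):
    """Linear gain that moves rms_dbfs to target_dbfs without clipping."""
    if peak == 0.0 or math.isinf(rms_dbfs):
        return 1.0  # silent file: nothing to match
    gain = 10.0 ** ((target_dbfs - rms_dbfs) / 20.0)
    return min(gain, 1.0 / peak)  # cap so the loudest sample stays <= 1.0


# Offline: analyse a clip once and store the two numbers with the file.
clip = [0.5, -0.5] * 100  # a square wave at half scale
stored_rms_dbfs, stored_peak = analyse(clip)

# At mixing time: turn the stored values into a playback gain.
playback_gain = match_gain(stored_rms_dbfs, stored_peak)
```

The analysis runs once per file, so at mixing time the mixer only needs the two stored numbers.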


Levon Louis
Yea what he said (above, nice one Roger)

And yes, a great article... Game audio and Film audio are kissing cousins, however, the gap will grow with time as Game audio is pushed forward... Gotta keep in mind folks, we have had sound for film since the late 1800s while recorded sound for games is new. Let's give ourselves some credit for being pioneers of creative sound working in 3D space (does that make us astronauts?)

On a good day we definitely apply the best practices of sound design from 100 years of film-making, but our world is its own, and our universe is expanding with every new title. Interactive mixing, dynamically generated environmental BGs, and other special sauce will ultimately allow us to do WAY MORE with sound in the game space than film (due to its static nature) Film will always have its place (people love them movin pictures) but game audio holds more potential for innovation.... so all you fellow game audio dudes out there remember - it is our unique privilege and solemn duty to tame the frontier and see just how far we can take this thing. Happy trails.

Adem Can
www.chatyeli.gen.tr God bless you, thank you

Simon Carlile
Great article and a nice way to overview some of the tensions between the film and games audio disciplines. Point-of-View mixing, while perfect for a linear medium like film, has problems when, in the non-linear unfolding of the game narrative, it may not be clear what is subjectively most important to the player. Of course there are circumstances where this is not the case but in the non-linear world of real life the brain possesses the capability to focus attention both reflexively and voluntarily (so called exogenous and endogenous attention) to what is compellingly relevant at any instant in time. There can be hundreds of simultaneous sound objects when we cross the road but fortunately we hear out the approaching truck pretty reliably. But to allow that capability in game, the sound objects need to be rendered in a way that the brain expects so that the information they represent can be effectively processed. Virtual Reality research demonstrates that plausibility and consistency are very important in generating the sense of presence and supporting in-world performance. There is a need then to attend to the “objective” characteristics of the sound object (particularly environment ambience).

Having said all that, it does strike me that this dramatic vs. literal division is a bit of a false dichotomy, as in any design both elements can be judiciously combined.

A great read and I am really looking forward to the next instalment. Thanks

