Conceptually, we divided the ambience into three tiers, or perspectives. The quadraphonic bed track, taken from the more distant perspective recordings of Manhattan, became one of our main "background" ambiences. Others included a Central Park background as well as a rooftop background which procedurally fades in and out based on the position of the listener.
Sitting on top of the background ambiences were the grouped layers, or what we like to call "midground" ambience. Midground ambience is entirely composed of ambient sounds made from groups of objects in the world. Pedestrians, vehicles and infected creatures became the main grouped layers of ambience as they were all central to Prototype's open-world gameplay.
The last tier is what we would call foreground ambience. Foreground ambience is composed of sounds that originate from a single object in the game world. Quite simply, these are individual lines of ambient dialogue, individual vehicle sounds (honks, engines, tire skids, etc.) or individual creature sounds that play based on the state of the object, which is determined for the most part by the AI.
The main advantage of this tiered approach is the blending you can achieve from foreground to background, which provides a kind of aural depth of field. While individual reactions and ambient sounds come from objects in the immediate foreground, the midground and background layers blend in beneath them to provide a sense of depth to the audio. This way, the individual sounds don't stand out as awkwardly loud or prominent in the mix because of the blended grouped content underneath.
Another advantage of this approach is that you can be frugal with the use of voices for ambient foreground sounds thanks to the support of the midground, grouped sounds. This allowed us to set maximums on individual pedestrian voices, vehicle engines and other foreground objects. In a game which features hundreds of these objects on the screen at any given time, this proved very important for reserving voices for other, more important sounds like the main character's powers, combat sounds, prop damage states, etc.
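To illustrate the idea of setting maximums on foreground voices, here is a minimal sketch of a per-category voice cap. It is not the system used in Prototype; the class name, the category names and the limits are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class VoiceLimiter:
    """Caps the number of concurrent foreground voices per category.

    Hypothetical sketch: category names and limits are assumptions,
    not values from the game.
    """

    def __init__(self, max_voices):
        # max_voices maps a category name to its allowed concurrent voices.
        self.max_voices = dict(max_voices)
        self.active = defaultdict(int)

    def try_play(self, category):
        """Return True and claim a voice if the category is under its cap."""
        if self.active[category] >= self.max_voices.get(category, 0):
            return False  # Over budget: the midground layer covers for it.
        self.active[category] += 1
        return True

    def release(self, category):
        """Give a voice back when a foreground sound finishes."""
        if self.active[category] > 0:
            self.active[category] -= 1
```

A request that is refused simply does not play. With this approach the grouped midground layers are what keep the scene from sounding empty when that happens.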
For reasons of disc-streaming efficiency, we decided to create an interleaved, multichannel ambience file that loops in the game and dynamically mixes according to the densities of certain objects in the vicinity of the listener.
The reason for the interleaved, 18-channel file was purely to limit the number of seeks on the disc the system would have to make in order to stream background and midground ambiences simultaneously.
Pedestrians, traffic and infected enemies all travelled in groups, so each of these elements had their own set of layers in the 18-channel file. All foreground ambiences were conventionally preloaded into RAM with the characters and objects they belonged to.
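As a rough illustration of why interleaving limits seeks, the sketch below splits one contiguous block of interleaved samples into its 18 channels. Only the channel count comes from the description above; the function name and the use of plain Python lists are assumptions, and the actual file format and channel order are not specified here.

```python
NUM_CHANNELS = 18  # Channel count from the text; the channel order is unknown.


def deinterleave(samples, num_channels=NUM_CHANNELS):
    """Split a flat interleaved buffer into one list per channel.

    In an interleaved buffer, the sample for channel c of frame f sits at
    index f * num_channels + c, so a single sequential read delivers every
    layer for the same stretch of time.
    """
    if len(samples) % num_channels != 0:
        raise ValueError("buffer does not contain a whole number of frames")
    return [samples[ch::num_channels] for ch in range(num_channels)]
```

Because every layer arrives in the same read, the mixer can apply a separate gain to each channel group without issuing additional disc requests.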
When panic ensues in the world (as it is prone to do in Prototype), the panic layers are turned up depending on the densities of pedestrians in the area. If there are only a few pedestrians, they will only respond individually. If there is a group of 10 or so, the first, low-density midground layer will fade up. If there are even more, a midground crowd layer with increased density will fade up, giving more of a crowd effect.
The same technology is applied to infected hordes and roughly the same to vehicle traffic. Traffic also has an "idle" state which fades up when the number of cars is high enough to warrant it. A traffic panic layer fades up when cars begin to panic.
Running behind all of this is the basic, four-channel city ambience, which cross-fades with a rooftop ambience based on the listener's height, and with a Central Park ambience when the listener is inside the park.
Because all streams are running simultaneously, these cross-fades are all position-based, not trigger-based. When standing on the boundary between the park and the city, the player hears 50% park and 50% city rather than having to cross a trigger volume to trigger a preset cross-fade.
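The density-driven behaviour described above can be sketched as a mapping from pedestrian count to layer gains. The threshold of roughly 10 pedestrians comes from the text; every other number, the function names, and the choice to keep the low-density layer playing underneath the denser one are assumptions made for illustration.

```python
def ramp(value, start, end):
    """Linear 0..1 ramp between start and end, clamped at both ends."""
    if end <= start:
        raise ValueError("end must be greater than start")
    t = (value - start) / (end - start)
    return max(0.0, min(1.0, t))


def crowd_panic_gains(pedestrian_count, panicking):
    """Return gains for the two midground pedestrian panic layers.

    Assumed thresholds: the low-density layer reaches full gain at 10
    pedestrians (from the text); the 5, 20 and 40 values are hypothetical.
    """
    if not panicking:
        return {"low_density": 0.0, "high_density": 0.0}
    return {
        # Below about 5 pedestrians only individual foreground voices react.
        "low_density": ramp(pedestrian_count, 5, 10),
        # The denser crowd layer comes in once the group grows further.
        "high_density": ramp(pedestrian_count, 20, 40),
    }
```

In practice the returned gains would likely be smoothed over time so that layers fade rather than jump when the count changes.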
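A position-based cross-fade of this kind can be sketched as a pure function of the listener's position. This is only an illustration under several assumptions: a linear fade (which gives the 50/50 split at the boundary), a hypothetical blend width and rooftop height, and a particular way of combining the park weight with the height-based rooftop weight, none of which are specified above.

```python
def clamp01(x):
    """Clamp a value to the 0..1 range."""
    return max(0.0, min(1.0, x))


def background_gains(distance_outside_park, height,
                     blend_width=20.0, rooftop_height=50.0):
    """Return linear gains for the city, rooftop and park beds.

    distance_outside_park is a signed distance to the park boundary:
    positive outside the park, negative inside, zero on the edge.
    blend_width and rooftop_height are hypothetical tuning values.
    """
    # Park weight is 0.5 on the boundary and reaches 0 or 1 half a
    # blend width to either side.
    park = clamp01(0.5 - distance_outside_park / blend_width)
    # Rooftop weight rises linearly from street level to rooftop_height.
    rooftop_mix = clamp01(height / rooftop_height)
    # Assumption: whatever is not park is shared between city and rooftop.
    return {
        "park": park,
        "city": (1.0 - park) * (1.0 - rooftop_mix),
        "rooftop": (1.0 - park) * rooftop_mix,
    }
```

Calling `background_gains(0.0, 0.0)` gives equal park and city gains, matching the boundary case described above. Because the gains are recomputed continuously from position, no trigger volumes are involved.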