Uncharted: Drake's Fortune was a landmark cinematic experience on the PS3, earning an average of 90% on GameRankings and selling well over one million units. Thanks to Gene Semel at Sony, we chatted with Jonathan Lanier (audio programmer), who answered these questions with input from Bruce Swanson (audio director) at developer Naughty Dog.
What advantages did the PS3 provide for audio reproduction with Uncharted?
Jonathan Lanier: Several. First, the fact that the PS3 has HDMI 8-channel PCM outputs means that we could play all our audio in 5.1/7.1 on an HDMI system with no recompression, which sounds completely awesome. Second, for those without HDMI who must use bitstream audio, we had the ability to support DTS, which is very high fidelity.
Third, we are guaranteed that each PS3 has a hard drive, so we could dynamically cache important sounds and streams to the hard drive to guarantee full performance, even without requiring an installation. Fourth, the Blu-ray disc storage was immense, which meant that we did not have to reduce the sampling rate of our streaming audio or overcompress it, and we never ran out of space even given the massive amount of dialog in Uncharted (in multiple languages, no less).
Fifth, the power of the Cell meant that we could do as much audio codec and DSP work as we needed to. Since all audio is synthesized in software on the PS3 with the Cell processor, there's really no limit to what can be done.
About how many simultaneous streams were used, and what techniques were used for transitions in the music?
JL: Up to 12 simultaneous streams were supported, of which 6 could be multichannel (stereo or 6-channel). Two multichannel streams were used for interactive music, each of which was 3-track stereo (i.e. 6-channel). A few additional multichannel streams were used for streaming 4-channel background sound effects. The remainder were used for streaming mono dialog and sound effects. All the streams were dynamically cached to the PS3's internal hard drive, which guaranteed smooth playback.
The music transitions were based on game events, such as changing tasks and/or completing tasks, as well as entering or exiting combat. Also, within a piece of music, we could dynamically mix the three stereo tracks in an interactive stream to change the music intensity based on the excitement level of the gameplay.
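To illustrate the intensity-driven mix described above, here is a minimal sketch that turns a single excitement value into one gain per music layer and mixes one stereo frame. It is not Naughty Dog's code: the function names `layer_gains` and `mix_frame`, the 0-to-1 intensity scale, and the linear fade-in curve are all assumptions made for the example.

```python
def layer_gains(intensity, num_layers=3):
    """Map an intensity in [0, 1] to one gain per music layer.

    Assumed curve: layer 0 is always audible, and each further layer
    fades in linearly over its own slice of the intensity range.
    """
    if num_layers < 2:
        return [1.0] * num_layers
    intensity = min(1.0, max(0.0, intensity))
    span = 1.0 / (num_layers - 1)
    gains = [1.0]
    for layer in range(1, num_layers):
        start = (layer - 1) * span
        gains.append(min(1.0, max(0.0, (intensity - start) / span)))
    return gains


def mix_frame(layers, gains):
    """Mix one stereo frame per layer; each frame is a (left, right) tuple."""
    left = sum(g * frame[0] for g, frame in zip(gains, layers))
    right = sum(g * frame[1] for g, frame in zip(gains, layers))
    return left, right
```

With three layers, an intensity of 0.25 leaves the base layer at full volume, brings the second layer in at half gain, and keeps the third silent. A real mixer would also smooth gain changes over time to avoid audible steps.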
Were realtime effects employed?
JL: Uncharted has a fairly realistic soundscape, as opposed to a sci-fi game, so there's not much call for realtime effects. We did use a few, though. There was realtime radio futzing in a few places, when characters were conversing over walkie-talkies. We also had a tinnitus "ear ring" effect that would obscure the other sounds while playing the ring, to give that "you've almost been killed by a grenade" feeling.
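The interview does not say how the radio futz was built. A common way to get a walkie-talkie sound is to band-limit the voice and then overdrive it, and the sketch below does just that with two one-pole filters and a hard clip. The function name `radio_futz`, the 300 Hz to 3 kHz band, and the `drive` amount are illustrative assumptions, not details from Uncharted.

```python
import math


def radio_futz(samples, sample_rate=48000, low_hz=300.0, high_hz=3000.0, drive=4.0):
    """Band-limit and hard-clip a mono signal of floats in [-1, 1]."""
    # One-pole smoothing coefficients for the two cutoff frequencies.
    lp_coeff = 1.0 - math.exp(-2.0 * math.pi * high_hz / sample_rate)
    hp_coeff = 1.0 - math.exp(-2.0 * math.pi * low_hz / sample_rate)
    lp_state = 0.0  # low-passed signal (highs removed)
    hp_state = 0.0  # slow tracker of the low-frequency content
    out = []
    for x in samples:
        lp_state += lp_coeff * (x - lp_state)
        hp_state += hp_coeff * (lp_state - hp_state)
        band = lp_state - hp_state  # subtracting the lows acts as a high-pass
        out.append(max(-1.0, min(1.0, band * drive)))  # overdrive, then clip
    return out
```

Feeding it a constant (DC) signal shows the high-pass half at work: the output decays toward zero, while a 1 kHz tone inside the band passes through and clips.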
There was a fairly subtle ducking system we used to get voices to play well over effects in certain extremely loud situations (e.g. multiple massive explosions); this was dynamic, based on the current RMS power level. There were also a fairly large number of unique reverbs for the different environments.
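As a rough picture of RMS-driven ducking, the sketch below measures the RMS level of a block of effects audio and derives a gain that pulls the effects down only once they get very loud. The interview does not give the actual rules, so the names `rms`, `duck_gain` and `smooth`, the threshold, the maximum cut, and the choice to measure the effects bus are all assumptions.

```python
import math


def rms(block):
    """Root-mean-square level of a block of float samples."""
    if not block:
        return 0.0
    return math.sqrt(sum(s * s for s in block) / len(block))


def duck_gain(effects_rms, threshold=0.5, max_cut=0.5):
    """Gain to apply to the effects bus while dialog is playing.

    Below the threshold the effects are untouched; above it the gain
    falls linearly, reaching (1 - max_cut) at full-scale RMS.
    """
    if effects_rms <= threshold:
        return 1.0
    over = min(1.0, (effects_rms - threshold) / (1.0 - threshold))
    return 1.0 - max_cut * over


def smooth(previous_gain, target_gain, coeff=0.2):
    """Move a fraction of the way toward the target so the duck is subtle."""
    return previous_gain + coeff * (target_gain - previous_gain)
```

Per audio block, a mixer could call `smooth(current, duck_gain(rms(effects_block)))` and multiply the effects bus by the result, so the gain eases in and out rather than jumping.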
What was Uncharted's audio memory limit?
JL: The base audio memory budget was about 24 MB; this included sound effect data, reverb buffers, and audio metadata. A few megabytes of additional memory were also required for streaming.
Was there any noticeable hit for decompression on ATRAC files?
JL: We did not use ATRAC, so the answer would be "no". We used a slightly modified version of the PS3's VAG codec. This worked well for several reasons. Decompression of this codec is almost free using the Cell SPU; we could decompress hundreds of these with no impact on the game's frame rate, and we never came anywhere near that in practice.
Another reason is that because we were caching all the streams on the hard drive, and because the Blu-ray disc is so large, we didn't have to compromise space versus performance. This means that our streams were relatively uncompressed, with no psychoacoustic artifacts, using a high sampling rate of around 48 kHz. As a result, the fidelity of the resulting streaming audio was exceptionally good.
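For a sense of scale, the arithmetic below estimates per-stream data rates at 48 kHz. It assumes the standard VAG ADPCM layout, in which each 16-byte block decodes to 28 samples; Uncharted used a modified version of the codec, so its real figures may differ. The function names are hypothetical.

```python
VAG_BLOCK_BYTES = 16    # assumed: standard VAG ADPCM block size
VAG_BLOCK_SAMPLES = 28  # assumed: samples decoded from each block


def adpcm_bytes_per_second(sample_rate, channels=1):
    """Approximate data rate of a VAG-style ADPCM stream."""
    return sample_rate * channels * VAG_BLOCK_BYTES / VAG_BLOCK_SAMPLES


def pcm16_bytes_per_second(sample_rate, channels=1):
    """Data rate of uncompressed 16-bit PCM, for comparison."""
    return sample_rate * channels * 2


mono = adpcm_bytes_per_second(48000)        # about 27.4 KB/s per channel
music = adpcm_bytes_per_second(48000, 6)    # one 6-channel music stream
ratio = pcm16_bytes_per_second(48000) / mono  # compression of about 3.5:1
```

Under these assumptions, even a 6-channel interactive music stream needs only about 165 KB/s, a modest load for streams cached on a hard drive.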
What was the biggest timesaver for the audio team in terms of tools and / or process?
JL: Without a doubt, the biggest timesaver was our technology that allows us to edit and reload all audio metadata and sounds on-the-fly during development, while the game is running. Any audio tweaks could be made almost instantaneously, usually without restarting the game. The ability to iterate as quickly as possible is undoubtedly the most important feature of our process.