Special Section
By Mark Miller
Gamasutra
November 2, 1999

New Tools and Technologies

As powerful and necessary as standards can be, the cutting edge of technology is driven primarily by the innovation of the leading hardware, software, and tools vendors. This section focuses on the emerging technologies and standards that have been identified thus far, such as DLS, 3D audio and interactive composition, with a long-range view into their future. It also introduces new technologies that have barely reached the market but will probably have a major impact in the future, such as physical modeling synthesis, interactive audio on the Internet, and MPEG-4.

As seen in the preceding sections, the individual components that make up the interactive audio platform (the MIDI renderer, 3D audio renderers, and so on) have improved in concept, been accelerated in hardware, and now have standardized APIs to support them. Still, these components are chronically underutilized. Unfortunately, being creative with today’s most advanced audio technologies still involves a great deal of programmer time and effort. This lack of user-level tools translates into the inability of the most motivated person on the team, the sound designer, to do the work required to take advantage of these new and powerful resources.

There are a number of solutions to this problem. One is to assign a motivated, full-time programmer to work closely with the sound designer. This is a great solution but hard to justify for many small to mid-sized teams. Another solution is to hire only sound designers who are able to program the tools that they need. In some cases, this works well, but it severely limits the scope of creative talent that might otherwise be available for a given project. Lastly, and most interestingly, better tools could be created. As we look at new technologies, special attention will be paid to those that are accompanied by authoring tools that put the power into the hands of the sound designer.

DLS, DLS 2, and MPEG-4

Now that all of the pieces necessary to move DLS 1.0 into mainstream use are finally in place, it is time to look forward to DLS 2.0. (If you are not familiar with the DLS 1.0 architecture, check out http://www.midi.org/dls/dlsoview.htm for an excellent overview.) DLS 2.0 is an extension of the DLS 1.0 standard that combines the best features of DLS 1.0 and Creative Labs’ SoundFonts 2.0 specification. It will be both backward- and forward-compatible. The main differences between DLS 1.0 and DLS 2.0 can be summarized as follows:

• The minimum amount of memory has been increased to 1024KB of 16-bit samples.

• A two-pole resonant low-pass filter has been added.

• An additional low-frequency oscillator (LFO) for pitch modulation (vibrato) has been added.

• Two new envelope stages have been added to the original Attack, Decay, Sustain, Release (ADSR) envelope. The first is a "delay" segment before the onset of the initial attack, and the second is a "hold" segment between the attack and the decay phase.

• A number of new "connections" (between modules) have been added to support the new modules and features.

• DLS 1.0 made a distinction between "melodic" and "drum" instruments. Melodic instruments could have up to 16 regions or samples spread across the keyboard (by note range). All of the regions, however, were modified by a single set of articulation data (containing settings for the LFO, the two discrete ADSR envelope generators, and the several available MIDI controller inputs). Drum instruments, on the other hand, could have up to 128 regions, and each one could have its own unique articulation data. In DLS 2.0 this distinction is removed; while instruments are still identified as "drum" or "melodic" instruments, either type may now have up to 128 regions that either share one set of articulation data or have unique sets of articulation data.

• Reverb and chorus sends have been added. (Unfortunately, as has been the case since the dawning of the GM specification, the exact behavior of reverb and chorus effects is still undefined.)
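To make the expanded envelope concrete, here is a minimal Python sketch of a six-stage envelope (delay, attack, hold, decay, sustain, release). The function name, its parameters, and the straight-line segment shapes are assumptions made for illustration; they are not taken from the DLS 2.0 specification, which defines its own units and curves.

```python
def envelope_level(t, gate_time, delay, attack, hold, decay, sustain, release):
    """Return the envelope level (0.0 to 1.0) at t seconds after note-on.

    gate_time is the moment the note is released. All times are in seconds.
    Segments are linear here; that is a simplification of this sketch.
    """
    def held_level(x):
        # Level while the key is still down, x seconds after note-on.
        if x < delay:
            return 0.0                      # "delay": silent before the attack
        x -= delay
        if x < attack:
            return x / attack               # attack: ramp up to full level
        x -= attack
        if x < hold:
            return 1.0                      # "hold": stay at full level
        x -= hold
        if x < decay:
            return 1.0 - (1.0 - sustain) * (x / decay)  # decay toward sustain
        return sustain                      # sustain until note-off

    if t < gate_time:
        return held_level(t)
    # Release: fall linearly to zero from wherever the envelope was at note-off.
    start = held_level(gate_time)
    elapsed = t - gate_time
    if release <= 0 or elapsed >= release:
        return 0.0
    return start * (1.0 - elapsed / release)
```

Setting delay and hold to zero reduces this to a plain ADSR envelope, which is one way to see how the two new stages extend the older design.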

Given these enhancements, it is clear that DLS 2.0 defines a genuinely useful format for creating custom wavetable sounds. From another perspective, though, PCM sounds are an imperfect solution to many sound-design problems in games. Sampled sounds, for example, can be viewed as very short PCM audio tracks. As such, they inherit many problems from their larger cousins. For example, most of what is interesting sonically in a PCM sound is prerendered within the actual data. At run time, there is not much you can do to change it. The sound can be filtered, made louder or softer, shifted in pitch, or post-processed with digital signal processing effects (such as reverb). But adding more complex modifications interactively, like a "growl" to a saxophone or additional debris to an explosion, requires that new sample data encapsulating the modification be added to the sound bank. In other words, as the desired run-time complexity of the sound’s behavior increases, so does its size. This is a serious problem in memory- and bandwidth-limited environments.
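A rough back-of-the-envelope calculation shows how quickly prerendered variations consume a sample bank. The sketch below uses the 1024 KB, 16-bit minimum quoted in the list above; the 22,050 Hz mono sample rate, the two-second variation length, and the function names are assumptions chosen only for illustration.

```python
BANK_BYTES = 1024 * 1024   # 1024 KB minimum sample memory, from the DLS 2.0 list
BYTES_PER_SAMPLE = 2       # 16-bit samples
SAMPLE_RATE_HZ = 22050     # assumed mono sample rate, not from the article


def seconds_of_audio(bank_bytes=BANK_BYTES):
    """Total playing time that fits in the bank at the assumed rate."""
    return bank_bytes / (BYTES_PER_SAMPLE * SAMPLE_RATE_HZ)


def variations_that_fit(variation_seconds, bank_bytes=BANK_BYTES):
    """How many prerendered variations of a given length fit in the bank."""
    bytes_per_variation = int(variation_seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
    return bank_bytes // bytes_per_variation


print(round(seconds_of_audio(), 1))   # about 23.8 seconds in total
print(variations_that_fit(2.0))       # 11 two-second variations
```

Under these assumptions the whole bank holds under 24 seconds of audio, so every extra "growl" or debris layer stored as samples competes directly with every other sound in the game.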

Beyond Wavetable

One emerging solution to this problem, physical modeling synthesis, is being explored and marketed to the interactive audio community by both a small start-up company, Staccato Systems, and XG-proponent Yamaha. Physical modeling takes a radically different approach to the creation and playback of sounds. Instead of representing sounds by recording individual performances, the synthesizer replicates the physical processes by which a sound source actually creates and modifies sound waves. If done correctly, the results are truly remarkable in terms of both sound quality and flexibility.

Physical modeling has many benefits for interactive audio. For one, physically modeled sounds are much more flexible than PCM sounds. Since the sound is "modeled," it can produce more meaningful and interesting variations. Another advantage is that the data that describes physical models is generally much smaller than that required for PCM instruments. This difference is especially dramatic when you consider that the wide range of variations physical models produce is invoked parametrically in the synthesizer; no additional sample data must be downloaded.
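As a small illustration of the size argument, the Python sketch below uses the Karplus-Strong plucked-string algorithm, a well-known and very simple string model. It is not the technology used by Staccato Systems or Yamaha; the function name, parameter names, and default values are assumptions for this example only.

```python
import random


def pluck(frequency_hz, duration_s, damping=0.996, sample_rate=22050, seed=0):
    """Karplus-Strong plucked string: a noise burst circulating in a delay line."""
    rng = random.Random(seed)
    period = max(2, int(sample_rate / frequency_hz))
    # The delay line stands in for the string; the noise is the initial pluck.
    line = [rng.uniform(-1.0, 1.0) for _ in range(period)]
    out = []
    for i in range(int(duration_s * sample_rate)):
        j = i % period
        nxt = (j + 1) % period
        out.append(line[j])
        # Averaging neighbouring samples removes high frequencies on each pass,
        # and the damping factor models energy lost by the vibrating string.
        line[j] = damping * 0.5 * (line[j] + line[nxt])
    return out


# Two different notes come from changing parameters, not from loading new data.
low_note = pluck(110.0, 0.5)
bright_short_note = pluck(440.0, 0.5, damping=0.98)
```

Each note here is described by a handful of numbers, while the half-second of output it generates would occupy thousands of stored samples as PCM data.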

For example, if you "step on the gas" in a game, a physically modeled car engine revs smoothly and realistically according to the physics of a combustion engine. Creating this same smooth transformation of the sound using PCM data would be difficult at best, and in the worst case it would be impossible due to space constraints. Such real-time variations open up tremendous possibilities for interactive music and sound effects, as they can be invoked in response to player actions. The first commercial application of this technology for sound effects was seen in EA’s NASCAR Revolution, in which Staccato Systems’ Audio Rendering Technology was licensed to create ultra-realistic and responsive racing car engines. On the musical side, Yamaha is including a full complement of physically modeled musical instruments in its top-of-the-line software synthesizer, the S-YXG 100 plus VL.

There are, of course, some issues with the integration of physical modeling into the game environment. Most obviously, physical models are more taxing to the CPU than wavetable playback. They are also notoriously difficult to create. A combination of Moore’s law and, hopefully, the emergence of hardware acceleration for this technology will eventually solve the first problem. The second problem is more complicated. Figuring out how to represent air resonating inside of a trumpet is a bit more difficult than simply recording someone playing one. As a result, the first-generation users of physical-modeling synthesizers will most likely be dependent on the models supplied with the synthesizer. Beyond that, some excellent, high-level authoring tools will be required for this technology to truly take off.

Still, this is a very compelling concept, and the first commercial implementations really made an impression on me. In my opinion, physically modeled sounds will definitely make an impact on interactive audio development. Unfortunately, I don’t have enough space in this article to explore this technology and its main proponents thoroughly. But look out for a feature story on physical-modeling synthesis and audio rendering technology from Staccato and Yamaha in an upcoming issue of Game Developer. In the meantime, more information is available at http://www.staccatosys.com and http://www.yamaha-xg.com.

Another interesting development for both DLS 2.0 and physical modeling is that they have been incorporated into the recently completed MPEG-4 standard. Within MPEG, DLS 2.0 will be known as the MPEG-4 Structured Audio Sample Bank Format, and physically modeled instruments can be created using the Structured Audio Orchestra Language (SAOL). The Structured Audio portion of the MPEG-4 standard is very interesting in its own right and deserves some further examination.

Copyright © 2003 CMP Media LLC
