Interactive Music: Merging Quality with Effectiveness
March 27, 1998
Video game music that changes depending on the player's actions or surroundings, known also as "interactive music" to the hip marketing people, isn't a new phenomenon. Back in the 1980s a game called Popeye had a frantic theme play when hearts that the muscle-bound cartoon character was supposed to catch would slowly slide into the water at the bottom of the screen. More recently the computer hit System Shock by Origin Systems had theme music for different areas in each level that fit each area like a glove, enhancing the experience far more than a single theme for each level. Now the question on most composers' minds isn't just how to make game music interactive, but how to make it more realistic and, most of all, exciting. For the sake of both layman and veteran, the purpose of this article is to lay down what is being done in the industry on both the hardware and game development fronts, and why the mass confusion developers and hardware suppliers seem to be experiencing is real. Developers want to create interactive music, but how, what systems are the best to use, and with what hardware? We'll try to answer these questions as well.
Computer Music Authoring Systems/Tools
For developers who know what they're dealing with, the first question to address is "what systems are there to use?" The three most widely used ways of composing game music are the General MIDI / XG / onboard wavetable synth format, which uses the built-in synthesizers on soundcards; Redbook audio, which simply streams recorded music from a CD through a digital chipset; and digital modules, aka "MODs", which take custom-built samples and play them back through a chipset similar to the one Redbook uses. There are advantages and disadvantages to all three formats, especially when used interactively.
MIDI / Soundcard Synth
Using the onboard synth of a soundcard leaves an application more memory to allocate to visual elements such as 3D processing, script manipulation, and special effects. Since samples are stored on the soundcard itself, all that is needed is information to control the samples, sometimes not exceeding 20k per music file. Onboard synths, which are controlled via an external controller or anything else the user wishes to use to input the music during the composition process, can also fade tracks in and out and even change the music itself interactively. The files can be created in many different programs, from sequencers like "CakeWalk Pro" by Twelve Tone Systems to a simple freeware MIDI sequencer.
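To make the size argument concrete, here is a minimal sketch of the kind of control data an onboard synth receives. The status bytes (0x90 for note on, 0x80 for note off, 0xB0 for control change, controller 7 for channel volume) come from the MIDI 1.0 specification; the function names are hypothetical, and the sketch builds raw messages only, not a complete Standard MIDI File with timing information.

```python
def note_on(channel, note, velocity):
    """Three-byte MIDI note-on message for a channel (0-15)."""
    return bytes([0x90 | channel, note, velocity])


def note_off(channel, note):
    """Three-byte MIDI note-off message."""
    return bytes([0x80 | channel, note, 0])


def channel_volume(channel, level):
    """Control change 7 (channel volume), level 0-127."""
    return bytes([0xB0 | channel, 7, level])


def fade_out(channel, steps=8):
    """Hypothetical helper: a linear fade expressed as volume messages."""
    levels = [round(127 * (steps - 1 - i) / (steps - 1)) for i in range(steps)]
    return b"".join(channel_volume(channel, lv) for lv in levels)


# A four-note phrase and a fade on channel 0: a few dozen bytes in total,
# because the samples themselves live on the soundcard.
phrase = b"".join(note_on(0, n, 100) + note_off(0, n) for n in (60, 64, 67, 72))
fade = fade_out(0)
print(len(phrase), len(fade))
```

Each message is three bytes, which is why a whole soundtrack of note and controller events can stay in the tens of kilobytes.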
The biggest drawback to this method is that onboard synths in soundcards such as the "SoundBlaster AWE64" by Creative Labs are essentially scaled-down versions of real synthesizers, and don't provide grade-A music quality. For instance, if you took a high-end synthesizer such as a Korg M1, with a cost of over $1000, and brought the price down to less than $100, you'd have a pretty weak synth. Redbook audio musicians use tens of thousands of dollars (or more) worth of instruments and equipment, so the quality of music coming from something at a fraction of that cost will be relatively lower. Just as an example, the latest onboard synths do not feature extensive effects sections with more than 3 or 4 effects groups such as echo, reverb, distortion, etc. The samples themselves are smaller than those found on professional synthesizers and tone modules, giving them the distinctive mediocre, flat sound that we all hear played on thousands of personal web pages. That isn't to say that consumer cards such as the AWE64 aren't excellent cards in many other ways; they just can't compete with a professional orchestra in terms of sound quality and realism. Creative Labs is trying to answer this problem with "SoundFont" technology, which allows users to create their own samples, but the soundbank (group of instruments) size is limited to 512k for owners of the AWE64 Value Edition. Most professional synths come with at least 8-12 megabytes of sample information. Yamaha is countering this with their "SW Waveforce" series of cards, and we'll discuss the features of these new cards shortly.
Redbook audio, on the other hand, can be a professional orchestra, a group of synthesizers, or an alternative band. Any music at all can be recorded and streamed directly into the game, which gives it the quality of a movie soundtrack. However, the music itself cannot change (obviously, it's been recorded), and tracks cannot overlap or be played simultaneously as they can with wavetable synthesis.
The compromise between the high quality of Redbook and the small size and flexibility of onboard synth development is digital modules (MODs). Many composers have cast a disapproving eye on MOD music, since methods of composing MODs are complicated and extremely "non user friendly." While an onboard synth composer might use his MIDI (musical instrument digital interface) keyboard to play and sequence tracks, the MOD composer uses a "tracker," which is a piano roll, computer keyboard input based system of composition. Once the initial learning curve is overcome, however, the power of MOD, or "tracked" music, becomes instantly appealing.
Digital Modules (MODs)
As with Redbook audio, composers have no boundaries with MOD music. They can create any sample they wish from a WAV or any other Pulse Code Modulation based file and control it much the same way that high-end artists use a sequencing program such as "Pro Tools" by Digidesign. Put simply, it's like taking an MPEG file, splitting it up into its component instruments, and then putting it back together again. Once the composer loads in the instruments and the playback information, they are all stored in a package less than 2 megs in size, which costs some processing time to play back but also allows for MIDI-like control over the music events and instrument effects. Tracks can be individually faded, blended, and manipulated in much the same way as MIDI files using an onboard synth, but without the restrictions. Effects can be achieved easily either by applying them to samples before they are inserted, or in the tracker itself with a number of specific commands.
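The idea of a module as "samples plus playback information" can be sketched roughly as follows. This is a simplified illustration, not the layout of any real MOD or Impulse Tracker file: the `Sample`, `Track`, and `mix` names are hypothetical, samples are plain lists of floats, and real trackers also resample each note to change its pitch and store notes in patterns with effect columns, all of which is left out here.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    name: str
    pcm: list  # PCM data as floats in the range -1.0 to 1.0


@dataclass
class Track:
    sample: Sample
    triggers: list     # frame positions at which the sample starts playing
    gain: float = 1.0  # per-track volume, the hook for fading a single track


def mix(tracks, length):
    """Sum every triggered sample into one output buffer (software mixing)."""
    out = [0.0] * length
    for track in tracks:
        for start in track.triggers:
            for i, value in enumerate(track.sample.pcm):
                pos = start + i
                if pos >= length:
                    break
                out[pos] += value * track.gain
    return out


kick = Sample("kick", [1.0, 0.5])
bass = Sample("bass", [0.25, 0.25, 0.25])
tracks = [Track(kick, [0, 4]), Track(bass, [2], gain=0.5)]
print(mix(tracks, 6))
```

Because every track keeps its own gain, fading or muting one instrument is a matter of changing a single number, while the mixing loop is where the extra processing cost comes from.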
Currently the biggest hindrance of MODs, as mentioned before, is the difficulty in actually writing the music using "tracker" programs such as Impulse Tracker or ScreamTracker (both freeware, incidentally). Impulse Tracker does have MIDI control; however, the lack of quantization and the manual editing of step-time notes can be difficult in such an exact program. Some composers have mastered such tracker programs and produced instrumental and even vocal music to rival the best of Redbook. However, the task takes years of practice, years that budding conventional composers don't have. But wait, there is a solution on the horizon. Carlo Vogelsang and his team at Digital Dreams Multimedia, programmers of the MOD-playback engine "Galaxy" (currently in use by Epic Megagames), are developing a sequencer that gives developers the same control as in wavetable synth composition along with the flexibility of MOD sample creation and manipulation. Composers will be able to create or load any sample they wish and sequence it in "Cakewalk / Logic Audio / Cubase" fashion. Samples will be compressed and stored so that processor speed in games or other applications isn't sacrificed, and the resulting music will potentially be as expressive and impressive as commercial Redbook audio.
Using MOD music (specifically Impulse Tracker files) in Unreal has given my team the ability to be as creative as they want to be when writing the tracks while still keeping the sound quality close to that of CD music. Not only that, but the level designers are also very pleased and impressed with our ability to use MODs to jump from track to track or fade from segment to segment, giving a very subtle transition. For instance, if a level designer wanted a fast-paced action segment to jump in right at a specific point, fade quietly out, or even switch to another area, we could write that into the music very easily, since the Unreal Engine provides the level designers with transition commands in their level editor. The results, from what we have heard, are very effective, more so than MIDI and Redbook.
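The two transition types described here, a hard jump and a gradual fade, can be sketched as per-tick gain pairs applied to the outgoing and incoming segments. This is only an illustration of the idea; the function names and the "jump" and "fade" labels are hypothetical and are not the Unreal level editor's actual commands.

```python
def transition_gains(kind, ticks):
    """Return (old_gain, new_gain) pairs for each tick of a transition."""
    if ticks < 2:
        raise ValueError("need at least two ticks")
    if kind == "jump":
        # The new segment takes over immediately at full volume.
        return [(0.0, 1.0)] * ticks
    if kind == "fade":
        # Linear crossfade: the old segment ramps down as the new one ramps up.
        return [(1 - i / (ticks - 1), i / (ticks - 1)) for i in range(ticks)]
    raise ValueError(f"unknown transition kind: {kind}")


def apply_transition(old_segment, new_segment, kind):
    """Blend two equal-rate level sequences using the chosen transition."""
    ticks = min(len(old_segment), len(new_segment))
    gains = transition_gains(kind, ticks)
    return [
        old_segment[i] * g_old + new_segment[i] * g_new
        for i, (g_old, g_new) in enumerate(gains)
    ]


print(apply_transition([1.0] * 5, [0.0] * 5, "fade"))
```

A real engine would apply such gains to individual tracks inside the module rather than to a finished mix, which is what makes the subtle segment-to-segment fades possible.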
MODs are by no means attractive to all computer game musicians. Some composers strongly prefer having instrument sets predefined, which is much less hassle than hand picking, tuning, and tweaking every instrument for a game soundtrack. The fact that creating Redbook audio lets composers use any means at all for composition also makes it much easier to concentrate on the music itself rather than on supporting different kinds of hardware. MIDI onboard synth, Redbook, and MOD soundtracks are all effective ways of making game music, but most agree that some sort of digital streaming technology will overtake onboard synth efforts. The evidence is the recent influx of Redbook audio and MOD music, which are becoming more popular than General MIDI and even XG and SoundFont tracks. The most advanced games being made today use the former two, notably Epic Megagames' Unreal (MODs) and Ion Storm's Daikatana (Redbook audio). Earlier hits did the same, such as Wipeout (Redbook audio), Origin Systems' Crusader series (MODs), and, long before that, the great NEC TurboDuo console game Lords of Thunder (Redbook audio). Soundcard synth technology may very soon come close to production quality (see below concerning the Yamaha "SW" series), but at the moment the consensus is that streamed digital music sounds sufficiently better in quality to be preferred by most game players.
The question that then comes to mind for the present is "how can manufacturers create effective hardware support for streaming digital music?" Here we come to something of a quandary, and a small history lesson for those who aren't already familiar with it. For over two years the Interactive Audio Special Interest Group (IA-SIG) of the MIDI Manufacturers Association has been trying to establish a standard for a "downloadable sample" technology, which in fact will not stream digital audio but use an advanced synth engine. Similar to MOD technology, this standard is slated to allow composers to create samples of any kind, provided they fit within the ROM or RAM allotted to process them, and use them on special processors on every soundcard manufactured. We will discuss their current progress with this and their "Interactive Audio Engine" shortly, but for those who have been wondering what is going on with downloadable samples, making a standard for them is a very difficult task. This is mainly because manufacturers such as Creative Labs and Yamaha, two of the largest soundcard manufacturers, have plans of their own for expanding the General MIDI (or GM for short) standard. GM was the standard that set a bank of 128 instruments on all wavetable soundcards. So getting all soundcard hardware to play the same ballgame is a hefty task in itself, and not one that hasn't been attempted before. Let's recap.
PC Computer Music History 101, From AdLib to GM
In the early days of GM, when multitimbral sound modules took center stage in high-quality computer game music and the beloved but buried AdLib was being ousted by Sound Blaster, Roland, believe it or not, had such a system of downloadable samples already in place. Their MT-32 external and later LAPC-1 internal sound modules could download a full bank of General MIDI custom-made instruments for any game. Sierra took full advantage of this for nearly all their games, including Thexder and Sorcerian. However, $400, the average retail price of the MT-32 at the time, was not in the average gamer's budget, and certainly not for sound. So the idea foundered, to be replaced with General MIDI soundtracks on inexpensive cards, which to this day have not been very successful at sounding the same on every card. Thus began the mission of "Team Fat" (composers behind the greatest of GM tracks, among them Origin's Wing Commander and Trilobyte's The 7th Guest): to help composers and manufacturers find the same sound and correct the problem of incompatible sound banks by adhering to the same "bare bones" standards for GM instrumentation. Regrettably, even the "Fat General Solution", their brave crusade to make soundcards and composers meet under one banner, has not been very successful despite their outstanding soundtracks and worldwide fame.
Since the great leap from PC Speaker to AdLib sound card in 1987, IBM PC game music has sounded "good", but not "great", and certainly not as polished or unique as 16-bit console or arcade music. It has not even been as advanced as the music on other computers, such as the Commodore "Amiga" with its 4-channel digital audio (the grandfather of present-day MOD technology) and the Macintosh with its 1-channel digital audio, which provided quick snippets of digital sound that satisfied gamers and game developers alike. Now that CD-quality commercial music is possible, people are looking for ways to expand it and use it interactively. But hardware limitations, such as the increased need for system resources for 3D processing, are creating the need for a standard that, as we have mentioned before, allows composers to create and / or download instruments and their own banks (within a certain size) and write music that sounds as close to the real deal as possible.
New Trends in Hardware
What efforts are being made to further the idea of Downloadable Samples (DLS)? Well, you've all been waiting for a standard to be proposed, and at last, one has been. The Interactive Audio Special Interest Group (IA-SIG), a spawn of the MIDI Manufacturers Association, has proposed a set of very bare-bones standards for Downloadable Samples called DLS-1, and hardware manufacturers are starting the grind.
The latest steps are being taken by Yamaha with their "SW100" and "SW200" Waveforce soundcards, which enable the developer to create or import their own samples and build their own banks of instruments. Unfortunately the DLS capabilities of the cards will only be usable under Windows 98, which is still a few months away. The SW series comes with 2 megs of RAM and 8 megs of ROM for XG instruments (all 480 of them, quite a leap from GM). There are 36 effects in 3 groups (chorus, reverb, and echo), which pack quite a punch but still pale in comparison to the 120-odd effects found in a Korg Trinity Pro synthesizer. These are the most promising cards I have seen so far that appeal to the music and sound developer as well as the sound card buyer. As far as interactive music systems are concerned, DLS will most likely be a very valid candidate for superseding GM on soundcard synths.
Creative Labs, still the reigning master of PC soundcard sales, is finalizing their "SoundFont" technology, which is a form of DLS but very scaled down due to the limitations of the small 512k RAM found on their "AWE64 Value Edition." The "AWE64 Gold" has 4 megs, and with an AWE64 Gold card the sound of "SoundFonts" in such games as Dungeon Keeper by Bullfrog is very impressive. The system itself is being developed with Creative Labs' partner, the synth giant E-mu Systems. However, the limited capacity for effects and the small instrument size make "SoundFonts" a system more likely to be used by producers of smaller multimedia applications such as Java applets. The conclusion can be drawn here that Creative Labs is an excellent manufacturer but has not yet found the pulse of the development end of multimedia music production while still maintaining appeal with the consumer.
If you aren't confused by now with the current trends in hardware development, you're doing a good job. The idea of a standard hardware playback system is difficult to imagine. To make development easier, the thing to do is make a standardized software development system that works with all hardware, and this is why the IA-SIG is trying to come up with not only Downloadable Sample Standards but also an "Adaptive Audio Engine" (AAE), the project of one of the working groups in the IA-SIG.
Current methods involve fading tracks in and out and jumping from music segment A to music segment B. This engine, which is still in the first stages of design, instead uses special scripting methods in computer programs to change the state of the music dynamically throughout an interactive application. It is still being discussed whether the music should be generated on the fly or composed in segments and then activated in various ways and combinations. Another open question is whether the listener will be disturbed by or acclimated to constant changes in the music. Thomas Dolby, of 80s pop music fame and now president of Headspace (a computer music application production company), is an active member of the IAE group and its audio engine development. But he isn't the only member of this group. The list is practically a who's who of the top end of the computer and console music industry, including Mark Miller from Crystal Dynamics, Alex Stahl of Pixar, Michael Land of LucasArts, Donald Griffin of Computer Music Consulting, and many others.
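One of the two options under discussion, music composed in segments and then activated by scripted game events, can be pictured as a small state machine. The sketch below is purely illustrative and assumes nothing about the IA-SIG design, which has not been published in this article; the `AdaptiveScore` class, the segment names, and the event names are all hypothetical.

```python
class AdaptiveScore:
    """Picks the next pre-composed segment from scripted (segment, event) rules."""

    def __init__(self, rules, start):
        self.rules = rules      # maps (current_segment, event) -> next_segment
        self.current = start
        self.history = [start]  # segments in the order they were activated

    def on_event(self, event):
        """Apply a game event; switch segments only if a rule matches."""
        nxt = self.rules.get((self.current, event))
        if nxt is not None and nxt != self.current:
            self.current = nxt
            self.history.append(nxt)
        return self.current


rules = {
    ("explore", "enemy_near"): "combat",
    ("combat", "enemy_dead"): "victory",
    ("victory", "timeout"): "explore",
}
score = AdaptiveScore(rules, "explore")
for event in ["enemy_dead", "enemy_near", "enemy_near", "enemy_dead", "timeout"]:
    score.on_event(event)
print(score.history)
```

Events with no matching rule leave the music alone, which is one simple way to limit how often the listener hears a change.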
The Future of Interactive Music
To summarize, just where is interactive music? You've heard it if you've played the Nintendo 64 smash hit Super Mario 64, with tracks that were added and faded out as the player went from area to area. You've also heard it if you've played Fade to Black, a game by Delphine Studios / Electronic Arts, where tremendous tympani rolls would sound the moment the player approached an enemy. You'll be hearing it in Unreal, where both of these music transition types take place and more. In the future, despite all the confusion and difficulty in deciding how to organize your music composition in both hardware and software, it will be much more than that. Music for games will be not only interactive but totally dynamic. Whether it is scored beforehand or generated remains to be seen, but the future is certainly going to provide some interesting answers.