AGDC: A Generative, Adaptive Music System for MMO Games
Bay Area-headquartered, John Romero-headed MMO developer Slipgate Ironworks has developed a robust system for its music, which audio director Kurt Larson considers the only viable way forward for MMO soundtracks. Larson presented the system alongside programmer Chris Mayer and composer Jim Hedges.
The presentation began with Larson presenting rough financials on traditional music creation for games. At $1,500 per minute, scoring every minute of a typical single-player-focused, 30-hour game would cost $2,700,000. "$1,500 is not an unusual figure," says Larson, before pointing out that his EverQuest character has logged 7,000 hours over its play life -- which he jokingly noted would translate to $630,000,000 at that rate.
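The scaling behind Larson's joke is simple multiplication. A minimal sketch follows; only the $1,500-per-minute rate comes from the talk, while the function name and the one-hour example are illustrative.

```python
RATE_PER_MINUTE = 1_500  # dollars per finished minute, the rate quoted in the talk


def linear_score_cost(hours, rate_per_minute=RATE_PER_MINUTE):
    """Cost of scoring every minute of play time with unique, linear music."""
    return hours * 60 * rate_per_minute


# Illustrative only: a single hour of never-repeating linear music.
print(linear_score_cost(1))
```

Because the cost grows linearly with hours played, any game whose play time is open-ended cannot be covered this way.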
Instead of generating completed music, Larson proposes using "supplied composer sounds to generate music, completely unique. Think of it as wind chimes. Recognizable, familiar, non-repeating."
While he admits that the system is unsuitable for generating a "highly-structured intensely-composed warfare game opening," he believes it to be an extremely effective solution for ambient background music. "Like the wind chimes, it creates a mood, and supports the emotional experience you want the player to have. This is the best way to go about music in a massively multiplayer game. This is my message."
Generative Adaptive Music System
The system is called the Generative Adaptive Music System, or GAMS. With the efforts of one programmer (Mayer) and Larson, along with patient and enthusiastic contract composers who were willing to put up with a system that required debugging, the system was built quickly and cheaply. Says Larson, "I also wanted to come here with the good news that this was really, really easy. One programmer, one audio geek, and a few weeks of less than 100% of our time over the year to build a system."
But how does it work? Larson says, "It starts with the composer, of course, once the technology and tools are built. The composer gives information about how the sounds should be triggered, and he gives us the sounds." The developers then cross-reference the sounds with links between location, combat, weather, and other important information. Though it did not sound as if this system would be implemented in the game, even something like an "aesthetic value" for the character's armor could factor in. "Then we give it some high level scripting -- like the player's been here for 20 minutes, so why don't you just fade out?" The music that results from this process should be "the voice of the world".
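One way to picture that cross-referencing is as a table of tags per sound, filtered against the current game state, with a scripted rule layered on top. The sketch below is an assumption about how such a lookup could be built, not Slipgate's implementation: the sound names, the tag table, and the five-minute fade length are all hypothetical. Only the "player's been here for 20 minutes" rule comes from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameState:
    location: str
    in_combat: bool
    weather: str
    minutes_in_zone: float


# Hypothetical tags: each sound lists the state values it is allowed to play under.
SOUND_TAGS = {
    "pad_warm": {"location": {"meadow", "rain_zone"}},
    "guitar_wet": {"weather": {"rain"}},
    "war_drum": {"in_combat": {True}},
}


def eligible_sounds(state, sound_tags=SOUND_TAGS):
    """Return the sounds whose every condition matches the current state."""
    matches = []
    for name, conditions in sound_tags.items():
        if all(getattr(state, key) in allowed for key, allowed in conditions.items()):
            matches.append(name)
    return sorted(matches)


def master_volume(state, fade_start=20.0, fade_length=5.0):
    """High-level script rule: fade linearly to silence after fade_start minutes."""
    if state.minutes_in_zone <= fade_start:
        return 1.0
    return max(0.0, 1.0 - (state.minutes_in_zone - fade_start) / fade_length)


state = GameState("rain_zone", in_combat=False, weather="rain", minutes_in_zone=22.5)
print(eligible_sounds(state), master_volume(state))
```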
Larson then displayed a theoretical zone set, starting with an opening zone -- zones here represented by simple fields of color. The music that played was ambient, as expected -- but cohesive and atmospheric. "What you're hearing will never be heard again. Every time you start the game you'll hear the recognizable music you can hear now, but it will not repeat." The way the tool is currently built, the exact same music would be expected to recur only once in 280 years.
Larson doesn't see it as just a way to fill the background with pleasant tunes. "While you can do extreme changes, what I'm most excited about is the subtle changes." The demo moved to a rainy zone, in which the music is based on the same patterns but brings in new ones to reflect that aesthetic. "That guitar in the yellow [rain] zone never plays in the green [beginning] zone. Now, say I go from my rainy zone to a cave that I know to be dangerous -- then I can add in new sounds. And you can go all the way up to entering combat."
The music, of course, reflected these changes -- with intense drums and sharper sounds dropping in as the zone location changed. Says Larson, "Another advantage is that your transitions are guaranteed to be perfectly smooth. We don't stop the sounds that are being played -- any sound that was playing is allowed to finish, but we don't call them up again."
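The transition rule Larson describes (let playing sounds finish, but only trigger new ones from the new zone's pool) can be sketched in a few lines. This is a simplified model under stated assumptions: the class name, zone names, sound names, and fixed durations are hypothetical, and a real system would hand playback to the audio engine rather than count down seconds.

```python
import random


class ZoneMixer:
    """Tracks playing voices; a zone change only alters what gets triggered next."""

    def __init__(self, pools, rng=None):
        self.pools = pools              # zone name -> list of sound names
        self.rng = rng or random.Random()
        self.zone = None
        self.playing = []               # [sound name, seconds remaining] pairs

    def enter_zone(self, zone):
        # Nothing is stopped here: voices already playing are left to finish.
        self.zone = zone

    def trigger(self, duration):
        # New sounds are drawn only from the current zone's pool.
        name = self.rng.choice(self.pools[self.zone])
        self.playing.append([name, duration])
        return name

    def advance(self, seconds):
        # Age every voice and drop the ones that have played out.
        for voice in self.playing:
            voice[1] -= seconds
        self.playing = [v for v in self.playing if v[1] > 0]


mixer = ZoneMixer({"green": ["flute"], "cave": ["low_drum"]})
mixer.enter_zone("green")
mixer.trigger(4.0)
mixer.enter_zone("cave")       # the flute keeps ringing
mixer.trigger(4.0)             # but only cave sounds are started from now on
print([name for name, _ in mixer.playing])
```

Since no voice is ever cut off, the old zone's material decays naturally while the new zone's material fades in by being triggered.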
The Ease of Development
At this point, the presentation was traded off to programmer Chris Mayer, who asked the question "Why was this so easy?" His answer: "In trying to come up with the few things that would rise to the top, it surprised me that it wasn't really programming processes that made this so easy. The important thing is good communication." Crucially, Larson can read and modify Python code, "and by the same token, I did minor in music. Can you get a programmer to implement this who can't play keyboards? Yeah, but I don't think it will work as well."
Pulling apart the way the system works, Mayer notes, "The high-level stuff is very game-specific to our game, and would not be reusable. Everything else that actually decides what mood it's going to be playing and how it plays back, would be very reusable. There's also a third piece... that is the actual tool that a composer might need, who's going to be generating this data." A third of the development time was required for each piece. "The good news is that two-thirds of the work we did was completely reusable."
The system does more than marry a bank of sounds together to create vaguely random music. Timing, pan, and pitch are randomized, not just how often each sound repeats. Since fully random pitch would produce unusable music, each sound carries associated data specifying which keys it can be played in, to ensure a good result.
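A minimal sketch of that constrained randomization, assuming the key data takes the form of a per-sound list of permitted transpositions. The table, sound names, and gap values are hypothetical; the talk did not show the actual data format.

```python
import random

# Hypothetical metadata: semitone shifts considered musically safe, per sound and key.
ALLOWED_TRANSPOSE = {
    "chime": {"C": [0, 4, 7], "Am": [0, 3, 7]},
    "pad": {"C": [0], "Am": [0, -3]},
}


def next_event(sound, key, rng, min_gap=0.5, max_gap=4.0):
    """Randomize timing and pan freely, but pick pitch only from the allowed set."""
    semitones = rng.choice(ALLOWED_TRANSPOSE[sound][key])
    return {
        "sound": sound,
        "delay": rng.uniform(min_gap, max_gap),  # seconds until the sound starts
        "pan": rng.uniform(-1.0, 1.0),           # -1 is hard left, +1 is hard right
        "semitones": semitones,
        "pitch_ratio": 2 ** (semitones / 12),    # equal-temperament playback rate
    }


rng = random.Random()
print(next_event("chime", "C", rng))
```

Timing and pan are drawn from continuous ranges, while pitch is drawn from a short whitelist, which is what keeps the output from sounding arbitrary.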
"The next thing that made this simple, is that it's really simple. The design was so simple at first that I was almost convinced that this would not work." But it could be even easier, Mayer suggests. Their middleware solution used Python, a different language than their tools group is using -- C# -- and "mixing doesn't work that well."
Demoing the Tool
The presentation was turned back over to Larson, who said, "If anybody else wants to build something like this, you're going to think very carefully about who's going to compose on this. You need to find somebody who's into this stuff -- somebody who's into non-linear music, and who's technical. I wouldn't feel comfortable throwing an essentially unfinished tool at [someone] who didn't feel comfortable with that situation."
Noting that "anybody can use this system by writing a bunch of XML", Larson explained that to make it even easier to use, "we built this simple tool. All's you gotta do is open an FMOD [music middleware] sound bank, and now this dropdown will be populated with all of the events in that sound bank, and now you've got the music."
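The talk did not show the XML itself, so the schema below is purely hypothetical: a mood element listing sound-bank events with a minimum and maximum gap between triggers. It is meant only to illustrate what "writing a bunch of XML" might involve, using Python's standard xml.etree.ElementTree parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema; element and attribute names are invented for illustration.
EXAMPLE_XML = """
<mood name="rain">
  <sound event="guitar_wet" min_gap="2.0" max_gap="8.0" />
  <sound event="pad_warm" min_gap="5.0" max_gap="12.0" />
</mood>
"""


def load_mood(xml_text):
    """Parse one mood definition into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    sounds = [
        {
            "event": el.get("event"),
            "min_gap": float(el.get("min_gap")),
            "max_gap": float(el.get("max_gap")),
        }
        for el in root.findall("sound")
    ]
    return {"name": root.get("name"), "sounds": sounds}


print(load_mood(EXAMPLE_XML))
```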
Demoing the process, he dropped in samples and created a workable, if not particularly pleasant, piece of music in a few moments. "You see how long I worked on this -- maybe six seconds -- it's not pretty, but it's playing music. I wouldn't be comfortable submitting this for a Grammy, but it's as good as the music I've heard in some games." The important effort, then, is the audio design -- the composition of the sounds up front.
Noting that the recently-released Spore also includes adaptive music, Larson quipped, "The main difference between Spore and this is that theirs is vastly more ambitious than ours, and that their music is a completely different type of music than ours. They had 12 people, six years, and Brian Eno. We had two people."
A Composer's View
Larson turned the presentation over to independent composer Jim Hedges, who collaborated on the project and created the samples (and thus the music) used in the presentation -- and in the final game, when that's released.
Hedges began by asking, "What makes [this style] different from traditional styles of composition? This is different than traditional game music, but I would add this is different than [other] adaptive game music -- those are based mostly still on a linear form. This is really completely different... because we have control of all of the relationship of these events in time, and the vertical dimension."
This tool is not just about generating a lot of music cheaply -- there's a philosophy behind it, too. "We don't want the music to tell the player how to feel -- we want to create a sound canvas that hopefully supports how the player is feeling."
In the system, "Musical elements which are used to produce a traditional melody are separated from each other." Certain elements are relatively fixed and other elements are relatively free -- it's about how the music is structured. For example, the time signature of the music was part of the recorded source samples. "You're trying to produce generative music that doesn't go completely off the chart."
Demoing his compositional process, Hedges showed the sheet music for several drum samples used in the game; each has a main sample and several variations, all with the same basic pattern and time signature, which are then layered to create the music. He assigned all the drums to the same basic rhythm. Hedges says, "Each drum occurs just once on its assigned beat -- it's always playing the main beat and that's always accented, but you get additional beats. These each correspond with a one-measure wav file I created."
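Read as an algorithm, each measure picks one one-measure file per drum, either the main pattern or a variation that keeps the main beat and adds extra hits, and the picks are layered. The sketch below assumes that reading; the file names, drum names, and the 50% variation chance are hypothetical.

```python
import random

# Hypothetical sample bank: every file is one measure long, and each variation
# contains the same accented main beat plus additional hits.
DRUMS = {
    "djembe": {"main": "djembe_main.wav", "variations": ["djembe_v1.wav", "djembe_v2.wav"]},
    "frame_drum": {"main": "frame_main.wav", "variations": ["frame_v1.wav"]},
}


def build_measure(drums, rng, variation_chance=0.5):
    """Choose one one-measure file per drum; the results are played together."""
    layers = {}
    for name, files in drums.items():
        if files["variations"] and rng.random() < variation_chance:
            layers[name] = rng.choice(files["variations"])
        else:
            layers[name] = files["main"]
    return layers


rng = random.Random()
for _ in range(4):
    print(build_measure(DRUMS, rng))
```

Because every file shares the same length, tempo, and time signature, any combination lines up rhythmically; only the ornamentation changes from measure to measure.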
To further explore the depth of sound he could create, Hedges took a cymbal sound and, though he at first considered simply playing it periodically, he changed his mind, got more ambitious, and pulled out different slices at different frequencies. "Every time the event is triggered, you get a new combination" of the sound, which sometimes is complementary, sometimes less so -- but it's always different.
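A sketch of the idea, assuming the cymbal has been split into four frequency-band files (the count and file names are hypothetical). Each trigger plays a random non-empty subset of the slices together; with n slices there are 2^n - 1 such subsets, so four slices already give 15 distinct timbres.

```python
import random

# Hypothetical frequency-band slices cut from a single cymbal recording.
SLICES = ["cymbal_low.wav", "cymbal_mid.wav", "cymbal_high.wav", "cymbal_air.wav"]


def pick_slices(slices, rng):
    """Choose a random non-empty subset of slices to play on this trigger."""
    count = rng.randint(1, len(slices))      # how many slices to layer this time
    return sorted(rng.sample(slices, count))  # sample() never repeats a slice


rng = random.Random()
for _ in range(3):
    print(pick_slices(SLICES, rng))
```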
Hedges then showed how the combat music was slowly built up over his compositional process. "This is three different drum sequence patterns that are cut up into different groups. They're all being played at the same tempo and being triggered on eighth notes or quarter notes. They're playing very consistently."
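Triggering on eighth or quarter notes at a shared tempo amounts to snapping every event to a common time grid. A minimal sketch follows, assuming 4/4 time and an arbitrary tempo, neither of which was stated in the talk; the function name is illustrative.

```python
def grid_times(bpm, measures, subdivision, beats_per_measure=4):
    """Return trigger times in seconds.

    subdivision=1 places triggers on quarter notes, subdivision=2 on eighth notes.
    """
    seconds_per_beat = 60.0 / bpm
    steps = measures * beats_per_measure * subdivision
    return [step * seconds_per_beat / subdivision for step in range(steps)]


# One measure at an assumed 120 BPM: four quarter-note slots, eight eighth-note slots.
print(grid_times(120, 1, 1))
print(grid_times(120, 1, 2))
```

Every quarter-note slot is also an eighth-note slot, so patterns triggered on either grid stay locked together.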
The first demo, with a few layers, only had the tribal beats, but for the second he added in some more orchestral percussion, for contrast: "You get this more wild kind of thing going on." The final version, with all sounds, has more instrumentation -- it's still very percussive, but there are orchestra hits to add an accent. "I had to decide which sounds were kind of going to sit well together."
Larson returned to the fore to discuss pitfalls of the process -- which have not occurred yet, but he can easily imagine. "I keep worrying that my composers are going to come back to me and say, 'You changed my music with your crud, and that's my name.'" A potential legal issue is that the developer needs to take possession of the original wav files from the composers -- "And you know what kind of legal problems we're running into with that."
Moreover, addressing audio professionals at developers, he believes, "You're going to have trouble selling this to people who are trying to ship the project on time... you're going to have to plan for that, and evangelize on your team." But Larson thinks that it's merely a matter of how you pitch. "I think we get less success as audio people saying, 'This is going to be so cool, give me time, money, and people.' We have more success saying what problems we solve, and I am killing the problems of [players] turning off your music, and killing your audio budget." Prior to joining Slipgate, Larson polled his EverQuest guild -- 100 people -- and discovered that 94% had turned the game's music off. "That's the problem I'm trying to solve."
Questions and Answers
One composer in the audience asked how Hedges billed for the project, clearly worried that it would sideline the income of composers. The answer? "Man hours. This first round of assets bore it out -- it was relatively the same amount of time, a comparable amount to linear media that I was paid." This was in line with Larson's expectations.
Another audience member asked about the potential impact on ancillary sales of music-based products, like soundtracks. "You just record it. It's really that easy," suggests Larson. Meanwhile, Hedges "would be happy" to create definitive versions of tracks that would be played in a linear fashion, if that were required, for a soundtrack project. Larson also notes "you should make a recording for your own use" as a composer -- for reference and for portfolio purposes.