Using VR audio to bring a sense of scale to pint-sized puzzler Moss
Polyarc's virtual reality puzzler Moss has been turning heads for all the right reasons.
For starters, it's a game centered around an adorable swashbuckling mouse (how can you not fall in love with that?), and it also happens to be a rich, engrossing virtual reality experience that avoids familiar pitfalls such as finicky controls and sickness-inducing scenes.
The acclaimed title won over critics by deftly combining first- and third-person gameplay to create a unique puzzle-platforming experience that's more engrossing than a game about a buccaneering rodent has any right to be.
While it's been pulling in the plaudits for a number of reasons, particular praise has been heaped upon the title's clever use of three-dimensional sound alongside a captivating score, which both serve to drive the game's narrative and imbue proceedings with a palpable sense of scale.
Intrigued by how the Polyarc team achieved such a feat, we caught up with the studio's audio director Stephen Hodde and the game's composer Jason Graves to find out how they breathed life into the virtual world Moss calls home.
Gamasutra: Did you have any expectations about how VR composition and audio would work before you started, and how did they stack up to the real thing?
Stephen Hodde: I had an expectation that VR development was going to be vastly different from conventional games. And it is, to a degree, but it’s an additional layer of consideration, not a fundamental re-ordering of priorities. Whereas non-VR game audio is a mix between interactive and film conventions, VR sometimes requires audio to work more like hearing does in the real world.
I'm curious as to how the shift to VR affected your creative process. Did it force you to rethink or tweak your general approach?
Stephen Hodde: Maybe it was just switching to a new medium, but there were fewer assumptions and foregone conclusions about "the right way to do things," if that’s even real. I surveyed a handful of VR developers that I respect and there was a wide range of opinions regarding best practices, with no apparent consensus. So I adopted a mindset that I don’t know what the "right way" is, and that predisposed me to thinking about problems with the game’s needs and player experience over convention. I suppose this is not a uniquely VR state of mind.
Jason Graves: For me, Moss is actually the fourth VR game I’ve worked on. The first one in particular, Farlands, had a binaural mix for the entire score. That is, the music was composed and implemented into the game to sound 3D and spatial - coming from a set of points inside the VR world.
For this particular game, the fact that it was in VR didn’t have a whole lot of influence on the music. We all agreed from the beginning the music needed to have a more traditional role and not call too much attention to itself. The audio director, Stephen Hodde, and I worked hard to integrate the music as seamlessly as possible. We wanted it to feel like it was part of the story.
Moss isn't a conventional VR effort in that it blends third and first-person gameplay. Was that something you had to consider when working on the score and audio?
Stephen Hodde: Absolutely. This blend manifested in the choice of perspective and scale of the sounds. I found that the more I shifted the sound to reflect how Quill might experience it, the more I understood the stakes of the narrative and wanted to keep her safe from danger. Practically, this moved the scale of all sounds larger relative to their geometric size and they took on an emotional tone that reflected Quill’s emotional state along the journey.
Jason Graves: Most definitely. I loved the whole first/third-person combination for Moss. That decision was made primarily so the player could feel a connection to Quill, Moss' tiny mouse protagonist. It was important that she saw you as a character in the game and the music needed to emphasize that emotional connection.
"I adopted a mindset that I don’t know what the 'right way' is, and that predisposed me to thinking about problems with the game's needs and player experience over convention."
What was the biggest VR-related technical challenge you encountered during development?
Stephen Hodde: There is a lot of promising technology for audio that is designed to mimic hearing. Binaural rendering is perhaps the most widely talked about technology. It processes sound to model how the head alters it as it passes over your face and into your ear canal, which gives the brain the cues it uses to judge direction more precisely. You’ve probably heard by now of head-related transfer functions (HRTFs) applied on a per-object basis, and some binaural technologies also emulate interaural time difference (ITD), the delay in a sound's arrival time between the two ears. Oculus, Sony, and Valve are all doing some truly amazing things in this arena.
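To give a rough sense of the time-of-arrival delay Hodde describes, here is a minimal sketch using Woodworth's spherical-head approximation, a textbook formula rather than anything Polyarc, Sony, Oculus, or Valve is stated to use. The head radius, speed of sound, and function names are illustrative assumptions.

```python
import math

# Assumed constants for illustration only (not taken from the interview).
HEAD_RADIUS_M = 0.0875       # a commonly quoted average head radius
SPEED_OF_SOUND_M_S = 343.0   # air at roughly 20 degrees Celsius


def woodworth_itd(azimuth_deg):
    """Approximate interaural time difference in seconds.

    Uses Woodworth's spherical-head formula:
        ITD = (a / c) * (theta + sin(theta))
    where theta is the source azimuth in radians (0 = straight ahead,
    90 degrees = directly to one side). This simple form is only valid
    for azimuths between 0 and 90 degrees.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


def itd_in_samples(azimuth_deg, sample_rate_hz=48000):
    """Convert the ITD to a whole-sample delay at the given sample rate."""
    return round(woodworth_itd(azimuth_deg) * sample_rate_hz)


if __name__ == "__main__":
    for azimuth in (0, 30, 60, 90):
        microseconds = woodworth_itd(azimuth) * 1e6
        print(f"{azimuth:>2} deg: {microseconds:6.1f} us, "
              f"{itd_in_samples(azimuth)} samples at 48 kHz")
```

Under these assumptions, a sound directly to one side reaches the far ear well under a millisecond later than the near ear. A real binaural renderer combines this delay with HRTF filtering, which this sketch does not attempt.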
The challenge comes when attempting to use this technology as an effect, sparingly, and as transparently as possible. It’s not always easy using this technology side-by-side with more traditional methods of spatialization, from a user experience perspective.
Moss is not a game that requires a lot of pinpoint accuracy when judging the origin location of a sound. However, it does require a lot of emotional, full, bright sound for its transportive effect. The processing required to make sounds more accurate (i.e. HRTF) and the pursuit of transparency are ideals totally at odds with each other. If there is some amount of real-time binaural processing to the audio signal that occurs, sometimes an essential quality of the sound is lost. So there’s a tension and a choice for each sound: how directional should it be, versus how open and free of intervention? In this way, some of the old-fashioned spatialization methods are actually preferable to newer technology.
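The interview does not say which older spatialization methods Moss relies on, but constant-power stereo panning is one widely used example of the kind of lightweight approach being contrasted with HRTF processing. The sketch below is an assumption-laden illustration, not Polyarc's implementation: it only scales the signal per channel, so the source's timbre is left untouched.

```python
import math


def constant_power_pan(pan):
    """Return (left_gain, right_gain) for a pan position in [-1, 1].

    -1 is hard left, 0 is center, +1 is hard right. The gains follow a
    sine/cosine law so that left**2 + right**2 stays equal to 1, keeping
    perceived loudness roughly constant as the source moves.
    """
    if not -1.0 <= pan <= 1.0:
        raise ValueError("pan must be between -1 and 1")
    angle = (pan + 1.0) * math.pi / 4.0  # maps [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)


def pan_mono_samples(samples, pan):
    """Apply the pan gains to a list of mono samples, giving stereo pairs."""
    left_gain, right_gain = constant_power_pan(pan)
    return [(s * left_gain, s * right_gain) for s in samples]


if __name__ == "__main__":
    for position in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_pan(position)
        print(f"pan {position:+.1f}: left {left:.3f}, right {right:.3f}")
```

Because no filtering is applied, the sound stays as "open" as the original recording; the cost is that the listener gets only a left-right cue, with no elevation or front-back information.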
What did you set out to achieve when you started out, and how did you know when you'd finally got there? Did the score/audio evolve much throughout the process?
Stephen Hodde: The goal was to support the story, so we started out with a simple emotional arc that mapped to each section of the gameplay. In some instances the music helped inform the emotional tone of the game, and so the team at Polyarc was listening to Jason’s work and responding to it by changing game content. There was a lot of effort to provide Jason with as much context as possible: screenshots, video captures, scripts, world building documents, concept art, and so on.
We talked a lot at the beginning about balancing feelings of intimacy, about the world feeling simultaneously small and large, and what that might sound like from an orchestration perspective. And then Jason just went with it. He’s such a pro, and sort of a magic antenna of creativity, that he nailed it every time. If something didn’t fit like a glove, it could always be moved around to someplace else. In the end we used 100% of the music he wrote.
“Finally getting there” was about monitoring our own reactions and getting feedback from the studio and players. For me this is mostly instinct-driven.
Jason Graves: My musical goal was the same as the developer’s goal -- to make a game that would create an emotional bond between the player and Quill. The music was slightly different in the beginning but in general the scope and sound of the score remained the same.
"It’s really fun to feel like you’re on the front lines of a new medium, and it can be accompanied by a sense that there’s no clear way forward, which can be liberating."
What tools and tech did you rely on? Did you have to bring some new toys into the studio to support your efforts in VR?
Stephen Hodde: We used Unreal Engine 4 and Wwise, along with Sony’s spatialization technology on PSVR. The mix you hear is predominantly third-order ambisonic.
One of my favorite sounds was Quill’s heartbeat when she’s injured, which I captured using a fetal doppler monitor. That classic sonogram heartbeat sound is an inherently empathy- and protection-inducing sound.
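For context on what "third-order" implies, a full-sphere ambisonic signal of order N carries (N + 1)² channels, so a third-order mix has 16. The short sketch below shows that arithmetic along with the standard ACN channel-numbering formula; the function names are illustrative and do not correspond to any Wwise or Unreal Engine API.

```python
def ambisonic_channel_count(order):
    """Number of channels in a full-sphere ambisonic signal of this order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def acn_index(degree, index):
    """Ambisonic Channel Number for spherical-harmonic degree l and index m.

    ACN = l * l + l + m, with m ranging from -l to +l.
    """
    if degree < 0 or abs(index) > degree:
        raise ValueError("require degree >= 0 and -degree <= index <= degree")
    return degree * degree + degree + index


if __name__ == "__main__":
    for order in range(4):
        print(f"order {order}: {ambisonic_channel_count(order)} channels")
```

Each step up in order adds spatial resolution at the cost of more channels to store and decode, which is part of why the choice of order is a practical trade-off as well as a creative one.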
Jason Graves: The score for Moss was implemented in a traditionally interactive way, so there was nothing new in terms of technology when composing for VR. But I was able to dust off some of my favorite instruments! Many of them I’ve had for some time, thinking someday I would get the chance to use them on a project. Ukulele, hammered dulcimer and Celtic harp all came out to play for the first time, along with my accordion, acoustic guitar and percussive toys. I had the idea to use instruments that were small in size and sound to relate to Quill, especially since you experience how tiny she is in VR -- she really is mouse-sized compared to you!
When it comes to audio specifically, are there any distinct pros and cons to working within the realm of VR? And do you have any advice for other budding soundsmiths who might be looking to dip their toes into the virtual reality waters?
Stephen Hodde: Don’t let VR intimidate you. The tools have come a long way and working in Unreal Engine is quite fun and easy. Your skills will absolutely translate over. Think of it like a new game, so approach it as you would any new project and ask what it needs. It’s really fun to feel like you’re on the front lines of a new medium, and it can be accompanied by a sense that there’s no clear way forward, which can be liberating. You’re not tied down to the choices of other games.