In creating our new VR project Séance: The Unquiet, we set out to evolve cinematic storytelling for virtual reality: to deliver something as close to the experience of watching a movie as we could, while using the medium of VR to its best advantage.
For a hundred years filmmakers have crafted every shot to tell audiences where to look: from framing and composition to lighting and depth of field, every technique of film has been used to better guide the eye to the most important elements on the screen.
What we’ve done with Séance is to completely reverse that: we have crafted each scene dynamically to be aware of where the audience chooses to look. It’s a fundamental evolution in cinematic storytelling that requires the expertise and technologies of videogame development to pull off. (In our case, we used the Unreal game engine.) From a grandfather clock that sounds different when you look at it to an intricately designed crescendo of music and sound design that rises dynamically in pitch and volume with the angle of your head as you turn to discover a ghost behind you, Séance shapes its experience to the audience’s gaze. The result is a new kind of storytelling for a new kind of medium.
In this series of blog posts I'll summarize these and other techniques of cinematic storytelling we have evolved and utilized in our project. If you want to see some of them in action, we have released a free five-minute preview of Séance for the Oculus Rift and the HTC Vive.
Our entire approach to VR cinematic storytelling is best embodied in the third scene of our Séance preview. As I break down the elements of the scene, I'll label them with the cinematic storytelling component they represent. I've also broken the scene into several phases for ease of comprehension.
Composition: This scene begins with the audience seated at a table, all alone in the great hall of our mansion. We expect the audience to be looking at our master shot as nothing interesting is happening to the left or right. The lighting has shifted since the previous scene but there are no characters or other items of interest except for one.
Physical Response & Immersion: A mysterious letter is on the table in the immediate foreground, but it is rotated away from the audience so the writing looks upside-down. In our testing we have often seen people lean over and crane their neck to try to read it, which is a great example of a physical response that increases immersion. If we just cared about them reading the letter, we would have oriented it for easy viewing -- but the immersion factor brought about by the audience struggling to read it is more important to us than the plot point represented by the letter's contents.
Spatialized Audio: We hear the narrator's voice for the second time in the preview. Both times, the sound emitter is up high in the cupola above the audience's head, but for this second time we reposition the emitter much lower so the voice is closer and louder to increase intensity.
Composition & Using Audio to Get Attention: At the very far end of the scene in the foyer, the double doors at the top of the stairs slowly creak open. The sound draws your attention first. Because the doors are so far away, they are quite small on the VR headset screen. The audience strains to peer into the darkness revealed by the doors but for now, there is nothing there.
Moving Audio Emitter: After the doors open, a powerful wind rises up directly behind the audience. This wind is on a moving emitter that gradually flies through the room towards the open doors. The movement of the emitter in the spatialized soundscape dramatically increases the audience's immersion in this moment as the wind really does seem to move through the space you virtually occupy.
Composition: As the emitter travels it intersects several invisible volumes placed throughout the scene which trigger animations. They start above your head as the closest chandeliers begin to sway in the wind. Then each successive window curtain along the wall in the Left Shot catches the wind and begins to flutter. More chandeliers start to sway as the wind moves through the room, drawing the audience's attention deeper and deeper into the Master Shot. But then in the immediate foreground, the sound of paper fluttering accompanies the letter lifting off the table and beginning an animated dance on the wind. Audiences typically follow the paper attentively as it flies up over their head, down the hall, catches a flash of lightning, and finally sails through those double doors that opened earlier. Still barely visible in that square of darkness, the paper dips twice out of sight and is gone. It seems to reappear a third time -- but it's not the paper anymore. Our next character, a ghost, has appeared.
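To make the trigger-volume idea concrete, here is a minimal sketch in plain Python rather than Unreal. It assumes the emitter flies in a straight line and that each invisible volume is an axis-aligned box that fires once. The class, function, and animation names are hypothetical illustrations, not the ones used in the project.

```python
from dataclasses import dataclass


@dataclass
class TriggerVolume:
    name: str           # hypothetical animation to start, e.g. "chandelier_sway"
    lo: tuple           # minimum corner (x, y, z) of the box
    hi: tuple           # maximum corner (x, y, z) of the box
    fired: bool = False

    def contains(self, p):
        # True when point p lies inside the box on every axis.
        return all(l <= c <= h for l, c, h in zip(self.lo, p, self.hi))


def lerp_point(a, b, t):
    # Linear interpolation between points a and b for t in [0, 1].
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def fly_emitter(start, end, volumes, steps=100):
    """Move the emitter from start to end and return the names of the
    animations it triggers, in the order the emitter reaches them."""
    fired = []
    for i in range(steps + 1):
        pos = lerp_point(start, end, i / steps)
        for v in volumes:
            if not v.fired and v.contains(pos):
                v.fired = True      # each volume triggers only once
                fired.append(v.name)
    return fired


if __name__ == "__main__":
    volumes = [
        TriggerVolume("curtain_flutter", (3, -1, -1), (4, 1, 1)),
        TriggerVolume("chandelier_sway", (1, -1, -1), (2, 1, 1)),
    ]
    print(fly_emitter((0, 0, 0), (10, 0, 0), volumes))
```

Because the animations are keyed to the emitter's position rather than to a timeline, changing the wind's speed or path retimes every sway and flutter with it.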
Proximity: The ghost first appears in the far distance, so far that she is just an indistinct blob in your headset. That's intentional -- the ghost will eventually be right in your face and even pass through you, but her traversal of our space is key to the experience. The ghost passes through the balcony and comes to a stop at the edge of the carpet in the middle distance, close enough for clarity but not intensity.
Traditional Animation: We did not use mocap for the ghost. She was animated on a very simple rig and we went through many variations on her movement before settling on our final approach. Our lead animator, Travis Howe, explored movements including a Superman-ish flight style, a floating/drifting motion, a walking-through-the-air loop, and finally the more still and upright approach we settled on. To give this simple rig more life without a ton of hand animation work, we used NVIDIA's PhysX middleware on her skirt and positioned a wind source in the scene to cause the skirt to flutter and ripple.
Stylized Art Direction: Our ghost has the same art direction approach that character artist Charlie Baker devised for Col. Munro. For her materials, technical artist Stuart Cunningham gave her some transparency and also placed a blue light source that travels with her which causes the environment she traverses to glow with her radiance. He also created a disappearing effect which we'll cover shortly.
Spatialized Audio: Our sound designer, Keith Sjoquist, created the audio for the ghost, which begins as a sobbing sound and then changes to a keening wail as she approaches. Keith built this in FMOD as a dynamic audio event that goes through its transitions from sob to wail based on the distance between the ghost and the audience. This was extremely helpful during development: we made numerous adjustments to the ghost's motion and speed, yet her audio was always perfectly in sync and dramatic because it dynamically adjusted with proximity. In the end we shipped the preview with this dynamic audio in place but also augmented it with some static audio elements triggered at specific moments.
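The distance-driven transition can be pictured as a single normalized value that the audio event reads. The sketch below is plain Python and does not use the FMOD API. The function names and the `near` and `far` distances are assumptions chosen for illustration.

```python
import math


def wail_intensity(distance, near=2.0, far=30.0):
    """Map ghost-to-listener distance to 0.0 (distant sob) .. 1.0 (close wail).
    `near` and `far` are assumed tuning values, not figures from the project."""
    t = (far - distance) / (far - near)
    return max(0.0, min(1.0, t))    # clamp so out-of-range distances stay valid


def intensity_for(ghost_pos, headset_pos):
    # Straight-line distance between the two positions drives the parameter.
    return wail_intensity(math.dist(ghost_pos, headset_pos))


if __name__ == "__main__":
    for d in (40.0, 30.0, 16.0, 2.0):
        print(d, wail_intensity(d))
```

Since the value depends only on the current distance, retiming or rerouting the ghost's flight needs no matching change on the audio side.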
Proximity & Physical Response: After an unsettling pause at the edge of the carpet, the ghost looks up directly at the audience and then flies suddenly towards them. Even though audiences know the ghost is coming, they find the experience of a character rushing at them and then passing right through their body to be unnerving and even terrifying. We've seen many people lean back or duck their head to the side in an instinctive response -- and again, this physical reaction deepens immersion and completes the audience's transport into our virtual world.
Character Head Tracking: Because of the cinematic nature of the ghost’s animation, we used a canned animation of the ghost starting down the hall and moving to a point near the player. From there, Stuart coded a blueprint that finds the offset of the ghost’s head joint and uses it to move the ghost in engine directly to the headset. The blueprint also orients the ghost so that she always ends up facing the headset. Together these ensure the ghost always meets the headset at the same point and locks eyes with the audience at the last moment. Here is a look at part of Stuart's blueprint:
Stylized Art Direction: As the ghost moves towards us, she also plays an animation on her material that makes her disappear. It hides her feet and lower body by the time she reaches the table, which conceals any clipping, and then makes her disappear completely after she collides with the player. This effect, which Stuart created, lets us move her into position behind the player without any visible “pop”, even if the player happens to be looking somewhere we didn’t intend. Here is an animated GIF showing this disappearing effect:
We now come to the most significant moment in the preview and the one where all our techniques have the greatest impact.
The ghost appears behind the audience and waits there, wailing. The blue glow of her radiance spills onto the table. For the audience, there is no doubt the ghost is behind their left shoulder.
In a horror film, this kind of moment is common. The main character turns around and is shocked by the presence of a monster, ghost, maniac, etc. It's a big moment and all the cinematic tools of composition, editing, music, and sound design are deployed in a big emotional punch at the audience. But all those tools rely on a fixed duration for the scene -- what happens when the duration is variable because the audience can look anywhere they want and even turn their head faster or slower as they like? At that point, we need the dynamic qualities of videogames to deliver that same punch.
Proximity & Character Head Tracking: After she passes through the player we then move her into position to the player’s left, next to the table. We use the animated material in reverse this time to make her appear into view, again trying to avoid a visibility “pop” if the player is looking in her direction when she reappears. (Since we can't control where the audience looks, we do work like this to ensure everything always looks good and stays faithful to the context even when the audience behaves unexpectedly.) At this point Stuart's blueprint is again aligning the ghost with the headset to make sure she is always oriented to the player’s head position and we even move her up or down relative to the headset to always have her glaring down at you menacingly.
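The alignment step can be sketched with a little vector arithmetic. This is a plain-Python illustration of the general idea, not a transcription of the blueprint. It assumes a Z-up coordinate system with yaw measured in the XY plane, and all function names and the `raise_by` value are hypothetical.

```python
import math


def glare_position(spot_xy, headset_pos, raise_by=0.3):
    """Where the ghost's head should sit: at a fixed spot beside the table,
    raised slightly above the headset so she looks down at the viewer.
    `raise_by` is an assumed tuning value."""
    return (spot_xy[0], spot_xy[1], headset_pos[2] + raise_by)


def align_head_to(root_pos, head_joint_pos, target_head_pos):
    """Return the root position that places the head joint on the target.
    The head-to-root offset is measured in world space for simplicity."""
    offset = tuple(h - r for h, r in zip(head_joint_pos, root_pos))
    return tuple(t - o for t, o in zip(target_head_pos, offset))


def yaw_to_face(from_pos, to_pos):
    # Yaw in degrees that points from from_pos towards to_pos (Z is up).
    return math.degrees(math.atan2(to_pos[1] - from_pos[1],
                                   to_pos[0] - from_pos[0]))


if __name__ == "__main__":
    headset = (0.0, 0.0, 1.2)
    target = glare_position((-1.0, -1.0), headset)
    root = align_head_to((0.0, 0.0, 0.0), (0.0, 0.0, 1.5), target)
    print(root, yaw_to_face(root, headset))
```

Working from the head joint rather than the root means a taller or shorter viewer still ends up with the ghost's eyes, not her waist, at the intended height.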
Using Audio to Get Attention: With the ghost in place, we use her sobbing sound again to draw the audience's attention. The ghost is clearly sobbing just behind your left shoulder, and it's up to you to look at her.
Using Attention to Drive Audio: Now that the player knows the ghost is back there, and can hear it wailing, we want them to turn their head and see it. In a movie these moments would be scored with a sudden crescendo of music timed perfectly with the visuals. Yet in VR the audience is in full control of where, when, and how fast they turn their head! To deliver the same cinematic moment we had to work dynamically. Keith built layers of audio for both the ghost's wailing and for the accompanying music that react to the angle of the audience's headset. As you turn your head to the left, the wailing rises in pitch and volume. The music does the same, and an additional violin element comes in about halfway through the rotation. If you stop and look back to the right, as many users did at SXSW, the sound ramps back down again. This lerping adjustment has a deliberate bit of lag to it -- if you jerk your head back and forth we didn't want it to sound like a DJ scratching a record. But if you turn your head normally, the music and sound design will perfectly align to your movement and pay off at the big moment: the scream.
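A rough way to picture this is a target value computed from head angle, plus a smoothed value that chases it. The sketch below is plain Python and makes several assumptions: left yaw is negative, the ghost sits at roughly -110 degrees, and the lag is a simple exponential approach. The names, angles, and rates are illustrative only.

```python
import math

VIOLIN_THRESHOLD = 0.5  # assumed: the violin layer enters about halfway


def target_intensity(head_yaw_deg, ghost_yaw_deg=-110.0):
    """0.0 when facing forward, 1.0 when facing the ghost.
    Turning the other way (positive yaw) clamps to 0.0."""
    t = head_yaw_deg / ghost_yaw_deg
    return max(0.0, min(1.0, t))


def smooth_toward(current, target, dt, rate=4.0):
    """Move `current` a fraction of the way to `target` each frame.
    The exponential form keeps the lag the same at any frame rate."""
    alpha = 1.0 - math.exp(-rate * dt)
    return current + (target - current) * alpha


if __name__ == "__main__":
    intensity = 0.0
    dt = 1.0 / 90.0                       # one frame at an assumed 90 Hz
    for frame in range(90):               # viewer holds a half turn for 1 s
        intensity = smooth_toward(intensity, target_intensity(-55.0), dt)
    print(round(intensity, 3), intensity >= VIOLIN_THRESHOLD)
```

The smoothed value, not the raw angle, would drive pitch and volume, so a quick back-and-forth head jerk produces a gentle swell instead of an abrupt jump.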
Traditional Animation: When the audience turns their head far enough to see the ghost, her face contorts in horror and she screams. The scream animation was done by Charlie Baker in Maya using morph targets because we didn't have a complete mocap rig ready for this character. Because the moment of the scream is so brief, it looks great and is very effective.
Character Head Tracking: The ghost's screaming animation once again drives her material animation to make her disappear, leaving her right eye the last thing that we see of her. At every point where Stuart's blueprint has the ghost align with the audience’s headset, we try to add some interpolation, basically some lag to smooth things out. Even in the small amount of time that the ghost is tracking your headset, it would be noticeable to the audience if her turn rate kept her in complete lockstep with every head movement you make. Instead, this lag minimizes the feeling that she’s just tracking your head as a game object and makes it feel much more like a ghost is staring you down.
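One common way to add this kind of lag to a rotation is to close only part of the remaining angle each frame, always turning the short way around. The following plain-Python sketch assumes yaw in degrees. The function names and the `speed` value are hypothetical.

```python
def shortest_delta(from_deg, to_deg):
    # Signed smallest rotation from one yaw to another, in [-180, 180).
    return (to_deg - from_deg + 180.0) % 360.0 - 180.0


def lagged_yaw(current_deg, target_deg, dt, speed=3.0):
    """Turn part of the way towards the target this frame.
    Higher `speed` means less lag; the step is capped so it never overshoots."""
    step = min(1.0, speed * dt)
    return current_deg + shortest_delta(current_deg, target_deg) * step


if __name__ == "__main__":
    yaw = 0.0
    for frame in range(5):
        yaw = lagged_yaw(yaw, 90.0, dt=0.1)
        print(round(yaw, 2))
```

Each frame covers a fixed fraction of what is left, so the ghost turns quickly at first and eases in as she nears the target, which tends to read as a deliberate look rather than a mechanical lock.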
Proximity: Placement of the ghost behind the audience was also incredibly important. If the ghost were placed too far behind, people would attempt to look at her but would only turn so far. There is an acceptable range that people are willing to stay in when looking for things to happen. If we placed the ghost directly behind the audience, it would be uncomfortable for most people to look at her given that this is a seated experience, and most would not even attempt it. Finding the right placement for the ghost was critical, because we wanted enough time to build tension and give you that moment of discovery when you finally saw her, but we didn’t want the audience to lose her and give up looking. We found that placing her just next to the table’s edge gave us the best results. Light spills onto the tablecloth, leading you to turn, and as you look towards the edge of the table you can already see her out of the corner of your eye. At this point it’s a choice for the audience to look at the ghost or to turn away out of fear. If they turn away, the ghost will eventually scream and disappear off-camera and we keep the story moving.
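The "look or give up" logic amounts to a gaze check with a timeout fallback. Here is a minimal plain-Python sketch of that pattern. The threshold angle, the timeout, and the function names are assumptions for illustration, not values from the project.

```python
import math


def angle_between_deg(a, b):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    cos = max(-1.0, min(1.0, dot / (na * nb)))   # guard against rounding
    return math.degrees(math.acos(cos))


def should_scream(gaze_angle_deg, seconds_waiting,
                  view_threshold_deg=25.0, timeout_s=20.0):
    """Fire the scream when the viewer looks close enough to the ghost,
    or when they have avoided looking for too long."""
    looked = gaze_angle_deg <= view_threshold_deg
    timed_out = seconds_waiting >= timeout_s
    return looked or timed_out


if __name__ == "__main__":
    forward = (1.0, 0.0, 0.0)          # headset forward vector
    to_ghost = (0.0, 1.0, 0.0)         # direction from headset to ghost
    angle = angle_between_deg(forward, to_ghost)
    print(should_scream(angle, seconds_waiting=5.0))
```

The timeout branch is what keeps the story moving for viewers who refuse to turn around.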
Stylized Art Direction: As she screams, Stuart's materials on the mesh dissolve rapidly and she disappears. He animated the materials such that the final visible part of the ghost is her right eye, which is the last you see of her as she vanishes screaming. We use this dissolve effect each time we make the ghost appear or disappear, including when she flies through you.
Here is the completed scream moment, with both the disappearance and the morph target animation:
Quadraphonic Sound Effects: Technically this one is triphonic, not quadraphonic. The ghost's scream is played through an emitter in her head, but stereo reverb versions of the scream are played from two other emitters positioned around her. The result is a sound effect that is both very positional but also envelops the audience to great effect.
Proximity: This big moment is what filmmakers refer to as a jump scare, for example when a cat leaps out of a cupboard towards the camera. We took a lot of pride in ensuring the audience knows a jump scare is about to happen, with all our cinematic cues of proximity, sound design, and music informing you of the imminent scare. And even with all of that, the jump scare terrifies people. We believe we played very fair with this scare instead of making it a total surprise and yet the work we put into it using all of the above tools results in it still being terrifying.
This is the last installment of our series of blogs on VR cinematic storytelling. We really hope this information has been both useful and inspiring. We've invested a great deal of time in thinking through these issues and iterating our work on the preview for Séance: The Unquiet. We believe there is a huge opportunity for game developers to apply their experience and skills in creating this new form of entertainment using the new medium of VR. Hollywood doesn't have these skills -- but game devs do. And as we've seen from the earliest days of film to the arrival of sound and color, then on to widescreen, Cinerama, surround sound, IMAX, and 3D, audiences crave more immersive entertainment.
As our art director Bruce Sharp likes to say, "VR is the most immersive technology since dreams." We are eager to see where all your dreams take you, too.