[Martin Herink is a University of Oxford graduate student, and a freelance gameplay designer for Cotopia Wireless. In this editorial, he delves into the role of video game cutscenes - not from a standpoint of structural appropriateness, as is often discussed, but from a narrative and cinematic point of view.]
The often-discussed "problem" of the cutscene, at least in so far as it relates to the ongoing debate about interruption of interactivity and narrative exposition, rather than being a problem of structural functionality, appears instead to be a question of the propriety of decisions made by designers regarding issues of perspective relatively easily resolved in the framework of editing, rhythm, and aesthetic expression.
This rather cryptic statement means merely to suggest that instead of deciding what is appropriate to video games strictly in the terms of interactivity vs. narrative, designers and artists should instead review their understanding of the film form in the terms of editorial rhythm, aesthetic expression and, above all, cinematic tact.
The Question of the Cutscene
If the use of the cutscene means an injection of purely contemplative material into a fundamentally kinetic experience, and if the game is meant to be an interactive experience, should we avoid the risk of offending the player's expectations with non-interactive content by abandoning the cutscene completely, as some suggest?
The games industry has not yet abandoned the cutscene. Instead, it has largely focused on the transformation of the cutscene into a kind of narrative experience which arises from a perspective external to the player (through NPCs, sound clips, in-game sequences, etc.). This is for a number of very good reasons.
The cutscene has always been, and will continue to be, a useful aesthetic tool for the expression of the story and of the narrative. The problem of the cutscene as such is not its existence inside the interactive space of the game. Players, in fact, appear to enjoy the additional layers of immersion which it can provide. The problem with its use ought rather to be viewed as the inappropriate sequencing of what is a fundamentally contemplative medium within the context of an inherently kinetic one.
These are two very different actions, and while both have an inherent value on their own, the method of their integration requires an understanding of rhythm, pacing and aesthetic tact. This is in fact the concept of editing transposed onto the interactive medium of the video game.
Rhythm and Pacing
Even in its most classical state, the cutscene can still be effectively integrated without causing the kind of perspective disjunction that forms the basis of criticism against it. On the most basic level, the designer need only be careful that the cutscene does not inappropriately intersect with sections of kinetic action.
To elaborate this idea on a very basic level, if the designer creates specific gameplay goals which the player must achieve in order to realize narrative progression, it makes very little sense to abruptly interrupt the expected flow of kinesthetic events with non-interactive content.
To interrupt a conversation, as many of us have been taught, is rather a rude way of joining it. This in a sense is also what might be meant when we talk about episodic developments in gameplay: an episode of gameplay may be followed by the contemplative engagement of a cutscene.
Max Payne 2 is a superb example of the way that episode-based editing can lead to a positive integration of the cutscene. What MP2 did was attach a narrative framework with multiple character perspectives to mission-driven gameplay. Each mission was limited in length and offered in a variety of settings (from hospitals to construction sites). In order for the narrative sequences to be triggered, gameplay goals had to be met.
Even more importantly, not only was the goal progression rewarded with cutscenes, but the cutscenes would also set up the transition into different settings and places. The player could skip the cutscenes, but watching them (at least the first time around) really made the game experience something that you could care about.
Like a noir puzzle film, each viewing offered new layers of understanding, not only of Max Payne as a character but also of the complex world of which he was a part. To this day Max Payne 2 is one of the few games I've not only completed repeatedly but in which I've actually watched the cutscenes repeatedly as well.
Indeed, such episodic involvement need not simply follow the familiar structure of gameplay, cutscene, gameplay, cutscene. The switch between the two can also be done in different arrangements, as long as it is executed in such a way as not to interrupt the player's involvement in either.
This is especially the case when the level of interruption is restricted to the game's internal world. This is what I mean when I speak of maintaining appropriate rhythm and pacing.
What I mean by aesthetic tact can largely be divided along two lines of criticism: first, our respect for the player; second, our respect for the way the player is asked to engage with the game (either by affecting it or by contemplating it).
Returning to the example of Max Payne 2, what the game did not do was interrupt actual sessions of gameplay with cutscenes prior to goal completion. Even when the designer decided to throw in a narrative twist by making the meeting of certain goals actually work against the player, the player never felt that his effort was being undermined, because it was merely refocused in terms of the game's narrative sequence, not in terms of gameplay itself.
By separating the two activities, appreciation and involvement, the designer created an experience that not only had an appropriate rhythm but also respected the player's role in moving the game and indeed its story forward.
On the other hand, Shenmue, a game which many (including myself) admire for its innovative approach to cinematography, is at the same time one of my favorite examples of instances where the contemplative and aesthetically rewarding experience of the cutscene had been rudely interrupted (often without any warning) by a scripted time-action event.
In the long run, innovative as it might have been, the uncertainty of when such imposition might occur next had always left me feeling uncomfortable with fully enjoying the drama of the visual sequence.
I call it a "visual sequence" rather than a cutscene, because once I had been prevented from relying on these sequences to be cinematic and completely contemplative, they were no longer anything more than visual sequences that may at some point in time become interactive. This is not the kind of pacing conducive to respecting the player's relationship with different kinds of sequences in the game.
As has been pointed out to me, this style of aesthetic integration has remained relatively popular, having been adopted in the very successful God of War series and a number of other games that have followed in its footsteps since. An obvious counter to my summary of the trigger event therefore might be that it is just this: a personal preference.
My discomfort comes from the fact that I felt tricked into believing that what I was experiencing was a cutscene (and my expectations of a cutscene are of aesthetic engagement) while in reality I was partaking in another style of (albeit less directly responsive) gameplay, which allowed me neither to fully interact with the environment of which the character was a part, nor to simply appreciate the drama on the screen.
The lesson in terms of rhythm and cinematic tact (at least in so far as I would suggest it to be) is therefore this: just as with an unexpected interruption of gameplay by the cutscene prior to the completion of a goal understood either implicitly or explicitly, it is no more appropriate to interrupt the contemplative nature of the cutscene with gameplay, without either allowing the player sufficient time to switch modes or at the very least reaching some sort of closure on why it is happening.
The final point of contention in regards to the cutscene is the change in aesthetic definition that a player experiences when a cutscene has been fully pre-rendered (and is not therefore a part of the in-game sequence). In the case of pre-rendered scenes we must also take into account the perspective (point of view) through which the player occupies the game.
In addition to utilizing appropriate pacing and tact as already discussed, games that want to exploit the way that pre-rendered cutscenes surpass the game engine's ability to render sufficiently impressive visuals internally must also take into account the proximity between the player's point of view and that of the game.
I would first like to note that the pre-rendered cutscene has been (and indeed continues to be) used very effectively in the third-person perspective, perhaps more so than in any other genre of play where the interface is directly responsive.
Look for example at the increasingly elaborate cutscenes in games by Blizzard and the now inactive Westwood Studios. Without involving the question of the financial cost of such sequences, if we nonetheless take these examples as our guiding point, we'll quickly notice that one way of addressing the issue of differences between visual definition is to distance the narrative content of the pre-rendered cutscene from the gameplay sessions themselves by making sure that the cutscene isn't simply mirroring the gameplay sequence or its immediate perspective.
This is most formidably the case with Diablo II, one of a number of games that I believe fundamentally appreciated not only the editorial rhythm but even more importantly the distance required to accomplish the integration of a fully pre-rendered and visually impressive cutscene. The reason for this is that the distance between the completion of an episode of gameplay and the contemplative content of the cutscene was sufficient to make the cutscenes a kind of welcome (but not enforced) reward.
Interestingly, in the case of Diablo II, the use of the cutscene in relation to the narrative was also further removed from the kinesthetic relationship that develops between the player and the avatar through the use of the parallel storyline (that is, a pre-rendered cutscene which does not concern the player or the avatar immediately, but which instead follows other characters and shadows the events of the game's dramatic progression through an external perspective).
The other and even more popular example of cutscene use can be seen in the case of Square Enix games, especially the Final Fantasy series. The examination here is somewhat more problematic because, unlike many other franchises, games like Final Fantasy rely heavily on genre guidelines firmly established by one or two companies during the mid-1990s console warfare.
As most people are aware, the introduction of the CD-ROM meant that content could now be streamed rather than generated on the fly, and impressive pre-rendered sequences became an easy way of selling already impressive gameplay to large but nonetheless niche markets like that for the RPG. The Square Enix model, to this day, relies heavily on such sequences.
I would argue that it is no longer only because it's an easy way of improving in-game visuals, but also largely because the marketplace has come to expect it. Like digital animation in Pixar films, the quality of pre-rendered graphics must in many ways continue to improve in the context of its own marketplace simply in order to outdo the previous generation. This is especially true when there are only a few big players in a given segment of the marketplace.
On the aesthetic front, Square Enix games still predominantly follow the same guidelines that have already been set out, and in this way they also respect the player's expectations of the game. As a reward, the cutscene is predominantly a question of rhythm and pacing, allowing the narrative to move forward on visual and contemplative terms; an episode of gameplay will be rewarded with a cutscene and the expectation is therefore fulfilled.
The First-Person Shooter: The Trickiest Of Genres
In the case of the first-person shooter the inappropriateness of interrupting the kinetic proximity which develops between the player and the avatar is the most direct, since the FPS in effect directly aligns the player's own view of the immersive world with that of the character.
This, quite likely, is the reason why designers are increasingly opting for sound bites and scripted NPC action (as in Half-Life 2) as opposed to pre-rendered sequences (at least once the gameplay has begun). Once again, the use of the cutscene isn't on trial here; rather, the designer must be, more so than elsewhere, painstakingly careful when considering the methods by which the cutscene is integrated.
In the FPS genre, there is the additional problem of aligning the cutscene's narrative function, regardless of the method of integration, too directly with the player's own perspective, resulting in the breaking of the fourth wall. Here the inappropriateness of disabling the player from engaging in the action which occurs on screen is amplified in terms of making evident the player's secondary status in the diegetic world of the narrative.
Games like BioShock put this problem of self-consciousness in the foreground on predominantly postmodern terms, by effectively rubbing it in the player's face. The danger of such an approach is, much like the danger faced by nearly all other postmodern aesthetic movements inclusive of cinema, the collapse of its novelty. You can only rub the limits of the genre's structural foundation in the face of your audience so many times before they find some better way of entertaining themselves.
It is also interesting to note that in the examples specifically mentioned here (both BioShock and Half-Life 2), there is always a time gap between the conclusion of heavily engaging gameplay (fighting, moving items around, getting out of negative situations) and the occurrence of the scripted cutscene event. This is analogous to the good-sense rules of rhythm in showmanship: a good band will rarely play a gig where the fast song is immediately followed by a slow song without any kind of temporal transition.
This is what good pacing means not only for the FPS but also for other predominantly action-driven games: it is innately difficult for us to move from a kinetically engaged and often adrenaline-charged state to an aesthetically contemplative state without being offered the chance to calm down first. Without at the very least an indication of such a transition being near, the designer risks ripping the player out of the action of gameplay.
Interplay's now defunct Descent series addressed this problem by associating the exit door of a given level with the immediate triggering of the cutscene sequence. Having played one or two levels, the player quickly learned what to expect and could then breathe a sigh of relief when the open exit door was in sight, leaning back to watch the cutscene as they entered it.
A Brief Note on Games Without a Directly Responsive Interface
There's a big difference between these sorts of games and point-and-click games, or any other scenario where the player's involvement with the movement of the on-screen space is not quite so directly responsive. The reason why I consider this scenario to be fundamentally different arises from the fact that the cutscene can, and often should, be inserted at a variety of different points without requiring the kind of pacing or tact that the more immediate interface would call for.
Since the player's involvement with gameplay is far less direct (and therefore the perspective is by its very definition more contemplative), the experience isn't anywhere near as jarring and the same sense of rhythm, pacing and tact isn't necessarily required. Similarly, the kind of expectation with which the player enters the game space is less engaged in play as a kinetic process.
My goal in writing this article was to redirect the conversation from the consideration of the propriety of the cutscenes as such, to a more holistic approach to video games as a cinematically conscious art form.
In this sense, if we are to stake any claim to games as a medium that can utilize cinema in exciting new ways, it is important for us to first decide on the terms upon which aspects of cinema can be utilized in order to create aesthetic experiences specific to games as both a contemplative and a kinetically engaging medium.
By following a few simple guidelines regarding rhythm and pacing, but above all by respecting the player's experience of the narrative game not as a singularly kinetic experience upon which we must force the narratological semblance of cinematic tact, but rather as an experience that shifts and morphs according to our relationship with each immersive moment, the long history of narrative cinema can offer us novel ways of engaging the player not only in play but also in the immersive world of which such play is a part.