Gamasutra: The Art & Business of Making Games
Organizing And Formatting Game Dialogue

November 18, 2005 (Page 2 of 2)

Passive Format

Well, enough about them, let's talk about us.

Much game content is written in what I term the "passive format" (because it works best in situations where the player is a passive spectator, such as when watching cinematics), which resembles the format used by movies and television shows. A film script is formatted to streamline the filmmaking process: Courier New font, wide margins, use of all-caps, and lots of white space make it easy to mark up during filming. See Fig. 1 for an example. Script supervisors use the margins and white space to make necessary adjustments, and the large font and capitalized words make it easy to locate content in a hurry. The format is a direct result of the way movies are made, where changes happen quickly on set.

Figure 1. Passive Format

In the game industry, the passive format is less appropriate. Most games are nonlinear enough to make it a poor fit. If we rely on the Hollywood rule of thumb that one page of script equals one minute of screen time, then theoretically, a twenty-hour game (1,200 minutes) would require a script 1,200 pages long -- at which point a sane person says, “Hell no.” Even when the passive format is used for non-interactive content, such as in-engine or pre-rendered cinematics, it still seems like a bad idea. A pre-rendered cinematic isn't filmed in real time; it's created over days or weeks, usually by a number of people including artists, animators, and at least one producer or designer. A format created to facilitate rapid changes on the fly therefore doesn't fit the process of cinematic creation. The white space and the large font serve no discernible purpose in this context.

However, if the intention is to create a text-based narrative (as opposed to a visual narrative structure, such as the use of sequential storyboards), then a modified version of the passive format can be of some use. Removing all of the marginal formatting and switching to a more efficient font (like 10-point Times New Roman) makes the format more compact. See Fig. 2 for an example.

Figure 2. Passive Format Revised

Active Format

Oddly enough, an accounting spreadsheet can be a writer's most effective tool. I use Excel to keep track of my dialogue, as do many writers. It's particularly useful when preparing "active format" dialogue (any dialogue taking place in-game, where multiple variables can make it a challenge to keep track of all possible dialogue threads).

Taking into consideration all of the aforementioned dependencies and relationships, as well as the limitations that I've encountered when using passive format, I've come to rely on the following structure (see Fig. 3 for details):

Figure 3. Active Format
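
For readers who keep their dialogue outside Excel, the column layout can also be modeled in a few lines of code. The following is a minimal sketch, not the author's tooling: it assumes the eight headings discussed in this article (Actor, Cue, Context, Inflection, Location, Area, Effect, Filename), and the `CueRow` class, the `rows_to_csv` helper, and the sample values (including the filename) are hypothetical.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class CueRow:
    """One spreadsheet row; field names mirror the article's headings."""
    actor: str
    cue: str
    context: str
    inflection: str
    location: str
    area: int
    effect: str
    filename: str


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line, ready to open in a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[field.name for field in fields(CueRow)],
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Hypothetical sample data, for illustration only.
sample = [
    CueRow("Jason", "Get down!", "Gunfire erupts in the lobby", "yelling",
           "Mission 2, Tenement Building", 2, "none", "m02_a02_0010_jason"),
]
csv_text = rows_to_csv(sample)
print(csv_text)
```

Writing plain CSV keeps the data usable in Excel while also letting scripters and programmers read the same file.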

Before I address the individual headings and content, a few notes about the format:

I always highlight all cells with data and create a visible grid around them. If you're new to Excel, note that empty cells show a faint gray grid on screen. By default, that grid is visible when using the program but doesn't appear when the document is printed, which makes the printout harder to read. You'll therefore want to add borders around any cells with content. See Fig. 4 for details.

Figure 4. Grid

By highlighting the cells across the top (Actor, Cue, etc.) and selecting Data-->Filter-->Autofilter from the menu, you can add filters to your headings. This will enable you to select specific kinds of data from the fields. See Fig. 5 for an example.

Figure 5. Actor Filter

If, as in the above example, your spreadsheet is arranged chronologically, it will feature various actors talking to one another. But if you want to isolate a single actor, you simply click on the small gray box in the Actor field and select that character's name. The same functionality can be applied to the other fields.
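
The same isolation can be sketched in code if the sheet has been exported as rows of column/value pairs. This is only an illustration of what the filter does, not a replacement for it; the `filter_rows` helper and the sample lines (including the character Maria) are hypothetical.

```python
def filter_rows(rows, column, value):
    """Keep only the rows whose column equals value, preserving sheet order."""
    return [row for row in rows if row[column] == value]


# Hypothetical chronological excerpt with two speakers.
rows = [
    {"actor": "Jason", "cue": "Who's there?"},
    {"actor": "Maria", "cue": "It's me."},
    {"actor": "Jason", "cue": "Stay close."},
]

jason_lines = filter_rows(rows, "actor", "Jason")
for row in jason_lines:
    print(row["cue"])
```

Because the match is exact, the sketch shares the filter's main weakness: a name that is spelled differently is silently left out.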

It's best to use the landscape format when setting up your page. This way, rather than printing each row across multiple pages (Actor/Cue/Context/Inflection on page 1, Location/Area/Effect/Filename on page 2), you get the whole row on a single page. You can set this up by selecting File-->Page Setup, then choosing Landscape on the Page tab. You'll see in the example below (Fig. 6) that the dotted lines indicate that the page will still be cut off at the end, and the spreadsheet will bleed over onto a second page. I'll need to adjust some of the column widths in order to get the spreadsheet to print on a single page.

Figure 6. Landscape


Now, onto the individual headings.

ACTOR: In this area, list the speaker. It's best to keep this as terse as possible -- don't abbreviate to the point of being incomprehensible, but a last name will suffice, if one's available. Make sure that you maintain consistency throughout the document. If you refer to a character as Jason in mission 2, don't start referring to him as Jason Caldwell or Mr. Caldwell in mission 4. You want to be meticulous in your search for typos in this field. If you misspell a name and then use the filter to select all of Jason's lines, the filter won't include “Jasson”, and any dialogue attached to that typo won't appear in the filtered results.
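
One way to hunt for such typos is to compare every name in the Actor column against the others and flag rare spellings that closely resemble a common one. The sketch below uses Python's standard `difflib` module; the `likely_typos` helper, the 0.8 similarity cutoff, and the sample column are assumptions chosen for illustration.

```python
from collections import Counter
from difflib import get_close_matches


def likely_typos(actor_column, cutoff=0.8):
    """Map each rare spelling to the more common name it closely resembles."""
    counts = Counter(actor_column)
    names = sorted(counts)
    suspects = {}
    for name in names:
        others = [other for other in names if other != name]
        for match in get_close_matches(name, others, n=3, cutoff=cutoff):
            # Treat the less frequent spelling as the probable typo.
            if counts[match] > counts[name]:
                suspects[name] = match
                break
    return suspects


# Hypothetical Actor column containing one misspelling.
actors = ["Jason", "Jason", "Jasson", "Maria", "Jason"]
typos = likely_typos(actors)
print(typos)  # {'Jasson': 'Jason'}
```

A check like this only catches near-miss spellings. Inconsistent naming such as "Jason" versus "Mr. Caldwell" still needs a manual pass.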

CUE: In this field, you type the actual spoken text. Keep the parenthetical notations out of this field, if possible. Save them for the context field. Here, the actor wants to see raw text, not text accompanied by "(sadly)" or "(yelling)."

CONTEXT: Here, you indicate the context for the line to the person reading the dialogue. It's best to keep this as terse as possible; chances are, the reader knows the basic setup (superheroes under attack in a bank lobby). The situation immediately preceding or prompting the dialogue is the issue at hand, and that's what you want to convey in this field. A.I. responses may also work in this column. For example, if your game plays a death scream when the “player_dead” A.I. state is invoked, then you may want to put "player_dead" in this column, along with a note for the voice actor who will be doing the screaming. Or, you may want to split this into two columns: one for the actors, and one for the developers who will be integrating these assets into the game (scripters, programmers, and so on).

INFLECTION: In this field, you indicate the emotional state to the actor. The primary use of this field (other than the obvious) is to keep the volume level consistent across the various cues. Voice actors can read over 200 lines in a single session, and that can take its toll on the vocal cords. If you want to get the most out of your voice-over, and if you want to do your voice actors a favor by making their jobs easier, you can group cues together by volume. For example, start with whispers, then have the actor deliver all conversational tones, then proceed to any yelling or death screams (it's always fun/creepy to hear a grown man shriek like he's being eaten by sharks). It's best to keep an eye on the number of individual Inflections. You don't need to get creative with your adjectives; "angry" is good, and you don't have to describe the next cue as "furious" or "enraged" in order to avoid repetition. In fact, repetition is good, in that it's easier to use the Sort function (Data-->Sort) to lump all the "whispered" cues together, then the "normal" cues, then the "angry" cues. Just picture yourself as an actor, trying to guess the developer's intentions. Unless you're going to direct the voice acting yourself, do your best to give the voice actor a specific emotional state for the inflection. Any additional material should go in the Context field.
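
Grouping by volume amounts to sorting on a fixed ranking of inflection labels rather than alphabetically. Here is a minimal sketch of that idea; the `VOLUME_ORDER` ranking, the `recording_order` helper, and the sample cues are assumptions, and a real project would use its own small set of labels.

```python
# Assumed ranking from quietest to loudest.
VOLUME_ORDER = {"whispered": 0, "normal": 1, "angry": 2, "yelling": 3}


def recording_order(rows):
    """Stable sort by volume rank; unknown labels sort last so they get noticed."""
    return sorted(
        rows,
        key=lambda row: VOLUME_ORDER.get(row["inflection"], len(VOLUME_ORDER)),
    )


# Hypothetical cues in script order.
rows = [
    {"inflection": "yelling", "cue": "Get down!"},
    {"inflection": "whispered", "cue": "Quiet."},
    {"inflection": "normal", "cue": "Where to?"},
    {"inflection": "whispered", "cue": "Over here."},
]

ordered = recording_order(rows)
for row in ordered:
    print(row["inflection"], "-", row["cue"])
```

The sort is stable, so cues with the same inflection stay in their original script order. Keeping the label set small is what makes a ranking like this practical.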

LOCATION: Here, indicate where in the game the dialogue is taking place. This will vary, depending on what type of game you're working on. For example, in a wide-open game like Morrowind or GTA, you might indicate a type of environment (indoor shop). For a more structured game, like a mission-based shooter, you might indicate Mission 2, Tenement Building.

AREA: For the Area field, I always enter a number that can be sorted easily, or arranged chronologically without too much fuss. It makes it easier to answer questions if someone asks how many voice cues Character X has in mission 3. But, again, it depends on what kind of game you're working on, and how rigidly structured the game experience is. You may find the Area field to be superfluous, or you may add more fields to the spreadsheet. If your game is split up into multiple areas and levels and sections, additional columns may be necessary.
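
With a numeric Area value, the "how many cues does Character X have in mission 3" question becomes a simple tally. The sketch below assumes each row carries an actor name and an integer area; the `cues_per_actor_and_area` helper and the sample rows are hypothetical.

```python
from collections import Counter


def cues_per_actor_and_area(rows):
    """Count cues for every (actor, area) pair in the sheet."""
    return Counter((row["actor"], row["area"]) for row in rows)


# Hypothetical rows; only the two relevant columns are shown.
rows = [
    {"actor": "Jason", "area": 3},
    {"actor": "Jason", "area": 3},
    {"actor": "Maria", "area": 3},
    {"actor": "Jason", "area": 4},
]

counts = cues_per_actor_and_area(rows)
print(counts[("Jason", 3)])  # 2
```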

EFFECT: Here, I indicate any effects that need to be applied to the voice cue. This includes radio futz, echoes, distortion, and so on. It's something that can be filtered at the end of the process, and handed off to the sound designer or programmer, in order to streamline the process for him/her. It can also be used by QA to ensure that all applicable effects have been added to the game. And it's also good for voice actors, because there's a difference between yelling across a freeway at someone, and yelling into a radio. The additional contextualization can help during the recording session.

FILENAME: Once you've got a handle on how many voice cues you're going to be working with (dozens, hundreds, thousands), you can start to plan your naming convention. If you're lucky, there's a convention in place. Otherwise, you may be the person responsible for creating one. I've found that the best process is to create a convention that A) is easily sorted in chronological order, B) tells the reader where the cue appears in-game, and C) leaves room for the additional cues that inevitably get recorded in pick-up sessions.
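
As one illustration of criteria A through C, a convention might zero-pad the mission and area numbers (so names sort chronologically as text), embed them in the name (so the reader can tell where the cue appears), and number cues in steps of ten (so pick-up lines can slot in between). The pattern, the `cue_filename` helper, and the step size below are assumptions, not a convention taken from any particular project.

```python
def cue_filename(mission, area, sequence, actor, step=10):
    """Build a sortable name such as 'm02_a03_0010_jason'."""
    return f"m{mission:02d}_a{area:02d}_{sequence * step:04d}_{actor.lower()}"


# Two cues from the original session.
names = [cue_filename(2, 3, n, "Jason") for n in (1, 2)]

# A hypothetical pick-up line recorded later, placed between them.
pickup = "m02_a03_0015_jason"

for name in sorted(names + [pickup]):
    print(name)
```

Putting the sequence number ahead of the actor name keeps a plain alphabetical sort in chronological order even when several characters speak in the same area.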

The single biggest advantage to this process is that it accounts for the nonlinearity that drives pretty much all gameplay. The passive format reads like a movie script, and is suited to a linear experience such as a cinematic, but it's just not well-suited to in-game dialogue.


There's not much left to say except that the driving principle behind all of these examples is: communicate, learn, plan, organize, and execute. By familiarizing yourself with all the moving parts, you're more likely to create the emotional, story-driven experience that we're all aiming for. In order to protect the integrity of your story and characters, you need to anticipate the myriad complications that arise during development, and you must be flexible and creative in your solutions to these issues.

Best of luck.


