Playing by Ear: Creating Blind-Accessible Games
By Gavin Andresen
May 20, 2002
Have you ever played a game with a configuration option to turn off the graphics?
I'm not talking about an option to turn down the level of detail or switch
off textures, but to turn off the graphics completely?
How many games have you played with options to turn off the sound?
Most people can't imagine playing a videogame with no graphics - even the name indicates that videogames are a visual activity. At Zform, we've decided to be different from most game companies. We're developing games with parallel graphical and audio user interfaces (GUIs and AUIs). In our case, we're doing it because we want to bring the excitement of online multiplayer competition to visually impaired people around the world.
There are over 7 million people in the U.S. who can't see well enough to read this magazine article. Many millions more need to find their glasses to read it. The percentage of the population that has trouble seeing grows every year as the baby-boom generation ages. If you'd like to sell your game to the largest possible number of people, you should think about using audio to reinforce the information you present graphically.
Another area where audio interfaces shine is on nontraditional gaming platforms such as mobile phones or PDAs. Perhaps the next blockbuster gaming platform will be audio-based games running on portable MP3 players. After all, MP3 players have all the requirements of a good gaming platform: lots of memory, a fast CPU, high-quality stereo sound, and several buttons for user input. The lack of a high-resolution color display shouldn't impede a creative game designer.
So what if you're not creating games for visually impaired players? Even if you are creating another first-person shooter with a target demographic of able-bodied 18-to-34-year-old males, you should still consider using audio for more than just gunshots, grunts, and death screams. No matter what type of game you are creating, paying careful attention to the audio user interface and 3D audio environment will enhance the player's experience.
In this article, I'll be describing the techniques we use to create an audio user interface for a first-person 3D game we're developing. Our goal is to create an interesting, compelling 3D environment in which both blind and sighted players can compete as equals.
We decided to use the Quake 1 engine as our technology base for several reasons. First, older technology runs well on older machines. Blind folks usually don't have the latest and greatest PCs with state-of-the-art sound and video cards. Our target system is a 200MHz Pentium with VGA graphics and any DirectX 7-compatible sound card. Second, Quake 1 is open source. Kudos to id Software for making it available as a starting point for innovative projects. Finally, having the full source code was essential: we knew that no matter what engine we chose, we'd have to make lots of modifications to the audio and navigation code to create a blind-accessible game.
All of our audio is created in 22kHz, 16-bit format and played back in stereo via DirectSound. We assume that our blind players can hear stereo sound (that they're not deaf in one or both ears).
2D Audio Interface
Our first task was to make all of the introductory menus and text audible as well as graphical. A little bit of programming extended the menu and option GUI to play back arbitrary sound files instead of making the generic Quake "clank" sound. It was simple to record somebody reading each of the menu entries so that each entry is identified aurally when selected. Some of the game options were trickier than others, such as entering an IP address to set up a multiplayer game, but none was too difficult.
One simple rule we followed that many other games do not was to make narrations interruptible. This was especially important for our audio menus; it's no fun to listen to six options play back when you know you want the seventh.
Speaking of narrations, another thing we did that was very effective was to use the game's main character voice for all of the game's menus. Our main character, Momo the monkey, has a distinct, silly accent. Using Momo's voice for the initial game-setup menus was a great way to introduce the player to Momo and to set the right mood for the rest of the game.
Navigating by Ear
Giving players enough cues to let them know where they are in the 3D world, but not so many that their ears are overwhelmed with sound, was our biggest challenge. There is a lot of information to convey, and the trickiest pieces tend to be navigation issues that are nonexistent or trivial in a graphical interface, such as the location of the exits. For example, a simple solution for making exits easy to find in a graphical user interface is to make them look like familiar exits in the real world (such as doors and corridors).
Exits. Initially, we installed doors at all of the exits of each room in our level (gameplay occurs in an indoor environment). Our doors were of the electric Star Trek variety, automatically sliding open as you approach, so it seemed natural to have them emit an electrical humming noise. The idea was that when standing in the center of the room, you would be able to identify the exits just by the noises they were making; if you heard a hum to your left, you would know there was a door to your left.
That implementation was a complete failure. It was difficult to navigate out of any room with more than one exit, unless you cheated and peeked at the screen. If you stumbled around long enough you'd eventually hear a door make its opening "whoosh" sound, but even then it was hard to navigate through the doorway.
We tried several variations - we gave each individual door a slightly different sound, created doors that beeped instead of hummed, and tweaked attenuation parameters so you didn't hear doors until you were fairly close to them. None of them worked very well.
We finally realized that the doors weren't really what the player needed to hear. Blind players actually needed audio cues to navigate into the room or hallway that lay beyond the door. We ripped out the doors and installed air conditioning vents - point audio sources with a pleasant hum - into the center of each hallway, adjusting their volume and attenuation so that they could be heard down the length of the entire hallway and slightly into the adjoining rooms. We also modified the game engine so that walls occluded point audio sources.
After making those changes, finding exits by ear became easy. When you hear the air conditioning noise, you have an unoccluded path into the hallway. You then rotate yourself until the noise is coming from directly ahead (so it has the same volume in both ears). You can then simply walk forward into the hallway (see Figure 1).
Figure 1. This point source audio entity, with an attenuation radius of r, cannot be heard by player 2 due to occlusion.

The same technique can be used to make it a little easier for sighted players to find hidden passageways and rooms in a level. Observant sighted players will notice a quiet hum coming from their left as they walk by a hidden exit and will use the audio to navigate their way into the hidden area. This is one case where blind players would have an advantage; hidden passages would sound the same as any other exit.
Footsteps, bumps, and scrapes. As in many games, we generate footstep sounds as the player walks. They are a tried-and-true solution to the problem of giving players feedback on whether or not they're moving and letting them know how fast they're moving. In fact, it would have been more work to make movement silent, since footsteps are hard-coded into the Quake 1 engine.
We have, however, extensively modified the audible movement cues to make blind navigation easier. Originally, the game engine played a simple "ugh" sound when the player walked into a wall. Sighted players can easily see if they've walked directly into a wall and are stuck, or walked into it at an angle and are sliding along it. To give blind players the same information, we adjusted the stereo pan of the "ugh, I ran into a wall" noise based on the angle at which they ran into the wall. If you run into the wall with your left shoulder, you hear it in your left ear; walk straight into a wall and you hear it equally in both ears. We also added a scraping noise to indicate that a player is moving forward but contacting the wall. The scrape sound is stereo-panned in the same way as the bump sound.
Supporting artificial stereo panning did require us to modify the game engine's sound code. We added an extra floating-point parameter (balance) to the play-a-sound function. Zero is the normal setting, which plays the sound using the standard 3D stereo spatialization algorithm. A value of -1 results in the sound occurring completely in the left ear, and +1 results in a sound completely in the right ear. Of course, values in between pan the sound from left to right.
Getting oriented. Letting players know which way they're facing is always a challenge when you allow unrestricted movement in a 3D world. As in many games, we simplify the problem by restricting movement to a 2D ground plane. Therefore, the player's orientation can be described in north/south/east/west terms; players can't fly up and down. Even on a 2D plane, however, after a few turns down a series of passageways it is easy to lose track of which way you're facing. We've found that some of the same techniques help both blind and sighted players keep themselves oriented.
The most basic technique is to simplify level design. Unless getting lost is part of the game, avoid creating a maze of twisty passages, all of which look and sound alike.
Another technique we use is to build a consistent orientation cue into the 3D environment. For sighted players in an outdoor environment, that might be moss on the north sides of trees, or clouds in the sky that always move from west to east. In our case, we modified the hallway air-conditioning noises so that north-south-oriented hallways make a slightly different noise from east-west hallways.
We also bound a keyboard key to an audible compass. Pressing the key announces which way the player is facing, rounded to the nearest compass point (such as "northwest" or "south"). Implementing the audible compass was much easier than a graphical compass and gives the same information.
Audible objects. Besides exits and walls, the other objects that sighted players can see and that blind players need to hear are the other players (our game is multiplayer) and any object that can be picked up, poked, or otherwise used to affect gameplay.
The other players are easy to hear, since they're making footstep, wall-bump, and scrape noises as they walk around the level. We do implement special code so the artificial stereo panning of the bump and scrape noises (indicating the angle of impact) is only done for your own bumps and scrapes. As other players bump into walls, you hear their grunts as ordinary point sound sources, emanating from the point at which they hit the wall.
All of the other visible objects in our game are assigned a reasonable, regular idling sound so they are always audible. We've created a noisy, silly, fun environment, choosing objects that appeal to both the eyes and the ears. Walk around a level and you might hear chickens clucking, a grandfather clock ticking, and pigs squealing as they're picked up and thrown.
All of these noises are occluded by the walls of the level, which limits the number of sound sources audible at any one time and prevents blind players from trying to walk through walls to get to objects that they can hear but can't see. Sound occlusion is a wonderful thing. Our implementation silences sounds if the line segment from the center of the listener's head to the center of the sound source intersects any of the walls of the level. The result is not physically accurate, but works very well as a navigation aid and was easy to implement. Occlusion also prevents sighted players from becoming frustrated trying to find a path to an object that they can hear right next door.
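The segment test can be sketched as follows. The real engine would check against Quake's level geometry (a BSP trace); purely for illustration, this sketch stores each wall as a thin axis-aligned box and uses the standard slab method for segment-versus-box intersection.

```c
#include <math.h>

typedef struct { float min[3], max[3]; } wall_box_t;

/* Does the segment from a to b pass through the box? (slab method) */
int segment_hits_box(const float a[3], const float b[3], const wall_box_t *w)
{
    float tmin = 0.0f, tmax = 1.0f;   /* clip segment a..b to the box */
    for (int i = 0; i < 3; i++) {
        float d = b[i] - a[i];
        if (fabsf(d) < 1e-6f) {
            if (a[i] < w->min[i] || a[i] > w->max[i])
                return 0;             /* parallel and outside this slab */
        } else {
            float t1 = (w->min[i] - a[i]) / d;
            float t2 = (w->max[i] - a[i]) / d;
            if (t1 > t2) { float tmp = t1; t1 = t2; t2 = tmp; }
            if (t1 > tmin) tmin = t1;
            if (t2 < tmax) tmax = t2;
            if (tmin > tmax)
                return 0;             /* slabs don't overlap: no hit */
        }
    }
    return 1;
}

/* A source is silenced if any wall occludes the listener-to-source segment. */
int source_audible(const float listener[3], const float source[3],
                   const wall_box_t *walls, int num_walls)
{
    for (int i = 0; i < num_walls; i++)
        if (segment_hits_box(listener, source, &walls[i]))
            return 0;
    return 1;
}
```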
Ambient noises. After working through the navigation considerations to allow blind folks to move around our 3D world, we then added audio decoration to make the world more interesting. We tried to give each part of the level a distinct character by adding audible landmarks. For example, you might hear the sounds of pots and pans clanking in the kitchen or hear a stately old grandfather clock ticking in the study. We added a generic arbitrary_noise object type to make it easy for our level designer and audio engineer to sprinkle interesting sound throughout the level.
We also implemented a quick and dirty form of environmental audio. If our minimum system requirements had allowed it, we would have used EAX environmental audio (more on that later).
Figure 2. Room center entities create different auditory environments for each room.

We created room_center objects and placed them around the level (see Figure 2). They are simply invisible boxes that the level designers used to mark out the various rooms in the level. One of the attributes of the room_center object is footstepNoise. By using different footstep noises for different rooms, we give the impression of the player being in different environments. A carpeted study has quiet footsteps, while a kitchen has sharp, echoing footsteps. It was easy to modify the game engine's footstep-playing code to play the appropriate footstep noise depending on what room_center object the player is in.
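That lookup might be sketched like this in C; the structure layout and field names below follow the description above, not actual engine code.

```c
/* A room_center: an invisible box marking a room, carrying the
   footstep sample set to use inside it (illustrative layout). */
typedef struct {
    float min[3], max[3];
    int   footstep_noise;
} room_center_t;

#define DEFAULT_FOOTSTEP 0

/* Return the footstep sound for whatever room contains the player. */
int footstep_for_position(const float pos[3],
                          const room_center_t *rooms, int num_rooms)
{
    for (int i = 0; i < num_rooms; i++) {
        const room_center_t *r = &rooms[i];
        if (pos[0] >= r->min[0] && pos[0] <= r->max[0] &&
            pos[1] >= r->min[1] && pos[1] <= r->max[1] &&
            pos[2] >= r->min[2] && pos[2] <= r->max[2])
            return r->footstep_noise;
    }
    return DEFAULT_FOOTSTEP;   /* hallways etc. fall back to a default */
}
```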
Sounds other than footsteps should also be affected by the sonic properties of the room. That's easy for any audio sources that are part of the room; our audio engineer just "precompiles" the room's environment into the sound file. In our game, objects that can walk or be carried into rooms sound the same no matter where they are, which is unfortunate but not a huge problem.
After conquering the navigation problems, making all of the gameplay accessible via an audio-only interface was easy and fun. One of the goals in our game is to collect a set of objects, so we had to figure out how to tell players what they need to collect. We associate a name sound with each type of object. The name sound is just the narrator reciting the name of the object (for example, "chicken" or "water balloon"), so telling players what they need to collect is just a matter of stringing together a narrative introduction ("To complete the whatchamacallit, you'll need . . .") with the name sounds for each item on the list.
We also play the name sound when the player bumps into an object. This improves gameplay for both blind and sighted players, especially for objects that might look or sound unfamiliar. I played Quake 1 for several days before I figured out that the floating blue Q-like thing made me do more damage to enemies. I had never noticed the tiny text message "You got the quad damage" scroll by on the status bar; I would have been clued in more quickly if I had heard "Quad damage!" announced when I ran across it, as happens in Quake 2 and 3, instead of hearing a generic beep sound.
We could also use name sounds to implement an audible inventory, reciting the list of objects that the player is holding. However, we've chosen to limit the number of objects a player can hold to just two (one in each hand), so instead we just play the objects' idling sounds once when the inventory key is pressed. We put artificial stereo panning to good use again, playing the left-hand object's sound in the left ear and the right-hand object's sound in the right ear.
We follow a couple of general design principles to ensure our game is fully accessible to blind players. First, we make sure that if two items look different, they also sound different. That isn't usually a problem; most objects in the real world make unique sounds, if they make any sound at all. We just avoid populating our game with items that make no sound.
We also make sure that item or game state changes are accompanied by audio cues. For example, items make a "grabbed" sound when they are picked up. Pick up a chicken and you hear it squawk. While it's in your hand, it will make a disgruntled clucking noise, instead of its normal, "I'm a happy chicken" noise.
Stuff We Haven't Figured Out Yet
As I write this, there are still a few problems that we haven't solved and a few solutions that we haven't tried. The thorniest issue is the up/down, front/back problem.
We are using the simplest possible stereo spatialization algorithm for 3D sound sources, which makes it impossible to distinguish whether a sound source is behind or in front of (or above or below) the listener. Preliminary experimentation with the HRTF (head-related transfer function) algorithms built into DirectX is discouraging - the more complicated algorithms sound better but aren't good enough to tell players whether objects are ahead of or behind them. We are currently experimenting with nonrealistic techniques to indicate the fore/aft position of objects with respect to the listener. Since our game doesn't require unrestricted, up-and-down 3D movement in order to be fun, we're not going to do anything to indicate the up/down position of objects.
Player-generated text, such as player names or text chat, is a problem for which we don't yet have a solution. The standards for accessing text-to-speech functionality under Windows are just emerging, so even though we can assume that all of our blind players have a speech synthesizer for converting text into speech already installed on their systems, we have no way of sending text to the synthesizer to be spoken. We may end up licensing a synthesizer to include with the game, but for now we're simply avoiding features that would require text-to-speech conversion.
We intend to reintroduce doors to the game, because the simple mechanisms of allowing doors to be open or shut and locked or unlocked will add strategic elements and make the game more interesting. When we do, we will probably modify the occlusion algorithm so that closed doors muffle sounds coming through them. That, combined with hallway and room noises, should solve the problems we had earlier with blind players being unable to find the exits in the level. We will still have to figure out how to tell a blind player there is an open door nearby that can be closed or locked.
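As a speculative sketch of how that muffling might work, each closed door crossed by the listener-to-source segment could scale the sound's gain down rather than silencing it outright; the 0.25 factor below is an invented placeholder, not a tuned value from our game.

```c
/* Attenuate, rather than silence, a sound for each closed door
   between listener and source (placeholder muffle factor). */
float muffled_gain(float base_gain, int closed_doors_crossed)
{
    float gain = base_gain;
    for (int i = 0; i < closed_doors_crossed; i++)
        gain *= 0.25f;   /* each closed door cuts the level further */
    return gain;
}
```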
As mentioned earlier, we are using a quick and dirty hack to approximate true environmental audio. Implementing EAX environmental audio is on our list of things to do, but has been a low priority because we can't assume that our players will have an EAX-capable sound card. We think that supporting EAX will increase the quality of the game, but don't think it will improve accessibility or gameplay.
Better Games for Everybody
Oxo's Good Grips brand kitchen tools were designed for people with arthritis or other joint problems (see www.oxo.com/eyeonoxo for the full story). Oxo has been hugely successful selling them to able-bodied people. I don't have arthritis, but I own their potato peeler, ice cream scoop, and cheese grater. Designing products that work for people with disabilities creates products that work better for everybody. All of the techniques we've used to make Zform games blind-accessible can be applied to any game. None of them makes the game any harder for sighted people to play; on the contrary, most of them either reinforce the graphical interface or make the game more interesting and fun.
Copyright © 2003 CMP Media Inc. All rights reserved.