Talking Heads: Facial Animation in The Getaway

The System

We have talked about the basics of facial animation, why we chose a skeleton-based system, and how we put this into practice. The next step is to explain exactly how Talking Heads works.

As I've mentioned before, the point of a system like this is to reduce the workload and demands on a small group of animators working on a large project. The only way that this can happen is to hand over some of the more tedious tasks of facial animation to the computer.

Our facial animation system works on three levels: the first is concentrated around achieving believable lip-synching, the second around laying down blocks of emotions, and the third on underlying secondary animation such as blinking or breathing.

Lip-synching. The first step is to record an uncompressed 44kHz .WAV file of the chosen actor and script. Your script should contain a series of natural pauses; a good actor or voice-over artist should give you these automatically. Remember, you want the best performance you can get. The sound file contains all the hints you will need to animate emotions and will carry your animation. The pauses also aid the system, allowing it to work out where it is in the .WAV file when it calculates the phonemes.

We then create a text file, which is an exact script of the .WAV file. During the creation of the phonemes, the text file is matched against a phoneme dictionary. There are many such dictionaries on the web; it's just a matter of finding a free one (see For More Information). The dictionary contains a huge list of words and their phoneme equivalents. By checking the script against this dictionary, the system determines the phonemes required to make the words. Some obscure words are not covered, and we enter these into our dictionary by hand.
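As a rough illustration of the dictionary lookup described above, here is a minimal Python sketch. It is not the Talking Heads code: the dictionary entries, the phoneme symbols, and the function name `script_to_phonemes` are all assumptions made for the example. It splits a script into words, looks each one up, and collects the words that would need to be entered by hand.

```python
import re

# Hypothetical mini-dictionary: word -> list of phoneme symbols.
# Entries and symbols are illustrative only, not the article's data.
PHONEME_DICT = {
    "hello": ["HH", "AH", "L", "OW"],
    "there": ["DH", "EH", "R"],
}

def script_to_phonemes(script, dictionary):
    """Return (per-word phoneme lists, words missing from the dictionary)."""
    words = re.findall(r"[a-z']+", script.lower())
    found, missing = [], []
    for word in words:
        phonemes = dictionary.get(word)
        if phonemes is None:
            missing.append(word)  # would be added to the dictionary by hand
        else:
            found.append((word, phonemes))
    return found, missing

found, missing = script_to_phonemes("Hello there, guvnor.", PHONEME_DICT)
```

Here `missing` ends up holding the one word the toy dictionary does not cover, which mirrors the manual step for obscure words.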

Most of the development time of Talking Heads was taken up working out how to parse the .WAV file. This is all custom software which enables us to scan through our sound file and work out the timings between the words. We also work out the timing between phonemes, which is very important.
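The article does not say how the custom parser finds its timings. One common approach, shown here purely as an assumed stand-in, is to look for quiet stretches in the audio and treat them as the pauses between words or phrases. The function name `find_pauses`, the amplitude threshold, and the minimum pause length are all hypothetical, and samples are assumed to be floats in the range -1.0 to 1.0.

```python
def find_pauses(samples, sample_rate, threshold=0.02, min_pause=0.15):
    """Return (start_sec, end_sec) for each quiet run at least min_pause long."""
    pauses = []
    run_start = None  # index where the current quiet run began, if any
    for i, sample in enumerate(samples):
        if abs(sample) < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # A loud sample ends the quiet run; keep it only if long enough.
            if (i - run_start) / sample_rate >= min_pause:
                pauses.append((run_start / sample_rate, i / sample_rate))
            run_start = None
    # Handle silence that runs to the end of the file.
    if run_start is not None:
        if (len(samples) - run_start) / sample_rate >= min_pause:
            pauses.append((run_start / sample_rate, len(samples) / sample_rate))
    return pauses

# Toy signal at 100 samples per second: speech, a long pause, speech,
# a gap too short to count, then speech again.
signal = [0.5] * 50 + [0.0] * 30 + [0.5] * 50 + [0.0] * 5 + [0.5] * 10
pauses = find_pauses(signal, sample_rate=100)
```

Only the 0.3-second gap is reported; the 0.05-second gap falls under the minimum and is ignored.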

Talking Heads then lays down keyframes for the phonemes in Maya. It does this by taking the information from the dictionary and the .WAV file and matching them, phoneme against length of time. As mentioned before, these keys are assigned to the locator that controls the phonemes. This allows an animator to edit the phonemes easily at a later stage, or to create a completely new phoneme animation if the producer decides to change the script. So a one-minute animation that could take a week to animate by hand can be created in half an hour. Then the animator is free to refine and polish as he sees fit.
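To make the idea of matching phonemes against length of time concrete, here is a small Python sketch. It assumes each word already has a start and end time and simply spreads that word's phonemes evenly across it; the even split and the name `phoneme_keys` are assumptions, and the real system may weight phonemes differently. The output is a plain list of key times rather than actual Maya keyframes.

```python
def phoneme_keys(timed_words):
    """timed_words: list of (start_sec, end_sec, [phonemes]).

    Returns (time_sec, phoneme) keys, splitting each word's duration
    evenly between its phonemes (an assumed simplification).
    """
    keys = []
    for start, end, phonemes in timed_words:
        if not phonemes:
            continue  # nothing to key for this word
        step = (end - start) / len(phonemes)
        for i, phoneme in enumerate(phonemes):
            keys.append((round(start + i * step, 3), phoneme))
    return keys

keys = phoneme_keys([
    (0.0, 0.4, ["HH", "AH", "L", "OW"]),
    (0.5, 0.8, ["DH", "EH", "R"]),
])
```

Each resulting pair could then be written as a keyframe on the phoneme locator by whatever scripting interface the animation package provides.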

One advantage to the system is the creation of language SKUs. We produce products for a global market, and there is nothing more frustrating than re-doing tedious lip-synching for each country. Talking Heads gets around this problem quite efficiently. You have to create a phoneme set for each language and find a corresponding phoneme dictionary, but once you have done this the system works in exactly the same way as before. You can lay down animations in English, French, German, Japanese, or whatever language you wish.

Emotions. The next step is to add blocks of emotion. To do this we edit the text file that we created from the .WAV file. A simple markup language is used to define various emotions throughout the script.

As you can see, emotions are added and given values. These values correspond with those on the emotion locator. An Anger value of 2.2 gives the character a slight sneer, and by the end of this sentence the character would smirk. In this way, huge amounts of characterization can be added. We video our actors at the time we record the sound, either in the sound studio or the motion capture studio. We can then play back the video recording of the scene we are editing and lay down broad emotions using the actor's face as a guideline.
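The sketch below shows one way such a marked-up script could be read. The exact Talking Heads syntax is not given in this text, so the bracketed `[emotion value]` tags, the emotion names, and the function name `parse_emotions` are purely hypothetical. The parser strips the tags out of the script and records, for each tag, the index of the word at which the emotion takes effect.

```python
import re

# Hypothetical tag format: [name value], for example [anger 2.2].
TAG = re.compile(r"\[(\w+)\s+(\d+(?:\.\d+)?)\]")

def parse_emotions(marked_up):
    """Return (plain_text, [(word_index, emotion, value), ...])."""
    events, words = [], []
    # Split while keeping anything in square brackets as its own token.
    for token in re.split(r"(\[[^\]]*\])", marked_up):
        match = TAG.fullmatch(token)
        if match:
            events.append((len(words), match.group(1).lower(),
                           float(match.group(2))))
        else:
            words.extend(token.split())
    return " ".join(words), events

text, events = parse_emotions(
    "[anger 2.2] I told you to wait [smirk 1.0] outside."
)
```

Because the output is just plain text plus a list of events, the stripped script can still be fed to the phoneme lookup unchanged, while the events drive the emotion locator.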

The advantage of editing a text file is that anyone can do it. You do not have to be an animator or understand how a complicated software package works. As long as the person who is editing knows what the different emotion values look like, they can edit any script. Using the video of the actor's face allows anyone to see which emotions should be placed where and when.

Later on, an animator can take the scene that has been set up using the script and go in and make changes where necessary. This allows our animators to concentrate their talents on more detailed facial animation, adding subtlety and characterization by editing the sliders in the animation system and laying keys down by hand.

Specials. The third area to be covered by the Talking Heads system concentrates on a wide range of subtle human movements. These are the keys to bringing your character to life. Talking Heads takes the text file and creates emotions from the markup language as it matches phonemes and timings. It also sets about laying down a series of secondary animations and keying these to the third locator. As mentioned before, this locator deals with blinking, random head motion, nodding and shaking of the head, breathing, and so on.

Blinking is controlled by the emotion that is set in the text file. The normal blinking rate is once every four seconds. If the character has anger set using the markup language, the system sets blinking keyframes only once every six seconds: when angry, the face takes on a scowl, the eyes open wide, and blinking is reduced to show as much of the whites of the eyes as possible. If the character is lying or acting suspiciously, the rate increases to once every two seconds. The system holds a blink interval for each emotion and uses the emotion with the highest value as the prime emotion for blinking. Also added is a slight randomness, which will occasionally key in a double blink.
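The blinking rules can be sketched in a few lines of Python. The three intervals come from the text; everything else is an assumption for illustration, including the emotion names used as keys, the 0.25-second gap for a double blink, the 10 percent double-blink chance, and the function name `blink_times`.

```python
import random

# Seconds between blinks per emotion (values from the article text).
BLINK_INTERVAL = {"anger": 6.0, "normal": 4.0, "suspicious": 2.0}

def blink_times(emotions, duration, double_chance=0.1, seed=0):
    """emotions: dict of emotion name -> value. Returns blink key times in seconds."""
    rng = random.Random(seed)  # seeded so a run is repeatable
    # The emotion with the highest value is the prime emotion for blinking.
    prime = max(emotions, key=emotions.get) if emotions else "normal"
    interval = BLINK_INTERVAL.get(prime, BLINK_INTERVAL["normal"])
    times = []
    t = interval
    while t <= duration:
        times.append(t)
        if rng.random() < double_chance:
            times.append(t + 0.25)  # assumed gap for the occasional double blink
        t += interval
    return times

angry = blink_times({"anger": 2.2, "suspicious": 0.5}, 20, double_chance=0.0)
```

With double blinks switched off, the angry character blinks at 6, 12, and 18 seconds.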

Random head motion is keyed only when keyframes are present for phonemes. This means that the character always moves his head when he is speaking. This is a subtle effect, so be careful with the movement; a little goes a long way. The next pass looks for positive and negative statements. It tracks certain words such as "yes, no, agree, disagree, sure, certainly, never." When it finds such words, it sets keyframes for nodding and shaking of the head. Using the timing from the script, it applies a set of decreasing values to the nod and shake head Set Driven Keys. This gives us very realistic motion.
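A sketch of the nod and shake pass might look like the following. The word list is the one quoted above, but sorting it into positive and negative sets is an assumption, as are the decay values, the key spacing, and the name `head_gesture_keys`. Each matched word produces a short run of keys with decreasing amplitude.

```python
# Assumed split of the tracked words into positive and negative sets.
POSITIVE = {"yes", "agree", "sure", "certainly"}
NEGATIVE = {"no", "disagree", "never"}

def head_gesture_keys(timed_words, decay=(1.0, 0.5, 0.25), spacing=0.2):
    """timed_words: list of (start_sec, word).

    Returns (time_sec, attribute, value) keys, where attribute is
    "nod" or "shake" and the values decrease over successive keys.
    """
    keys = []
    for start, word in timed_words:
        cleaned = word.lower().strip(".,!?\"'")
        if cleaned in POSITIVE:
            attribute = "nod"
        elif cleaned in NEGATIVE:
            attribute = "shake"
        else:
            continue  # not a tracked word
        for i, amplitude in enumerate(decay):
            keys.append((round(start + i * spacing, 3), attribute, amplitude))
    return keys

gesture_keys = head_gesture_keys([(0.0, "Yes,"), (1.0, "I"), (1.5, "never")])
```

The decreasing amplitudes are what make the motion settle naturally instead of stopping dead after a single nod or shake.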

Breathing is automatic; the system keys values when it reaches the end of a sentence. This value can differ depending on the physical state of the character. Normal values are hardly detectable, while extreme values mimic gasping for breath.

At this stage the system also creates keys for random eye motion. This keeps the character alive at all times. If your character stops moving at any point, the illusion of life is broken.

Set up and ready to go. Once everything has run through Talking Heads, we have a fully animating human head. At this stage an animator has not even overseen the process. Our character blinks, breathes, moves, talks, and expresses a full range of human emotion.

At this point we schedule our animators onto certain scenes and they make subtle changes to improve the overall animation, making sure that the character is reacting to what other characters are saying and doing.

More Refined in Less Time

The process of creating Talking Heads has been a long nine months, and still changes are being made. We continue to tinker and evolve the system to achieve the most believable facial animation seen in a computer game. Whether we have done this successfully will only be seen when The Getaway is eventually released.

The next step is to bring Talking Heads into real time. This would allow our in-game NPCs to react to whatever the player does. Work on this is already under way, and we hope to see it in The Getaway.

Facial animation can be achieved without huge animation teams. The process of creating Talking Heads has been an extremely worthwhile experience. We are now able to turn out excellent animations in very short times. Our team of animators is free to embellish facial animation, adding real character and concentrating their efforts on creating the huge amount of animation required for in-game and cutscenes.

Gavin Moore has worked in the games industry for 10 years. He is currently the senior animator on The Getaway at Sony Computer Entertainment Europe's Team Soho. He is in charge of a team of artists and animators responsible for all aspects of character creation and animation in the game. Gavin can be reached at Gavin_Moore@scee.net.


For More Information

Books
Faigin, Gary. The Artist's Complete Guide to Facial Expression. New York: Watson-Guptill, 1990.

Fleming, Bill, and Darris Dobbs. Animating Facial Features and Expressions. Rockland, Mass.: Charles River Media, 1999.

Parke, Frederic I., and Keith Waters. Computer Facial Animation. Wellesley, Mass.: A. K. Peters, 1996.

Web Sites
HighEnd3D
www.highend3d.com

3dRender.com
www.3Drender.com

Dictionaries and English Vocabulary Resources
www.notredame.ac.jp/~peterson/URL/research/dictionaries.html

Copyright © 2003 CMP Media LLC
