Gamasutra: The Art & Business of Making Games
Read My Lips: Facial Animation Techniques

April 6, 2000

Art Issues

These viseme descriptions are enough to realistically represent speech. However, the use of individual visemes is more an artistic judgement than a hard rule. When speaking, people tend to slur phonemes together; they do not clearly articulate each phoneme all the time. Also, the look of a viseme can change depending on the visemes that surround it. For example, the Disney guidelines describe the use of a slightly different viseme for B, P, and M if they precede the “ea” sound, as in “beat.”
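As a rough illustration of that kind of context rule, the sketch below picks a viseme for each phoneme and swaps in a variant when B, P, or M comes right before the “ea” sound. The phoneme symbols, the viseme names, and the function pick_visemes are all made up for this example; they are not taken from the Disney guidelines or from any particular tool.

```python
# Hypothetical phoneme-to-viseme table; every name here is illustrative only.
BASE_VISEME = {
    "B": "closed_lips",
    "P": "closed_lips",
    "M": "closed_lips",
    "IY": "wide_ee",       # stands in for the "ea" sound in "beat"
    "AY": "open_ah",       # stands in for the vowel in "buy"
    "T": "tongue_teeth",
}

# Context overrides: (phoneme, following phoneme) -> variant viseme.
CONTEXT_VISEME = {
    ("B", "IY"): "closed_lips_wide",
    ("P", "IY"): "closed_lips_wide",
    ("M", "IY"): "closed_lips_wide",
}


def pick_visemes(phonemes):
    """Return one viseme name per phoneme, looking one phoneme ahead."""
    visemes = []
    for i, phoneme in enumerate(phonemes):
        following = phonemes[i + 1] if i + 1 < len(phonemes) else None
        # Use the context-specific variant if one exists, else the default.
        default = BASE_VISEME[phoneme]
        visemes.append(CONTEXT_VISEME.get((phoneme, following), default))
    return visemes
```

With this table, pick_visemes(["B", "IY", "T"]) (“beat”) uses the variant shape for the B, while pick_visemes(["B", "AY"]) (“buy”) keeps the plain closed-lips shape. A table like this only captures the rules someone has written down, so an artist would still override the result wherever it looks wrong.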

This dependency on surrounding sounds is called co-articulation, and it makes viseme choice more complicated. This is one reason that the automatic phoneme recognition software in some packages doesn’t always provide realistic results. Smooth interpolation between viseme keyframes can help, but this alone may not be good enough. In many cases, deciding which viseme really looks best takes artistic judgement. In computer animation, realistic looks are all that matter. So, when you work, put in the viseme that looks best.
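To make “smooth interpolation between viseme keyframes” concrete, here is a minimal sketch that turns a list of timed viseme keyframes into blend weights at an arbitrary time. The keyframe layout, the function names, and the choice of a smoothstep ease curve are assumptions made for illustration; an engine could just as well use linear or spline interpolation.

```python
def smoothstep(t):
    """Ease-in/ease-out curve on [0, 1]; input is clamped to that range."""
    t = max(0.0, min(1.0, t))
    return t * t * (3.0 - 2.0 * t)


def viseme_weights(keyframes, time):
    """Blend weights for the visemes on either side of `time`.

    keyframes: non-empty list of (time_in_seconds, viseme_name),
               sorted by time.
    Returns {viseme_name: weight}; the weights sum to 1.
    """
    # Before the first key or after the last one, hold that key's viseme.
    if time <= keyframes[0][0]:
        return {keyframes[0][1]: 1.0}
    if time >= keyframes[-1][0]:
        return {keyframes[-1][1]: 1.0}
    # Find the pair of keys that brackets `time` and ease between them.
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t0 <= time < t1:
            if v0 == v1:
                return {v0: 1.0}
            s = smoothstep((time - t0) / (t1 - t0))
            return {v0: 1.0 - s, v1: s}
```

Halfway between two keys the two visemes get equal weight, and the ease curve slows the motion near each key so the mouth settles into the shape instead of passing through it at constant speed. This only smooths the transitions; it does nothing about picking the right viseme in the first place.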

Emphasis and exaggeration are also very important in animation. You may wish to punch up a sound by using a viseme to punctuate the animation. This emphasis, along with secondary animation that expresses emotion, is key to a believable sequence.

In addition to these viseme frames, you will want to have a neutral frame that you can use for pauses. In fast speech, you may not want to add the neutral frame between all words, but in general it gives good visual cues to sentence boundaries.
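One way to automate the neutral frame is to drop it in only where the gap between spoken sounds is long enough to read as a pause. The sketch below does that; the span layout, the name add_neutral_frames, and the 0.15-second default threshold are all assumptions chosen for the example, not values from any tool.

```python
NEUTRAL = "neutral"  # hypothetical name for the relaxed, mouth-at-rest frame


def add_neutral_frames(spans, min_pause=0.15):
    """Build a keyframe list with neutral frames inserted at pauses.

    spans: list of (start_time, end_time, viseme_name), sorted by time.
    min_pause: smallest silent gap, in seconds, that gets a neutral frame.
    Returns a list of (time, viseme_name) keyframes.
    """
    keyframes = []
    prev_end = None
    for start, end, viseme in spans:
        # A long enough silence after the previous sound gets a neutral key
        # at the moment that sound ended.
        if prev_end is not None and start - prev_end >= min_pause:
            keyframes.append((prev_end, NEUTRAL))
        keyframes.append((start, viseme))
        prev_end = end
    # Relax the mouth once the last sound has finished.
    if prev_end is not None:
        keyframes.append((prev_end, NEUTRAL))
    return keyframes
```

For fast speech, raising min_pause keeps the mouth from snapping back to neutral between every word, so only the longer breaks, such as sentence boundaries, get the neutral frame.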

Side view of the sound [m], as in “my.”
Side view of the sound [b], as in “buy.”
Side view of the sound [p], as in “pie.”

So What Do I Do with This Stuff?

So far, I have been discussing issues that only seem important to the artists working on the facial animation. If the only use of facial animation in your project is for pre-rendered cut scenes, this may be true. However, I believe facial animation will become an important aspect in real-time 3D rendering as we take character simulation to the next level. This requires a close relationship between the art assets and engine features.

As a technical lead on a cutting-edge 3D project, you will be required to create the production pathway that the artists will use to create assets. You will be responsible for deciding how many visemes the engine can support and the manner in which the meshes must be created. Having a clear understanding of what goes into the creation of the assets will allow you to interface more effectively with those creating the assets.

However, even with the viseme count, I am still not ready to set the artists loose creating my viseme frames. There are several basic engine decisions that I must make before modeling begins. Unfortunately, I will have to wait until the next column to dig into that. Until then, think back on my 3D morphing column (“Mighty Morphing Mesh Machine,” December 1998) as well as last year’s skeletal deformation column (“Skin Them Bones,” Graphic Content, May 1998) and see if you can get a jump on the rest of the class.


Special thanks go to my partner in crime, Margaret Pomeroy. She was able to explain to me what was really going on when I made all those funny faces in the mirror. When she was studying ancient languages in school, I am sure she never imagined working on lip-synching character dialog.

For Further Info

• Culhane, Shamus. Animation from Script to Screen. New York: St. Martin’s Press, 1988.

• Ladefoged, Peter. A Course in Phonetics. San Diego: Harcourt Brace Jovanovich, 1982.

• Maestri, George. [digital] Character Animation. Indianapolis: New Riders Publishing, 1996.

• Parke, Frederic I. and Keith Waters. Computer Facial Animation. Wellesley: A. K. Peters, 1996.

Jeff Lander often sounds like he knows what he’s talking about. Actually, he’s just lip-synched to someone who really knows what’s going on. Let him know you are on to the scam at [email protected].
