Art Issues
These viseme descriptions are enough to realistically represent speech. However,
the use of individual visemes is more an artistic judgement than a hard
rule. When speaking, people tend to slur phonemes together. They do
not clearly articulate each phoneme all the time. Also, the look of
a viseme can change depending on the visemes that surround it. For example,
the Disney guidelines describe the use of a slightly different viseme
for B, P, and M if they precede the ea sound as in beat.
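One way to picture a rule like this is as a lookup that checks the following sound before falling back to a default mouth shape. The sketch below is only an illustration under my own assumptions: the phoneme labels (`"B"`, `"EE"`, `"T"`) and viseme names (`"closed_lips"`, `"closed_lips_wide"`, and so on) are hypothetical and do not come from the Disney guidelines or from any particular tool.

```python
# Default viseme for each phoneme. All names here are hypothetical placeholders.
BASE_VISEME = {
    "B": "closed_lips",
    "P": "closed_lips",
    "M": "closed_lips",
    "EE": "wide_ee",
    "T": "tongue_teeth",
}

# Overrides keyed by (phoneme, following phoneme). This models the idea of a
# slightly different B/P/M shape before the "ea" sound, as in "beat".
CONTEXT_VISEME = {
    ("B", "EE"): "closed_lips_wide",
    ("P", "EE"): "closed_lips_wide",
    ("M", "EE"): "closed_lips_wide",
}


def choose_visemes(phonemes):
    """Return one viseme name per phoneme, preferring a context override."""
    visemes = []
    for i, phoneme in enumerate(phonemes):
        following = phonemes[i + 1] if i + 1 < len(phonemes) else None
        override = CONTEXT_VISEME.get((phoneme, following))
        visemes.append(override or BASE_VISEME[phoneme])
    return visemes


# "beat" as a rough phoneme sequence (an assumed transcription).
print(choose_visemes(["B", "EE", "T"]))
```

A table like this only captures the cases someone has thought to list, so the artist still gets the final say on which shape goes into a given frame.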
This dependency on surrounding sounds is called co-articulation, and it makes
viseme choice more complicated. It is one reason that the automatic
phoneme recognition software in some packages doesn’t always provide
realistic results. Smooth interpolation between viseme keyframes can
help, but this alone may not be good enough. In many cases, deciding
which viseme really looks best requires artistic judgement. In computer
animation, realistic looks are all that matter. So, when you work, put
in the viseme that looks best.
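To make "smooth interpolation between viseme keyframes" concrete, here is a minimal sketch that blends per-viseme weights between two neighboring keys. It assumes each keyframe is a time in seconds plus a dictionary of viseme weights, and it uses a smoothstep ease as one possible way to soften the transition. The function names, the data layout, and the viseme names are all my own assumptions, not part of any specific package.

```python
from bisect import bisect_right


def smoothstep(u):
    """Ease-in/ease-out curve on [0, 1]; one of many possible choices."""
    return u * u * (3.0 - 2.0 * u)


def sample_track(keys, t, ease=smoothstep):
    """Blend viseme weights at time t.

    keys is a list of (time, {viseme_name: weight}) sorted by time.
    Times outside the track clamp to the first or last key.
    """
    times = [k[0] for k in keys]
    if t <= times[0]:
        return dict(keys[0][1])
    if t >= times[-1]:
        return dict(keys[-1][1])
    hi = bisect_right(times, t)            # index of the first key after t
    (t0, w0), (t1, w1) = keys[hi - 1], keys[hi]
    u = ease((t - t0) / (t1 - t0))         # 0 at the earlier key, 1 at the later
    names = set(w0) | set(w1)
    # A viseme missing from a key is treated as weight 0 at that key.
    return {n: (1.0 - u) * w0.get(n, 0.0) + u * w1.get(n, 0.0) for n in names}


track = [
    (0.0, {"closed_lips": 1.0}),
    (0.1, {"wide_ee": 1.0}),
]
print(sample_track(track, 0.05))
```

Halfway between the two keys, both visemes end up at half weight. That is exactly the kind of in-between shape that may or may not look right on a real face, which is why the blend alone is not a substitute for an artist's eye.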
Emphasis and exaggeration are also very important in animation. You may wish
to punch up a sound by using a viseme to punctuate the animation.
This emphasis, along with the addition of secondary animation to express
emotion, is key to a believable sequence.
In addition to these viseme frames, you will want to have a neutral frame
that you can use for pauses. In fast speech, you may not want to add
the neutral frame between all words, but in general it gives good visual
cues to sentence boundaries.
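As a rough sketch of that idea, the function below walks a list of timed viseme segments and drops a neutral key into any gap that is long enough to read as a pause. The 0.15 second threshold, the `"neutral"` name, and the segment layout are all assumptions I made for illustration; raising the threshold is one way to keep fast speech from snapping back to neutral between every word.

```python
NEUTRAL = "neutral"  # hypothetical name for the rest pose


def add_neutral_keys(segments, min_pause=0.15):
    """Build (time, viseme) keys, adding a neutral key inside long gaps.

    segments is a list of (start, end, viseme) tuples in seconds, sorted by
    start time. min_pause is an assumed threshold, not a recommended value.
    """
    keys = []
    for i, (start, end, viseme) in enumerate(segments):
        keys.append((start, viseme))
        if i + 1 < len(segments):
            gap = segments[i + 1][0] - end
            if gap >= min_pause:
                # Place the neutral pose in the middle of the pause.
                keys.append((end + gap / 2.0, NEUTRAL))
    if segments:
        # Return to neutral once the last sound has finished.
        keys.append((segments[-1][1], NEUTRAL))
    return keys


segments = [
    (0.0, 0.1, "closed_lips"),
    (0.1, 0.3, "wide_ee"),
    (0.8, 0.9, "closed_lips"),  # follows a half-second pause
]
for time, viseme in add_neutral_keys(segments):
    print(f"{time:.2f}s  {viseme}")
```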
Side view of the sound [m], as in “my.”
Side view of the sound [b], as in “buy.”
Side view of the sound [p], as in “pie.”
So What Do I Do with This Stuff?
So far, I have been discussing issues that only seem important to the artists
working on the facial animation. If the only use of facial animation
in your project is for pre-rendered cut scenes, this may be true. However,
I believe facial animation will become an important aspect in real-time
3D rendering as we take character simulation to the
next level. This requires a close relationship between the art assets
and engine features.
As a technical lead on a cutting-edge 3D project, you will be required
to create the production pathway that the artists will use to create
assets. You will be responsible for deciding how many visemes the engine
can support and the manner in which the meshes must be created. Having
a clear understanding of what goes into the creation of the assets will
allow you to interface more effectively with those creating the assets.
However, even with the viseme count I am still not ready to set the artists loose
creating my viseme frames. There are several basic engine decisions
that I must make before modeling begins. Unfortunately, I will have
to wait until the next column to dig into that. Until then, think back
on my 3D morphing column (“Mighty Morphing Mesh Machine,” December 1998)
as well as last year’s skeletal deformation column (“Skin Them Bones,”
Graphic Content, May 1998) and see if you can get a jump on the rest
of the class.
Acknowledgements
Special thanks go to my partner in crime, Margaret Pomeroy. She was able to
explain to me what was really going on when I made all those funny faces
in the mirror. When she was studying ancient languages in school I am
sure she never imagined working on lip-synching character dialog.
For Further Info
• Culhane, Shamus. Animation from Script to Screen. New York: St. Martin’s Press, 1988.
• Ladefoged, Peter. A Course in Phonetics. San Diego: Harcourt Brace Jovanovich, 1982.
• Maestri, George. [digital] Character Animation. Indianapolis: New Riders Publishing, 1996.
• Parke, Frederic I. and Keith Waters. Computer Facial Animation. Wellesley: A. K. Peters, 1996.
Jeff Lander often sounds like he knows what he’s talking about. Actually,
he’s just lip-synched to someone who really knows what’s going on. Let
him know you are on to the scam at [email protected].