Read My Lips: Facial Animation Techniques
 
 
by Jeff Lander [Programming]
April 6, 2000
 

Anyone who has ever been in a professional production situation realizes that real-world coding these days requires a broad range of expertise. When this expertise is lacking, developers need to be humble enough to look things up and turn to people around them who are more experienced in that particular area.

As I continue to explore areas of graphics technology, I have attempted to document the research and resources I have used in creating projects for my company. My research demands change from month to month depending on what is needed at the time. This month, I need to develop some facial animation techniques, particularly lip sync. This means I need to shelve my physics research for a bit and get some other work done. I hope to get back to moments of inertia and such real soon.



And Now for Something Completely Different

My problem right now is facial animation. In particular, I need to know enough to create a production pathway and technology to display real-time lip sync. My first step when trying to develop new technology is to take a historic look at the problem and examine previous solutions. The first people I could think of who had explored facial animation in depth were the animators who created cartoons and feature animation in the early days of Disney and Max Fleischer.

Facial animation in games has built on this tradition. Chiefly, this has been achieved through cut-scene movies animated using many of the same methods. Games like Full Throttle and The Curse of Monkey Island used facial animation for their 2D cartoon characters in the same way that the Disney animators would have. More recently, games have begun to include some facial animation in real-time 3D projects. Tomb Raider has had scenes in which the 3D characters pantomime the dialog, but the face is not actually animated. Grim Fandango uses texture animation and mesh animation for a basic level of facial animation. Even console titles like Banjo-Kazooie are experimenting with real-time “lip-flap” without even having a dialog track. How do I leverage this tradition into my own project?

Phonemes and Visemes

No discussion of facial animation is possible without discussing phonemes. Jake Rodgers’s article “Animating Facial Expressions” (Game Developer, November 1998) defined a phoneme as an abstract unit of the phonetic system of a language that corresponds to a set of similar speech sounds. More simply, phonemes are the individual sounds that make up speech. A naive facial animation system may attempt to create a separate facial position for each phoneme. However, in English (at least where I speak it) there are about 35 phonemes. Other regional dialects may add more.

Now, that’s a lot of facial positions to create and keep organized. Luckily, the Disney animators realized a long time ago that using all phonemes was overkill. When creating animation, an artist is not concerned with individual sounds, just how the mouth looks while making them. Fewer facial positions are necessary to visually represent speech since several sounds can be made with the same mouth position. These visual references to groups of phonemes are called visemes. How do I know which phonemes to combine into one viseme? Disney animators relied on a chart of 12 archetypal mouth positions to represent speech as you can see in Figure 1.

Figure 1. The 12 classic Disney mouth positions.
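The many-to-one relationship between phonemes and visemes can be sketched as a simple lookup table. The following is a minimal illustration, not the Disney chart itself: the viseme names, the single-letter phoneme labels, the particular groupings, and the `visemes_for` helper are all assumptions made up for this example.

```python
# Illustrative many-to-one phoneme-to-viseme table.
# The viseme names and groupings below are assumptions for this sketch,
# not the Disney chart from Figure 1.
VISEME_GROUPS = {
    "closed_lips": ["p", "b", "m"],   # lips pressed together
    "lip_to_teeth": ["f", "v"],       # lower lip against upper teeth
    "rounded": ["o", "u", "w"],       # lips pushed forward
    "open": ["a"],                    # jaw dropped
}

# Invert the groups so each phoneme looks up its viseme directly.
PHONEME_TO_VISEME = {
    phoneme: viseme
    for viseme, phonemes in VISEME_GROUPS.items()
    for phoneme in phonemes
}


def visemes_for(phonemes, default="rest"):
    """Map a phoneme sequence to visemes, merging consecutive repeats."""
    result = []
    for phoneme in phonemes:
        # Phonemes missing from the table fall back to a neutral pose.
        viseme = PHONEME_TO_VISEME.get(phoneme, default)
        if not result or result[-1] != viseme:
            result.append(viseme)
    return result
```

Calling `visemes_for(["b", "m", "a"])` yields two mouth positions rather than three, because "b" and "m" share the same closed-lip shape. Whether consecutive repeats should really be merged depends on timing, which this sketch ignores.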

Each mouth position or viseme represented one or more phonemes. This reference chart became a standard method of creating animation. As a game developer, however, I am concerned with the number of positions I need to support. What if my game only has room for eight visemes? What if I could support 15 visemes? Would it look better?
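One way to frame the question of a viseme budget is to rank the visemes by visual importance and give each one a coarser fallback shape, so a smaller set can stand in for a larger one. The sketch below is only a thought experiment under those assumptions; the viseme names, their ordering, the fallbacks, and the `reduce_visemes` function are hypothetical and are not a method described in this article.

```python
# Hypothetical viseme set, listed from most to least visually important.
# Each entry names the coarser viseme it collapses into when dropped.
# Both the ordering and the fallbacks are assumptions for this sketch.
VISEME_FALLBACKS = [
    ("rest", None),
    ("open", "rest"),
    ("closed_lips", "rest"),
    ("rounded", "open"),
    ("lip_to_teeth", "closed_lips"),
    ("wide", "open"),
]


def reduce_visemes(budget):
    """Return a remap table that uses at most `budget` distinct visemes."""
    if budget < 1:
        raise ValueError("budget must be at least 1")
    # Keep the most important visemes that fit within the budget.
    kept = {name for name, _ in VISEME_FALLBACKS[:budget]}
    fallback = dict(VISEME_FALLBACKS)
    remap = {}
    for name, _ in VISEME_FALLBACKS:
        target = name
        # Follow fallbacks until we reach a viseme that survived the cut.
        # Every chain ends at "rest", which is always kept.
        while target not in kept:
            target = fallback[target]
        remap[name] = target
    return remap
```

With a budget of three, `reduce_visemes(3)` keeps the rest, open, and closed-lip shapes and folds the others into them. Raising the budget restores the finer shapes one at a time, which makes it easy to compare how the same dialog looks at different viseme counts.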

Throughout my career, I have seen many facial animation guidelines with different numbers of visemes and different organizations of phonemes. They all seem to be similar to the Disney 12, but also seem like they involved animators talking to a mirror and doing some guessing.

I wanted to establish a method that would be optimal for whatever number of visemes I wanted to support. Along with the animator’s eye for mouth positions, there are more scientific models that reduce sounds into visual components. For the deaf community, which does not hear phonemes, spoken language recognition relies entirely on lip reading. Lip-reading samples base speech recognition on 18 speech postures. Some of these mouth postures show very subtle differences that a hearing individual may not see.

So, the Disney 12 and the lip-reading 18 are a good place to start. However, making sense of the organization of these lists requires a look at what is physically going on when we speak. I am fortunate to have a linguist right in the office. It’s times like this when it helps to know people in all sorts of fields, no matter how obscure.

 