Science Break

The field of linguistics, specifically phonetics, compares phonemes according
to their actual physical attributes. The grouping does not really concentrate
on the visual aspects, since sounds depend on what is going on in the throat
and in the mouth, as well as on the lips. But perhaps it can help
me organize the phonemes a bit.

Sounds can be categorized according to voicing, manner of articulation
(airflow), and place of articulation. There are more categories, but these
will get the job done. As speakers of English, we automatically create sounds
correctly without thinking about what is going on inside the mouth. Yet when
we see a bad animation, we know it doesn’t look quite right, although
we may not know why. With the information below, you will be equipped
to know why things look wrong. Now for some group participation. This
is an interactive article. Go on, no one is looking. The categories
we want to examine are:

Voiced vs. Voiceless. Put your hand on your throat and say something. You
can feel an intermittent vibration. Now say, “p-at, b-at, p-at, b-at”
(emphasizing the initial consonant). In some sounds the vocal cords are
vibrating together ([b], voiced) and in some the vocal cords are apart
([p], voiceless). Looking at the face, there is no visual difference between
voiced and voiceless sounds. This makes reducing sounds into one viseme an
automatic no-brainer: any pair of sounds that differs only in voicing can be
reduced to the same viseme. In English, that eliminates eight phonemes.

Nasal vs. Oral. Put your fingers on your nose. Slowly say “momentary.”
You can feel your nose vibrating when you are saying the “m.” Some sounds
are said through the nasal cavity, but most are said through the oral
cavity. These are also not visibly different. So again, we have an automatic
reduction in phonemes. All three nasal sounds in English can be folded
into the viseme of their oral counterparts.
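
The two reductions so far can be tallied in a few lines of Python. This is only a sketch: the sounds are labeled with example words where a single letter won't do (`thy` and `thigh` for the two "th" sounds, `hang` for the "ng" sound), and the names `VOICED_TO_VOICELESS`, `NASAL_TO_ORAL`, and `collapse` are made up for illustration. Picking the voiceless sound as the representative of each pair is an arbitrary choice.

```python
# Voiced phoneme -> voiceless partner that looks the same on the face.
# Which member of the pair is kept is arbitrary; voiceless is used here.
VOICED_TO_VOICELESS = {
    "b": "p", "d": "t", "g": "k", "v": "f",
    "thy": "thigh", "z": "s", "vision": "shy", "jive": "chime",
}

# Nasal phoneme -> oral sound made at the same place in the mouth.
NASAL_TO_ORAL = {"m": "b", "n": "d", "hang": "g"}


def collapse(phoneme):
    """Return the phoneme that stands in for this one visually."""
    phoneme = NASAL_TO_ORAL.get(phoneme, phoneme)
    return VOICED_TO_VOICELESS.get(phoneme, phoneme)


eliminated = len(VOICED_TO_VOICELESS) + len(NASAL_TO_ORAL)
print(eliminated)                              # 11
print([collapse(p) for p in ("b", "p", "m")])  # ['p', 'p', 'p']
```

Eight voiced sounds plus three nasals gives 11 phonemes that never need their own mouth shape.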

Manners of Speech. Sounds can also be differentiated by the amount of opening
through the oral tract. These also do not offer a visible clue, but
are very important for categorizing phonemes. Sounds that have complete
closure of the airstream are called stops. Sounds that have a partial
obstruction and turbulent airflow are called fricatives. A sound
that combines a stop and a fricative is called an affricate. Sounds that have
a narrowing of the vocal tract, but no turbulent airflow, are called
approximants. And then there are sounds that have relatively little obstruction
of the airflow; these are the vowels.

Figure 2. Side cut-out view of places of articulation.

Places of Articulation. This refers to where in the mouth the sound is
being made, and it is where the visible differences occur. For the consonants,
there are several places of articulation (see Figure 2) involving the lips,
teeth, tongue, and stuff in the back of the mouth (the palate, velum, and
glottis). Vowel placement is based on the relative height
of the tongue and whether the tongue is more front or back in the mouth.
A differentiating factor not listed in Chart 1 is lip rounding. This
is not associated with any particular place of articulation and will
be addressed below. Whew.

As I said, there are 35 phonemes in my dialect of American English. You
may have more. Chart 1 is a summary of these phonemes. Read the chart
from the front of the mouth to the back of the mouth. Try saying each
of the words; the phoneme it illustrates is in bold. Have a look
in the mirror to see what is going on, and feel what is going
on inside your head. By using the distinctions of voicing and oral/nasal,
we have already eliminated 11 phonemes. Let’s continue reducing
the phonemes into usable visemes.

Take It to the Limit

According to the chart, there are three bilabials, which are sounds made with
both lips. They are [b], [p], and [m]. As Figures 3a, 3b, and 3c show, they
have different attributes inside the mouth. [b] and [p] differ only in that
[b] makes use of the vocal cords and [p] does not. [m] is voiced like [b],
but it is a nasal sound. The cool thing about these sounds is that while
there are differences inside the mouth, visually there is no difference.
If you look in a mirror and say “buy,” “pie,” and “my,” they all look
identical. We have reduced three phonemes into one viseme.

Chart 1. American English phoneme summary chart.

While you’re working, remember that you are thinking with respect to sounds
(phonemes), not letters. In many cases a phoneme is spelled with multiple
letters. So, if we go through Chart 1, we can continue to reduce the
35 phonemes into 13 visemes. For the most part, the visemes are categorized
along the lines of the Places of Articulation (with the exception of
[r]).

Take a look at the following listing of visemes. It describes the look of
each phoneme in American English. The only phoneme not listed is [h].
“In English, ‘h’ acts like a consonant, but from an articulatory point
of view it is simply the voiceless counterpart of the following vowel”
(Ladefoged, 1982:33-4). In other words, treat [h] like the vowel that
comes after it.

Visemes
1. [p, b, m] - Closed lips.
2. [w] & [boot] - Pursed lips.
3. [r*] & [book] - Rounded open lips with corners of lips slightly
   puckered. If you look at Chart 1, [r] is made in the same place
   in the mouth as the sounds of #7 below. One of the attributes
   not denoted in the chart is lip rounding. If [r] is at the beginning
   of a word, then it fits here. Try saying “right” vs. “car.”
4. [v] & [f] - Lower lip drawn up to upper teeth.
5. [thy] & [thigh] - Tongue between teeth, no gaps on sides.
6. [l] - Tip of tongue behind open teeth, gaps on sides.
7. [d, t, z, s, r*, n] - Relaxed mouth with mostly closed teeth with pinkness
   of tongue behind teeth (tip of tongue on ridge behind upper teeth).
8. [vision, shy, jive, chime] - Slightly open mouth with mostly closed
   teeth and corners of lips slightly tightened.
9. [y, g, k, hang, uh-oh] - Slightly open mouth with mostly closed
   teeth.
10. [beat, bit] - Wide, slightly open mouth.
11. [bait, bet, but] - Neutral mouth with slightly parted teeth and
    slightly dropped jaw.
12. [boat] - Very round lips, slightly dropped jaw.
13. [bat, bought] - Open mouth with very dropped jaw.

To see how helpful this information can be when animating a face, take a
word like “hack.” It has four letters, three phonemes, and only two
visemes (13 and 9 in the listing).
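
Here is a rough sketch of that lookup in Python, with the table copied from the listing. The names `VISEMES`, `PHONEME_TO_VISEME`, and `to_visemes` are invented for this example, phonemes are written with the same example-word labels as the listing, and the input is assumed to be one word already broken into phonemes. Two choices are mine rather than anything the listing dictates: adjacent phonemes that share a viseme are merged into one entry, and an [h] with nothing after it is skipped.

```python
# Viseme number -> phonemes, copied from the listing.
# [r] and [h] are handled by rule in to_visemes, so they are not in the table.
VISEMES = {
    1: ["p", "b", "m"],
    2: ["w", "boot"],
    3: ["book"],                      # word-initial [r] also lands here
    4: ["v", "f"],
    5: ["thy", "thigh"],
    6: ["l"],
    7: ["d", "t", "z", "s", "n"],     # [r] anywhere else lands here
    8: ["vision", "shy", "jive", "chime"],
    9: ["y", "g", "k", "hang", "uh-oh"],
    10: ["beat", "bit"],
    11: ["bait", "bet", "but"],
    12: ["boat"],
    13: ["bat", "bought"],
}

PHONEME_TO_VISEME = {p: v for v, ps in VISEMES.items() for p in ps}


def to_visemes(phonemes):
    """Map one word's phonemes to viseme numbers, merging adjacent repeats."""
    result = []
    for i, p in enumerate(phonemes):
        if p == "h":
            # [h] takes the shape of the vowel that follows it.
            if i + 1 >= len(phonemes):
                continue  # assumption: a trailing [h] adds no shape
            p = phonemes[i + 1]
        if p == "r":
            viseme = 3 if i == 0 else 7
        else:
            viseme = PHONEME_TO_VISEME[p]
        if not result or result[-1] != viseme:
            result.append(viseme)
    return result


print(to_visemes(["h", "bat", "k"]))  # [13, 9]
```

The table holds 33 phonemes; adding [r] and [h], which are handled by rule, brings the total to the 35 in Chart 1.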

Say that you don’t have enough space to include 13 visemes and whatever
emotions you want expressed. Well, by using Chart 1 and the listing of
visemes, you can make logical decisions about where to cut.
For example, if you only have room for 12 visemes, you can combine visemes
5 and 6, or 6 and 7. For 11 visemes, continue by combining visemes 7 and 9.
For 10, combine visemes 2 and 3. For 9, combine 8 with the new viseme 7/9.
For 8, combine 11 and 13.
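
If you want to see what each budget leaves you with, the merge order above can be written down as a small Python sketch. `MERGE_STEPS` and `reduce_visemes` are hypothetical names, and for the first step I have assumed the 5-and-6 option rather than 6-and-7.

```python
TOTAL = 13

# Merge steps in the order suggested above. Each step joins the groups that
# currently contain the two listed viseme numbers.
# Assumption: the first step uses the 5 + 6 option rather than 6 + 7.
MERGE_STEPS = [(5, 6), (7, 9), (2, 3), (8, 7), (11, 13)]


def reduce_visemes(budget):
    """Return viseme groups (sorted lists of original numbers) for a budget."""
    if not TOTAL - len(MERGE_STEPS) <= budget <= TOTAL:
        raise ValueError("this schedule only covers 8 to 13 visemes")
    groups = [{n} for n in range(1, TOTAL + 1)]
    for a, b in MERGE_STEPS[: TOTAL - budget]:
        group_a = next(g for g in groups if a in g)
        group_b = next(g for g in groups if b in g)
        if group_a is not group_b:
            groups.remove(group_b)
            group_a |= group_b  # updates the set in place inside the list
    return [sorted(g) for g in groups]


for budget in (12, 9, 8):
    print(budget, reduce_visemes(budget))
```

With a budget of 8, the groups come out as `[[1], [2, 3], [4], [5, 6], [7, 8, 9], [10], [11, 13], [12]]`.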

If I were really pressed for space, I could keep combining and drop this
list down further. Most drastic would be three frames (Open, Closed,
and Pursed as in “boot”) or even a simple two-frame lip flap, open
and closed. In this case you would just alternate between open and
closed once in a while. But that isn’t very fun or realistic, is it?
_______________________________________________________________
Art Issues