Modeling for computer games addresses the challenge of automating a variety of difficult development tasks. An early milestone was the combination of geometric models and inverse kinematics to simplify keyframing. Physical models for animating particles, rigid bodies, deformable solids, fluids, and gases have offered the means to generate copious quantities of realistic motion through dynamic simulation. Biomechanical modeling employs simulated physics to automate the lifelike animation of animals with internal muscle actuators. Research in behavioral modeling is making progress towards self-animating characters that react appropriately to perceived environmental stimuli. It has remained difficult, however, to instruct these autonomous characters so that they satisfy the programmer's goals. Hitherto absent in this context has been a substantive apex to the computer graphics modeling pyramid (Figure 1), which we identify as cognitive modeling.
Figure 1. Cognitive modeling is the new apex of the CG modeling hierarchy.
Cognitive models go beyond behavioral models, in that they govern what a character knows, how that knowledge is acquired, and how it can be used to plan actions. Cognitive models are applicable to instructing the new breed of highly autonomous, quasi-intelligent characters that are beginning to find use in interactive computer games. Moreover, cognitive models can play subsidiary roles in controlling cinematography and lighting. See the color plates at the end of this article for some screenshots from two cognitive modeling applications.
We decompose cognitive modeling into two related sub-tasks: domain knowledge specification and character instruction. This is reminiscent of the classic dictum from the field of artificial intelligence (AI) that tries to promote modularity of design by separating out knowledge from control.
knowledge + instruction = intelligent behavior
Domain (knowledge) specification involves giving the character knowledge about its world and how that world can change. Character instruction involves telling the character to try to behave in a certain way within its world in order to achieve specific goals. Like other advanced modeling tasks, both of these steps can be fraught with difficulty unless developers are given the right tools for the job.
The mathematical logic notation we use is the situation calculus. It has many advantages in terms of clarity and implementation independence, but it is something of a departure from the repertoire of mathematical tools commonly used in computer graphics. In this section we therefore give an overview of the salient points of the situation calculus, whose details are well documented in the book [Funge99] and elsewhere [LRLLS97, LLR99]. It is also worth mentioning that, from a user's point of view, the underlying theory can be hidden. In particular, a user is not required to type in axioms written in first-order mathematical logic. To this end, we have developed an intuitive high-level interaction language, CML (Cognitive Modeling Language), whose syntax employs descriptive keywords but which has a clear and precise mapping to the underlying formalism (see the book [Funge99], or the website www.cs.toronto.edu/~funge, for more details).
The situation calculus is an AI formalism for describing changing worlds using sorted first-order logic. A situation is a "snapshot" of the state of the world. A domain-independent constant s0 denotes the initial situation. Any property of the world that can change over time is known as a fluent. A fluent is a function, or relation, with a situation term (by convention) as its last argument. For example, Broken(x, s) is a fluent that keeps track of whether an object x is broken in a situation s.
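To make these definitions concrete, here is a minimal Python sketch of the vocabulary just introduced. All names below (Situation, S0, Broken, the vase) are illustrative inventions for this article, not CML syntax:

```python
# A situation is a "snapshot" of the state of the world, modeled here as
# an immutable record of the world's changeable properties.
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Situation:
    broken_objects: frozenset = frozenset()

S0 = Situation()  # the initial situation s0: nothing is broken yet

def Broken(x, s):
    # A fluent is a function (or relation) whose last argument is a
    # situation; Broken(x, s) tracks whether x is broken in situation s.
    return x in s.broken_objects

print(Broken("vase", S0))  # False
s1 = replace(S0, broken_objects=frozenset({"vase"}))
print(Broken("vase", s1))  # True
```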
Primitive actions are the fundamental instrument of change in our ontology. The sometimes counterintuitive term "primitive" serves only to distinguish certain atomic actions from the "complex", compound actions that we will define later. The situation s' resulting from doing action a in situation s is given by the distinguished function do, so that s' = do(a, s). The possibility of performing action a in situation s is denoted by a distinguished predicate Poss(a, s). Sentences that specify what the state of the world must be before performing some action are known as precondition axioms. For example, it is possible to drop an object x in a situation s if and only if a character is holding it:
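In standard situation calculus notation (free variables are implicitly universally quantified), this precondition axiom can be written as:

Poss(drop(x), s) ≡ Holding(x, s)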
The effects of an action are given by effect axioms. These give sufficient conditions for a fluent to take on a given value after performing an action. For example, the effect of dropping a fragile object x is that the object ends up being broken:
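Assuming a fluent Fragile(x, s) that tracks whether x is fragile, this effect axiom can be written as:

Fragile(x, s) ⊃ Broken(x, do(drop(x), s))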
Surprisingly, a naive translation of effect axioms into the situation calculus does not give the expected results. In particular, stating what does not change when an action is performed turns out to be problematic. This is called the "frame problem" in AI. Without a solution to it, a character must consider whether dropping a cup results in, say, a vase turning into a bird and flying about the room. For mindless animated characters, such absurdities are ruled out implicitly by the programmer's common sense. Once characters in virtual worlds start thinking for themselves, however, they must tackle the frame problem on their own: they need to be told to assume that things stay the same unless they know otherwise. The frame problem has been a major reason why approaches like ours have not previously been used in computer animation, or until recently in robotics. Fortunately, the frame problem can be solved provided characters represent their knowledge with the assumption that the effect axioms enumerate all the possible ways that the world can change. This so-called closed world assumption provides the justification for replacing the effect axioms with successor state axioms. For example, the following successor state axiom says that, provided the action is possible, a character is holding an object if and only if it just picked up the object, or it was already holding the object and did not just drop it:
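In standard notation, this successor state axiom can be written as:

Poss(a, s) ⊃ [Holding(x, do(a, s)) ≡ a = pickup(x) ∨ (Holding(x, s) ∧ a ≠ drop(x))]

To make the recursion over situations concrete, here is a minimal Python sketch of this axiom, with a situation represented as the history of actions performed since the initial situation. All names are illustrative inventions, not CML syntax:

```python
S0 = ()  # the initial situation s0: no actions performed yet

def do(action, s):
    # the distinguished function do: the situation resulting from
    # performing `action` in situation `s`
    return s + (action,)

def holding(x, s):
    # Successor state axiom for Holding, unrolled over the action
    # history: holding x iff the last action was pickup(x), or we were
    # already holding x and the last action was not drop(x).
    if not s:
        return False  # holding nothing in the initial situation
    a, prev = s[-1], s[:-1]
    if a == ("pickup", x):
        return True
    if a == ("drop", x):
        return False
    return holding(x, prev)

s = do(("pickup", "cup"), S0)
print(holding("cup", s))  # True
s = do(("drop", "cup"), s)
print(holding("cup", s))  # False
```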
We distinguish two broad possibilities for instructing a character on how to behave: predefined behavior and goal-directed behavior. Of course, in some sense, all of a character's behavior is defined in advance by the animator/programmer. Therefore, to be more precise, the distinction between predefined and goal-directed behavior is based on whether the character can nondeterministically select actions or not.
By nondeterministic action selection we mean that whenever a character chooses an action, it also remembers the other choices it could have made. If, after thinking through the choices it did make, the character realizes that the resulting sequence of actions will not lead to a desirable outcome, it can go back and consider any of the alternative sequences of actions that would have resulted from different choices. It is free to do this until it either finds a suitable action sequence or exhausts all the (possibly exponentially many) possibilities.
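The search just described can be sketched as a simple depth-first search with backtracking over action sequences. The toy domain and all names below are illustrative assumptions, not part of CML:

```python
def plan(state, goal, actions, depth):
    """Return a sequence of action names achieving `goal`, or None."""
    if goal(state):
        return []  # desirable outcome reached
    if depth == 0:
        return None  # this branch is exhausted; backtrack
    for name, apply_action in actions:   # the remembered choice points
        rest = plan(apply_action(state), goal, actions, depth - 1)
        if rest is not None:             # this choice worked out
            return [name] + rest
    return None                          # every choice failed; backtrack

# Toy domain: reach 5 from 0 using the actions +3 and -1.
actions = [("add3", lambda n: n + 3), ("sub1", lambda n: n - 1)]
print(plan(0, lambda n: n == 5, actions, depth=4))  # ['add3', 'add3', 'sub1']
```

Note that the search is exhaustive within the depth bound, which is exactly the exponential worst case mentioned above; this is the price paid for the ease of instruction.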
A character that can nondeterministically select actions is usually a lot easier to instruct, but has a slower response time. In particular, we can tell a cognitive character what constitutes a "desirable outcome" by giving it goals, and it can then use its background domain knowledge to figure out whether it believes a given action sequence will achieve those goals or not. Although we are using the word "nondeterministic" in a precise technical sense, the trade-off between execution speed and programming effort should already be a familiar and intuitive concept for many readers.
A third possibility we will consider is something of a compromise between the two extremes of predefined and goal-directed behavior. In particular, we introduce the notion of complex actions and explain how they can be used to provide goals, and a "sketch plan" for how to achieve those goals.
Before we continue, it is worth pointing out that people sometimes identify a particular class of programming languages with a particular kind of behavior. For example, logic programming languages are often associated with nondeterministic goal-directed behavior, and regular imperative languages with deterministic predefined behavior. While it is true that logic programming languages have built-in support for nondeterministic programming, there is nothing to stop us from implementing either kind of behavior in any programming language we choose (assuming it is Turing complete). To avoid unnecessary confusion, we shall not tie the following discussion to any particular programming language.