Building an AI Sensory System: Examining The Design of Thief: The Dark Project
March 7, 2003 Page 2 of 2
Rather than having a single two-dimensional field of view, the Thief senses implement a set of ordered three-dimensional viewcones described as an XY angle, a Z angle, a length, a set of parameters describing both general acuity and sensitivity to types of stimuli (e.g., motion versus light), and relevance given the alertness of the AI. The viewcones are oriented according to the direction an AI's head is facing.
At any time, for a given object being sensed, only the first viewcone that contains the object is considered in sense calculations. For simplicity and gameplay tunability, each viewcone is presumed to produce a constant output regardless of where in the viewcone the subject is positioned.
For example, the AI represented in Figure 4 has five viewcones. An object at point A will be evaluated using viewcone number 3. An entity at either point B or point C is evaluated using viewcone number 1, so identical visibility values at those two points will yield the same vision awareness result.
Figure 4, Viewcones, Top-view
When probing interesting objects in the world, the senses first determine which viewcone, if any, applies to the subject. The intrinsic visibility is then passed through a "look" heuristic along with the viewcone to output a discrete awareness value.
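The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the Dark Engine's code: the `ViewCone` fields, the `yaw_offset_deg` parameter, the clamp-and-bucket `look` function, and the choice of four awareness levels are all assumptions made for the example. The only behaviors taken from the text are that the cones are ordered, that the first cone containing the subject wins, and that a cone's output does not depend on where inside it the subject stands.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class ViewCone:
    """Hypothetical viewcone record; field names are illustrative only."""
    xy_angle_deg: float          # full horizontal width of the cone
    z_angle_deg: float           # full vertical height of the cone
    length: float                # maximum sensing distance
    acuity: float                # constant output scale for this cone
    yaw_offset_deg: float = 0.0  # rotation from head facing (180 = rear cone)

    def contains(self, head_pos: Vec3, head_yaw_deg: float, target: Vec3) -> bool:
        dx, dy, dz = (t - h for t, h in zip(target, head_pos))
        dist = math.sqrt(dx * dx + dy * dy + dz * dz)
        if dist == 0.0 or dist > self.length:
            return False
        # Horizontal bearing relative to the cone axis, wrapped to [-180, 180).
        bearing = math.degrees(math.atan2(dy, dx)) - head_yaw_deg - self.yaw_offset_deg
        bearing = (bearing + 180.0) % 360.0 - 180.0
        # Elevation above or below the head's horizontal plane.
        elevation = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
        return (abs(bearing) <= self.xy_angle_deg / 2.0
                and abs(elevation) <= self.z_angle_deg / 2.0)


def first_viewcone(cones: Sequence[ViewCone], head_pos: Vec3,
                   head_yaw_deg: float, target: Vec3) -> Optional[int]:
    """Return the index of the first cone, in order, that contains the target."""
    for index, cone in enumerate(cones):
        if cone.contains(head_pos, head_yaw_deg, target):
            return index
    return None


def look(cones: Sequence[ViewCone], head_pos: Vec3, head_yaw_deg: float,
         target: Vec3, visibility: float, levels: int = 4) -> int:
    """Toy 'look' heuristic: scale visibility by the cone's constant acuity,
    then bucket the result into a discrete awareness level 0..levels-1."""
    index = first_viewcone(cones, head_pos, head_yaw_deg, target)
    if index is None:
        return 0
    strength = min(1.0, max(0.0, visibility * cones[index].acuity))
    return min(levels - 1, int(strength * levels))


# Example setup: a narrow forward cone, a wide peripheral cone, a short rear cone.
cones = [
    ViewCone(xy_angle_deg=60, z_angle_deg=60, length=20, acuity=1.0),
    ViewCone(xy_angle_deg=160, z_angle_deg=90, length=10, acuity=0.4),
    ViewCone(xy_angle_deg=60, z_angle_deg=60, length=3, acuity=0.3, yaw_offset_deg=180),
]
head = (0.0, 0.0, 0.0)
```

Because the list is ordered, placing the narrowest and most sensitive cone first lets it take priority over a wider peripheral cone that overlaps it, and the rear cone only matters for subjects that no earlier cone contains.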
The motivation for multiple viewcones is to enable the expression of such things as direct vision, peripheral vision, or a distinction between objects directly forward and on the same Z plane as opposed to forward but above or below. Cone number 5 in the diagram above is a good example of using a low-level mechanism to express a high-level concept. This "false vision" cone is configured to look backwards and to be sensitive to motion, giving the AI a "spidey-sense" of being followed too closely even if the player is silent.
The sense management system is designed as a series of components, each taking a limited and well-defined set of data and outputting an even more limited value. Each stage is intended to be independently scalable in terms of its processing demands based on relevance to gameplay. In terms of performance, these multiple scalable layers can be made extremely efficient.
Figure 5, Information Pipeline
The core sensory system implements heuristics for accepting visibility, sound events, current awareness links, designer and programmer configuration data, and current AI state, and outputting a single awareness value for each object of interest. These heuristics are considered a black box tuned by the AI programmer continually as the game develops.
Vision is implemented by filtering the visibility value of an object through the appropriate viewcone, modifying the result based on the properties of the individual AI. In mundane cases a simple raycast for line-of-sight is used. In more interesting cases, like the player, multiple raycasts occur to include the spatial relation of the AI to the subject in the weighing of the subject's exposure.
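One plausible way to fold multiple raycasts into an exposure weight is to cast a ray at several sample points on the subject and scale the intrinsic visibility by the fraction that are unobstructed. The sketch below assumes exactly that; the article does not specify the actual weighting Thief uses, and `line_clear`, `exposure`, and `seen_visibility` are hypothetical names standing in for whatever raycast query the engine provides.

```python
from typing import Callable, Sequence, Tuple

Vec3 = Tuple[float, float, float]
# Hypothetical engine query: True when the segment from a to b is unobstructed.
RayTest = Callable[[Vec3, Vec3], bool]


def exposure(eye: Vec3, sample_points: Sequence[Vec3], line_clear: RayTest) -> float:
    """Fraction of the subject's sample points with a clear line from the eye."""
    if not sample_points:
        return 0.0
    clear = sum(1 for point in sample_points if line_clear(eye, point))
    return clear / len(sample_points)


def seen_visibility(intrinsic_visibility: float, eye: Vec3, center: Vec3,
                    sample_points: Sequence[Vec3], line_clear: RayTest,
                    important: bool) -> float:
    """Mundane subjects get one all-or-nothing raycast to their center;
    important subjects (such as the player) get an exposure-weighted value."""
    if not important:
        return intrinsic_visibility if line_clear(eye, center) else 0.0
    return intrinsic_visibility * exposure(eye, sample_points, line_clear)


# Stub raycast for illustration: a low wall hides anything below z = 1.0.
def behind_low_wall(a: Vec3, b: Vec3) -> bool:
    return b[2] >= 1.0


eye = (5.0, 0.0, 1.7)
player_samples = [(0.0, 0.0, 1.8), (0.0, 0.0, 1.2), (0.0, 0.0, 0.1)]  # head, torso, feet
```

With the stub above, a player crouched behind the wall with only head and torso showing would contribute two thirds of their intrinsic visibility, while a mundane object whose center sits below the wall would contribute nothing.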
Thief has a sophisticated sound system in which sounds, both rendered and not rendered, are tagged with semantic data and propagated through the 3D geometry of the world. When a sound "arrives" at an AI, it arrives from the direction it would in the real world, tagged with attenuated awareness values, and possibly carrying information from other AIs if it is a spoken concept. These sounds join other awareness-inducing things (like the Half-Life smell example) as awareness relations to positions in space.
Once the look and listen operations are complete, their awareness results are passed to a method responsible for receiving periodic pulses from the raw senses, and resolving them into a single awareness relationship, storing all the details in the associated sense link. Unlike the analog data used in the pipeline to this point, the data in this process is entirely discrete. The result of this process is to create, update, or expire sense links with the correct awareness value.
This is a three-step process. First, the sound and vision input values are compared, one is declared dominant, and its value becomes the awareness value for the pulse. The accessory data each sense produces is then distilled together into a summary of the sense event.
Second, if the awareness pulse is an increase from previous readings, it is passed through a time-based filter that controls whether the actual awareness will increase. The time delay is a property only of the current state, not the goal state. This is how reaction delays and player forgiveness factors are implemented. Once the time threshold is passed, the awareness advances to the goal state without passing through intermediate states.
Finally, if the new pulse value is below current readings, a capacitor is used to allow awareness to degrade gradually and smoothly. Awareness decreases across some amount of time, passing through all the intermediate states. This softens the behavior of the AI once the object of interest is no longer actively sensed, but is not the mechanism by which the core AI's alertness is controlled.
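The three steps can be sketched as a single update function on a sense link. Everything concrete here is an assumption made for illustration: the `SenseLink` fields, the integer awareness levels 0 to 3, the `RISE_DELAY` table, and the fixed `DECAY_STEP` standing in for the capacitor. What comes from the text is the shape of the logic: pick the dominant sense, delay increases by a time that depends only on the current state and then jump straight to the goal, and let decreases step down through every intermediate level.

```python
from dataclasses import dataclass


@dataclass
class SenseLink:
    """Hypothetical per-object sense link."""
    awareness: int = 0        # current discrete awareness level (0..3 here)
    rise_timer: float = 0.0   # time spent receiving pulses above current level
    decay_timer: float = 0.0  # time spent receiving pulses below current level
    source: str = "none"      # which sense dominated the latest pulse


# Assumed tuning values. The rise delay is keyed by the CURRENT level only.
RISE_DELAY = {0: 0.5, 1: 1.0, 2: 0.25}  # seconds before awareness may increase
DECAY_STEP = 2.0                        # seconds to drop one level


def update_link(link: SenseLink, vision: int, sound: int, dt: float) -> int:
    # Step 1: compare the two inputs and let the dominant one set the pulse.
    if vision >= sound:
        pulse, link.source = vision, "vision"
    else:
        pulse, link.source = sound, "sound"

    if pulse > link.awareness:
        # Step 2: increases wait out a delay, then jump directly to the goal.
        link.decay_timer = 0.0
        link.rise_timer += dt
        if link.rise_timer >= RISE_DELAY.get(link.awareness, 0.0):
            link.awareness = pulse
            link.rise_timer = 0.0
    elif pulse < link.awareness:
        # Step 3: decreases drain gradually, one level at a time.
        link.rise_timer = 0.0
        link.decay_timer += dt
        while link.decay_timer >= DECAY_STEP and link.awareness > pulse:
            link.awareness -= 1
            link.decay_timer -= DECAY_STEP
    else:
        # Pulse matches the current level: nothing pending in either direction.
        link.rise_timer = 0.0
        link.decay_timer = 0.0
    return link.awareness


# Usage: a strong visual pulse must persist past the delay before it registers.
link = SenseLink()
first = update_link(link, vision=3, sound=1, dt=0.3)   # still waiting
second = update_link(link, vision=3, sound=1, dt=0.3)  # delay passed, jumps to 3
# With no further pulses, awareness steps down through each level.
fade = [update_link(link, vision=0, sound=0, dt=2.0) for _ in range(3)]
```

Keying the delay on the current level rather than the goal means a fully unaware AI can be given a generous forgiveness window, while an already suspicious one escalates quickly no matter how large the new pulse is.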
If an object of interest is no longer generating pulses, the senses incorporate a degree of free knowledge which is scaled based on the state of the AI. This mechanism produces the appearance of deduction on the part of the AI when an object has left the field of view without overtly demonstrating cheating to the player.
The system described here was designed for a single-player, software-rendered game. Because of this, all authoritative information about game entities was available to it. Unfortunately, in a game engine with a client/server architecture and a hardware-only renderer, this may not be true. Determining the lit-ness field of an object's visibility may not be straightforward. Thus, incorporating a system such as the one described here is something to do deliberately and with care, as it will place information demands on other systems.
Furthermore, although efficient in what it does, it is designed for a game that in many ways centers around the system's output. In Thief it consumes a non-trivial amount of the AI's CPU budget. This will take time away from pathing, tactical analysis, and other decision processes.
However, any game can benefit from investing in its sensing code. By gathering and filtering more information about the environment and serving it up in a well-defined manner, senses can be leveraged to produce engaging AI behaviors without significantly increasing the complexity of the decision state machines. A robust sense system also provides a clean hook for expressing "pre-conscious" behaviors by controlling and manipulating the core knowledge inputs. Finally, a multi-state sense system provides the player with an AI opponent or ally that exhibits varied and subtle reactions and behaviors without adding complexity to the core decision machines.
Because of the highly data-driven nature of the Dark Engine on which Thief was built, most of the concepts presented in this paper and all of the configuration details may be explored first-hand using a copy of the tools available at http://www.thief-thecircle.com.