Pathfinding AIs tend to fall short in one of two ways: either they are so incredibly effective and efficient that they don't convey intelligence as much as omniscience, or they are so incredibly inept that human observers can't help but feel frustration watching them. The first case is exemplified by an agent which ignores locally acceptable paths in favor of the ideal global path, thereby giving away the secret that it has knowledge of the entire map. The second case can be seen in a wide variety of stupid behavior, but my favorite example is the agent which travels along a river to avoid the relatively small penalty of crossing water, only to plow through an enemy camp and ultimately meet its doom. Anyone familiar with real-time-strategy gaming has no doubt met this breed of pathfinder before.
The problem, assuming we choose not to endow our pathfinders with God-like knowledge of the Universe, is that we are trying to simulate the sort of decision making employed by a computer that we still don't understand - the human brain. Rather than trying to hard code something this complex, perhaps we should be allowing our algorithms to develop over thousands and thousands (or millions and millions) of generations, using the same evolutionary mechanism that nature has relied on all along.
Genetic programming is a technique which uses the genetic operators of reproduction, mutation and cross-breeding to find programs which solve a given problem. The goal is to develop abilities in a population of programs in the same way living organisms have developed their abilities - namely, through evolution. Although at first the idea of computer programs reproducing sounds bizarre, it is less so when you realize that we are simply abstracting biological concepts to the point where we can sensibly apply them to programming.
Genetic programming, and genetic algorithms in general, are approximate methods for finding a solution to a given problem, meaning they aren't guaranteed to produce the correct solution, but rather a solution which has a high probability of being correct. If our problem is to find a reasonably intelligent and realistic pathfinding algorithm, this isn't a deal breaker, since we make no assumption about there being only one correct solution. In fact, anything that looks good to the human eye will do. Ultimately, then, we can set the rules for a population and its method of natural selection, and let it go. We stop the process when the most-fit algorithm looks as if it is behaving both wisely and realistically.
Rules and Requirements
Although applying genetic programming techniques to the pathfinding problem is fairly easy, there are several rules and requirements that must be considered before beginning. This article is based on a project called Hampton, named after the Hampton Court hedge maze. The project is available at http://www.stromcode.com/Hampton along with a technical write-up and source code. Although requirements will vary depending on the project, the Hampton project was developed with an RTS in mind; specifically it evolves pathfinders good at maneuvering through square grids on which individual squares are either unpassable, or passable with varying penalties.
For our purposes, maps are square grids 40x40 in size. Moving from one square in the grid to another constitutes a single move, and upon moving the agent acquires a penalty in the range 0-9 for that square. Additionally, squares can be impassable, and moves outside the map are impossible. If an agent attempts an illegal move, it immediately "dies" and effectively fails.
Hampton allows you to create custom maps and can generate random maps (which we'll see when talking about adaptation).
In order to evaluate the programs in a given generation, we need a way to score a program's performance. While the specific fitness function used will vary depending on the specific application, a simple method, and the one used by Hampton, is to simply run the program on the map and sum the number of moves it makes with the penalties it has accumulated along the way. Because we are looking for programs that ultimately reach their goal, we penalize programs that fail to do so by adding a substantial value - a "death penalty" if you will - to their final score (a value which will assuredly put the failing program's score outside of the range of scores a successful program might achieve).
Once this process is complete for all programs in the population, we can easily find the most fit program by simply sorting the programs according to fitness, and selecting that program which has achieved the lowest score.
Of primary importance is the syntax of the language which GP uses to build candidate programs. LISP is common for genetic programming, although ultimately it will depend on the application. Most importantly, it should have a structure that lends itself to reproduction and mutation. Programs that can be represented as trees work well for this, as we will soon see.
For the Hampton project, I developed a specific language consisting of a set of terminals and functions. The functions are questions which the program can ask about its current situation, while the terminals are the moves it can make once it has asked the questions it deems necessary. Since we want our agent to behave realistically, we strive to give it access only to the information and moves a human player would have access to. For example, we limit its environmental knowledge only to a few squares immediately ahead of it - to see more, it will have to physically move to another location and begin asking questions anew.
set of functions and terminals is shown below. The function set
developed gradually, as earlier function sets were determined to be too
limiting. For example, the isGoal* functions were added when it was
recognized that programs performed better with a sense of direction.
Without these functions, agents would wander around in circles until
they blindly hit their goal. The doBoth function was added to give
programs memory, and will be discussed in detail when the topic of
memory is addressed.
|isWaterAheadX||Detect water within X squares|
|isEnemyAhead||Detect enemy units one square ahead|
|isBlockedAheadX||Detect any roadblock within X squares|
|isSteeperAhead||TRUE if penalty ahead greater than current penalty|
|isLessSteepAhead||TRUE if penalty ahead less than current penalty|
|isGoalLeft||test relative direction of goal|
|isGoalRight||test relative direction of goal|
|isGoalAhead||test relative direction of goal|
|isGoalBehind||test relative direction of goal|
|doBoth||Execute both children|
Our set of terminals (moves) is much simpler. An agent can either move forward one square, or rotate itself left or right by one quarter turn. These nodes are represented as moveForward, turnLeft and turnRight.
Once our language is defined and the fitness function determined, we need to be able to quickly and easily run the program against the map and determine its score. This is where we see the benefit of requiring programs that can be represented as trees. Programs are run by traversing the tree until a terminal is reached, and then returning to the root node and repeating, until either the goal is reached or the program makes an illegal move and dies.
Because the function set is composed of only boolean functions, our trees are binary. When a function node is encountered, the system determines the answer to the question, and follows the right branch if the answer is yes, or the left branch if the answer is no. Once a terminal is read, the move is made and program execution continues again from the root.
The exception to the previous discussion on processing function nodes is the doBoth function, which was added to the function set in order to give programs memory. When the interpreter meets a doBoth terminal, both of its child nodes are executed. To prevent programs from executing in multiple threads, at least one of the children of a doBoth function node is required to be a terminal.
Effectively, this allows a program to inspect the landscape ahead of it, make a turn (or a move, if that seems reasonable) and ask questions about its new situation, without forgetting what was asked about its previous situation. If the justification for giving programs memory is not immediately clear, consider the situation where an agent analyzes a position and opts to turn left. If each of the remaining three directional orientations is worse than the first, the agent must either choose a worse direction to move forward, or enter a loop state. After all, without memory, once the agent returns to its original orientation, it has forgotten it has been there before and will again turn to a new orientation.
The effect of adding the doBoth function to the function set was impressive - programs began to actually demonstrate recognizable strategies. For example, programs without memory were prone to falling into traps such as long hallways, but with memory they quickly adopted wall-hugging strategies which allowed them to enter a hallway and find their way out of it. Memory also allows programs to return to previous locations and take a path which it previously passed up. Without memory, returning to a previous location could only result in another loop condition.
There are other ways to give your programs memory. You could set up a small memory register, for example, and give the program some means of writing and reading from it. Or you could represent your programs as finite state machines, and give them a stack to use to store information about their environment (essentially creating a pushdown automaton).
I can't imagine what advantage one approach would have over another, but the idea of embedding memory into the program itself (via a function like doBoth) seems to be the simplest and most elegant solution.