| |
|
|
||||
![]() |
||||||
| |
|
|||||
|
Editor's note: This paper was originally published in the 2000 Game Developer's Conference proceedings
3.1 Game Trees And Minimax Search The root of a game tree represents the current state of the game. Each node in the tree can have any number of child nodes. Each child of a node represents a new state after a legal move from the node's state. This continues until we reach a leaf, a node with no child nodes, in the game tree. We assign a payoff vector to each leaf in the game tree. In a generalized game tree, this payoff vector represents the utility of the final position to both players. In general, winning a game represents a positive utility or a player, while losing a game represents a negative utility. Since the game is a two-player zero-sum game, the utility for the first player must equal the negative of the utility for the second player. The utility for the side to move at the root of the tree is usually the only one given to save space. In Figure
2, an example of a game tree for a game of Naughts and Crosses (or Tic-Tac-Toe)
is given. Note that the two players take alternating turns at different
levels of the tree. X moves
at the root, while the opponent, O, moves at the first level below the root. A position is normally categorized by how many levels down in the game tree it is located. The common term for this is ply. The root is said to be at ply 0, while the immediate successors of the root are said to be at ply 1, et cetera. Naughts and Crosses, like chess and checkers, has only three possible outcomes for a player: win, loss or draw. Normally, we assign the payoff of +1, 0 and -1 to a win, draw or loss for the player to move, respectively. These payoffs are given in Figure 2 at the bottom of each leaf position, with respect to the player with the crosses. We will give names to each player to simplify our discussion. Let us call the player to move in the initial position Max and the opponent Min. At each node in the tree where Max has to move, Max would like to play the move that maximizes the payoff. Thus, Max will assign the maximum score amongst the children to the node where Max makes a move. Similarly, Min will minimize the payoff to Max, since that will maximize Min's payoff. The maximum and minimum scores are taken at alternating levels of the tree, since Max and Min alternate turns. In this
way, all nodes in the tree can be assigned a payoff or minimax value,
starting from the leaves of the tree and moving up the tree towards
the root. In Figure 3, we give minimax values for all nodes in
our Naughts and Crosses game tree (Figure 2). These minimax values tell
us what the best possible outcome for Max is in any position within
the game tree, given that Min will do its best to foil Max's plans.
variation.
Note that in our game of Naughts and Crosses, the side playing the Crosses
will draw the game, but only if an X is played in the lower central
square. Playing to either square in the top row can lead to a loss for
the Crosses, if the opponent plays the best move. In Listing 1, we have left out some of the details. For example, we have not defined what a position is, since this is game-dependent. There are three additional functions that would be required to implement the minimax search: (1) EndOfGame, which determines whether the game is over at the input position, returning TRUE if the game is over; (2) GameValue, which accepts a position as a parameter, determines who has won the game, and returns the payoff with respect to the player Max; and (3) GenerateSuccessors which generates an array of successor positions (p.succ[]) from the input position, and returns the number of successors to the calling procedure. Note that
Maximize() and Minimize() recursively call one another until a position
is reached where the EndOfGame() function returns TRUE. As each successor
of a node is explored, gamma maintains the current assessment of the
position, based on all of the moves that have been searched so far.
Once all successors have been examined, the minimax value for that position
has been computed and stored in gamma, which can be returned to a higher
level within the tree (please refer to Listing
1). This formulation
requires exactly the same amount of work as the matrix formulation did,
but further pruning can be done on this tree. The
|
|
|