# Aces High: Numerical Techniques in Poker AI

**September 30, 2010**

**Page 1 of 4**

*[Game programmer Simon Tomlinson ( Need For Speed: Shift) analyzes techniques for working with poker AI in this advanced technical article, offering up tricks for generating AI behavior in the well-studied game.]*

This article looks at the use of numerical techniques with application to poker AI, both in off-line balancing and in-game decision making. If you are not familiar with No Limits Texas Hold'Em Poker, the game rules and basic AI structure are discussed in a companion article -- Poker AI: A Starting Point, which is live now on Gamasutra's sister site, GameCareerGuide.

However, two significant areas were not described there in full: how to calculate hand win probabilities, and how to balance the AI characters which are essential to realistic game play. Both problems can be attacked using numerical optimization techniques. Indeed these techniques can be used in a wide range of AI balancing and live decision making applications.

**Numerical Techniques Overview**

There are three broad classes of numerical computing techniques: integration, root finding and optimization. Integration is usually associated with physics engines -- primarily integrating equations of motion over time. Root finding is the process of discovering the zero values in a function of one or more dependent parameters.

Optimization is finding the minimum (or sometimes maximum) of a value which is a function of one or more parameters. It is important to realize that the numerical problem need not be a real life physical equation -- it could be a parametric model that has developed to represent some situation. Here we are concerned mainly with optimization.

There are many well-known solution algorithms for numerical problems. Root finding and optimization problems generally proceed by calculating an approximation to the solution which is an improvement of a previous guess, and repeating the cycle (or iteration) until some acceptable accuracy level is reached (convergence).

These methods are further sub-divided in terms of the "order" of the solution. Higher order solutions use derivatives of the problem formula in order to calculate the next solution approximation more accurately -- thus higher order algorithms converge in fewer iterations.

However there is a trade off: the cost of calculating the derivatives could be higher overall than using a lower order algorithm over more iterations. This can be mitigated by using hybrid techniques where higher order derivatives are approximated using partial differences -- the difference between two previous formula calculations divided by the known change in the dependent parameter.

It should also be noted that an optimization problem can always be converted to a root finding problem by differentiation which can in some cases make it easier to solve. Of course when the equation has many dependent parameters the problem becomes significantly more complex.

By now some readers might be getting a little worried about using these techniques due to their complexity. However there is hope. There are some very simple zero order methods which are easy to understand, simple to implement and which rely on using modern computing power to solve problems in acceptable timescales.

The simplest conceptual technique is to change each parameter by a small amount and evaluate the function, keeping the change if the new value is closer to the target (root, minimum or maximum) and discarding it otherwise. However we can easily do better than that.

The "Monte Carlo Technique" (MCT) relies on random numbers and the statistics of "random walks" to progress towards the solution on a timescale significantly shorter than the step by step search. Essentially this technique chooses one or more parameters from the set at random and changes them by small random amounts before re-calculating the function and retaining improvements.

Genetic Algorithms can be thought of as an extension of this technique where more than one current solution approximation is stored in memory (the parents) and a group of new candidate solutions (the children) are generated either by randomly mixing the parents or adding in further random "mutations". All the children are evaluated and the best two or more are retained as parents for the next generation.

This is an improvement over MCT because it is keeping open more options per cycle and therefore allowing for the existence of clusters of near optimum parameters in each group but there are special considerations for setting up a GA to maximize efficiency.

**Shaping the Problem Space**

Before building a solution it is worth considering the nature of the problem being solved carefully. In many cases good design here can make the numerical algorithm far more efficient. The biggest single consideration is the number of parameters. The fewer the number of parameters the faster the algorithm will converge.

If the number of parameters is very large higher order algorithms can become untenable because of the sheer number of partial derivative permutations. Constructing the parameter space efficiently for numerical methods can often affect the design of the whole AI system, so think about this early. I usually try to use a one dimensional array of real values, which can be copied, or re-interpreted through the use of a Union. This keeps the numerical algorithm generic.

It is also useful to pre-scale the parameters so that they all have a similar magnitude. This helps because the solution algorithm can be more agnostic as to the details of the problem function and therefore apply the same parameter change magnitude to any parameter.

More importantly the numerical solution is more stable where similar changes in the input parameters produce a similar change in the function value. To understand this think about the converse case -- if one parameter produces only a tiny change in the function for a large change in value it will tend to drift around in the noise as other parameters dominate.

**Page 1 of 4**