Using Particle Swarm Optimization for Offline Training in a Racing Game

December 13, 2005

System Parameters

There is no hard and fast rule as to how many particles should be used to solve a specific problem. A large number of particles allows the algorithm to explore the search space faster; however, the fitness function needs to be evaluated for each particle, so the number of particles has a huge impact on the speed at which the simulation runs. Generally speaking, as the complexity of the search space increases, so should the number of particles.

The inertia weight, w, in the velocity vector update equation, is a scaling variable that controls the influence of the previous velocity when calculating the new velocity. Inertia weight values larger than one will typically cause the particle to accelerate and explore larger regions of the search space, while smaller values will cause the particle to gradually slow down and do a finer search of a region [Van den Bergh 2002]. Many algorithms tend to decrease the inertia weight over time, allowing particles to initially roam a larger area in search of optima, and then to gradually do finer searches [Engelbrecht 2002].
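As a sketch of the decreasing-inertia idea described above (the start and end values of 0.9 and 0.4 are common choices from the PSO literature, not values taken from this article):

```python
# Linearly decrease the inertia weight from w_start to w_end over the run,
# so particles roam widely early on and search more finely later.
def inertia(epoch, max_epochs, w_start=0.9, w_end=0.4):
    frac = epoch / max(max_epochs - 1, 1)
    return w_start + (w_end - w_start) * frac
```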

An early addition to the basic PSO algorithm was to place an upper limit on the velocity of a particle to prevent particles from moving too rapidly through the search space. Clerc and Kennedy later proved that multiplying the velocity vector by a so-called constriction coefficient made velocity clamping unnecessary [Van den Bergh 2002]. The constriction coefficient is a function of the two acceleration constants, and the rule applies only when their sum is greater than four.
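The Clerc-Kennedy constriction coefficient can be computed as follows (a sketch based on the standard formulation from the PSO literature, not code from this article; c1 = c2 = 2.05 is a commonly used pair of acceleration constants):

```python
import math

def constriction(c1, c2):
    """Clerc-Kennedy constriction coefficient; requires phi = c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4:
        raise ValueError("c1 + c2 must be greater than 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))
```

With the common choice c1 = c2 = 2.05, this yields a coefficient of roughly 0.73, which scales the whole velocity update downward each step.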

The local and global component variables, l and g, control the influence of the personal best and global best positions respectively. They are defined as l = r1 c1 and g = r2 c2, where r1 and r2 are random values between 0 and 1. c1 and c2 are acceleration constants that are usually set to values close to 1, and it has been shown that the following equation must be satisfied to ensure convergence:

w > ½ (c1 + c2) – 1
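Putting the pieces together, the velocity and position update for the basic (gbest) algorithm can be sketched as follows; this is an illustration of the equations above, not the article's actual implementation:

```python
import random

def pso_step(positions, velocities, pbest, gbest, w=0.7, c1=1.4, c2=1.4):
    """One gbest PSO update, modifying positions and velocities in place.

    positions, velocities, pbest: lists of equal-length parameter vectors;
    gbest: the single globally best position found so far.
    """
    for i, x in enumerate(positions):
        for d in range(len(x)):
            r1, r2 = random.random(), random.random()  # l = r1*c1, g = r2*c2
            velocities[i][d] = (w * velocities[i][d]
                                + r1 * c1 * (pbest[i][d] - x[d])
                                + r2 * c2 * (gbest[d] - x[d]))
            x[d] += velocities[i][d]
```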

Neighborhoods

The standard algorithm as described above is also called the global best (gbest) algorithm, because the globally best particle is followed by all the particles. The local best (lbest) algorithm, however, more closely follows the original model created by Reynolds. It differs from the gbest algorithm in that the swarm is divided into a number of neighborhoods, where each particle is only aware of the particles in its own neighborhood. There is no global best particle, but rather a local best particle for each neighborhood. Each particle is therefore only influenced by its neighbors, and not the whole swarm. The gbest algorithm can be seen as the lbest algorithm with one neighborhood consisting of all the particles.

The size of neighborhoods plays an important role in the lbest algorithm. It has been shown that smaller neighborhoods result in slower convergence, but generally lead to better results, since a larger part of the search space is explored and particles are less likely to be trapped in local optima [Engelbrecht 2002].
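A common way to define neighborhoods in the lbest algorithm is a ring topology over particle indices; the following sketch (an illustration, not code from the article) finds each particle's neighborhood best, assuming lower fitness is better, as with lap times:

```python
def neighborhood_best(pbest_positions, pbest_fitness, radius=1):
    """Ring-topology lbest: for each particle, return the best personal-best
    position among the particles within `radius` indices, wrapping around."""
    n = len(pbest_fitness)
    out = []
    for i in range(n):
        nbrs = [(i + k) % n for k in range(-radius, radius + 1)]
        best = min(nbrs, key=lambda j: pbest_fitness[j])
        out.append(pbest_positions[best])
    return out
```

A larger `radius` makes information spread through the swarm faster (quicker convergence), while a smaller one keeps sub-swarms exploring independently for longer.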

A few adaptations to the lbest algorithm have been suggested [Van den Bergh 2002]. Suganthan based his neighborhood assignment on physical proximity rather than the index of a particle. Particles that are close enough to one another are said to be in the same neighborhood, and this is continually re-evaluated so that neighborhoods change over time. Typically, smaller neighborhoods are formed initially, with fewer and larger neighborhoods forming as the algorithm progresses. This allows particles to search larger areas initially and helps to avoid premature convergence (i.e., a particle getting stuck in a local optimum instead of the global optimum). Kennedy proposed a social stereotyping method where particles follow the centroids of neighborhoods rather than individual particle positions. This originated from the idea that people in a group would rather follow the collected beliefs of the group than the beliefs of one individual in the group.

Many other variations and improvements have been made to the basic PSO algorithm over the last decade, but these are beyond the scope of this article. More information can be found in the references listed at the end of the article.

Training AI Racing Parameters

Like most other racing games, our game consisted of a player competing against several computer opponents racing around a track. Since the AI opponents drove the same vehicle as the player, their competitiveness was based purely on the decisions they made and the way they handled their vehicles. The familiar racing-line technique was used to convey to the AI the layout of the track and a generally good line to drive along. Using the racing line as a guide, the AI still had to make decisions on when and how much to brake for a corner, how much to correct for over- or under-steering, and so on. These decisions were driven by parameter values specified in an external data file for each AI. To arrive at good values for these parameters, we used the PSO algorithm to train them.

Figure 3: The racing line is shown here floating above the track.

The PSO algorithm was used to train 10 different parameter values simultaneously, some of which were mentioned above. The simulation was set up with 20 particles, each represented by an AI with randomized parameter values. All the AI drove the same vehicle, and inter-object collisions were disabled so that the AI could not affect one another. For each epoch of the simulation, the AI all started at exactly the same position and raced around the track; the fitness of an AI was the total time taken to complete a lap. Since some AI drove so badly that they never completed a lap, a limit was placed on the amount of time an AI could take to complete one. The simulation was set to run for a maximum of 100 epochs.
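The epoch's fitness evaluation can be sketched as follows. Note that `run_lap` and the 100-second cap are hypothetical stand-ins for the game's actual simulation hooks and time limit, which the article does not specify:

```python
MAX_LAP_TIME = 100.0  # assumed cap for AI that never complete a lap

def evaluate_epoch(particles, run_lap):
    """particles: list of AI parameter vectors.
    run_lap: callable returning the lap time in seconds, or None if the
    AI failed to finish within the cap. Lower fitness is better."""
    fitness = []
    for params in particles:
        t = run_lap(params)
        fitness.append(MAX_LAP_TIME if t is None or t > MAX_LAP_TIME else t)
    return fitness
```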

Because it was quite a high-dimensional search space, the algorithm would sometimes settle in local minima and yield poor results even after the maximum number of epochs had elapsed. Other times, however, good results were obtained within the first few epochs of the simulation. We ran quite a few simulations and chose the best results from these.

Driving around the track myself, I achieved a lap time of about 70 seconds. In one of the PSO simulations, the initial best time (fitness) achieved was somewhere around 80 seconds, with the average lap time of all the particles a very poor 115 seconds. After 100 epochs, however, the best lap time was about 63 seconds, and the average lap time around 65 seconds. Using the parameters of the best particle and putting them back into the game, I found it nearly impossible to beat the AI.

Typically, in a racing game, it is desirable to have AI opponents of various difficulties. Since each particle represents a set of parameter values, particles with varying fitness can be chosen and used within the game to achieve this.

Since the fitness evaluation of the simulation depended on the AI completing a whole lap, simulations were quite lengthy. To speed the simulation up, rendering and frame-rate clamping were disabled to make the simulation run as fast as possible without negatively affecting the physics and AI computations.

Training Physics Parameters

Once we had trained the AI parameters for our game using the PSO algorithm, we realized that it would also be a useful technique to train the physics parameters for our vehicles. Specifically, we needed to obtain spring values such that the car's wheels would both settle at the right height, and the car's physics would remain stable.

The PSO algorithm was used to train the spring values of each of the vehicle's four wheels. The simulation was set up with 16 particles, each represented by a vehicle with randomized parameter values. Once again inter-object collisions were disabled. For each epoch of the simulation, all the vehicles were moved to the same position a few meters above the ground and then dropped. After a few seconds, the fitness of each of the particles was measured based on how close the car's wheels were to their ideal positions and how quickly the car had settled down.
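A fitness function for this setup might look like the following sketch; the weighting of height error against settling time is an assumption on our part, as the article does not give the exact formula:

```python
# Hypothetical fitness for the spring-tuning scenario described above:
# penalize both wheel-height error and settling time. Lower is better.
def spring_fitness(wheel_heights, ideal_heights, settle_time,
                   height_weight=1.0, time_weight=0.1):
    height_error = sum(abs(h, ) if False else abs(h - i)
                       for h, i in zip(wheel_heights, ideal_heights))
    return height_weight * height_error + time_weight * settle_time
```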

Generally, the physics training led to good results: the trained parameters produced a car with stable physics that could then be driven around, leaving only minor manual tweaks.

Conclusion

The Particle Swarm Optimization algorithm has been in existence for the past decade, but is still relatively unknown in the gaming community. It is an easy algorithm to implement and can be used to solve the same problems that Genetic Algorithms are generally used for. We used the PSO algorithm to train both physics and AI parameters for our racing game and found that it gave good results that needed little or no manual tweaking.

If you would like to find out more about the PSO algorithm, you can visit the Computational Intelligence Research Group's website.

References

[Engelbrecht 2002] Andries P. Engelbrecht, Computational Intelligence: An Introduction, Wiley, 2002.

[Rabin 2002] Steve Rabin, AI Game Programming Wisdom, Charles River Media, 2002.

[Van den Bergh 2002] Frans van den Bergh, An Analysis of Particle Swarm Optimizers, PhD Thesis, Department of Computer Science, University of Pretoria, 2002.
