Instant Replay: Building a Game Engine with Reproducible Behavior

By Patrick Dickinson

It is common for games to offer a 'replay' feature. This feature allows the player to record a sequence of game play and then watch it over again, perhaps from a different viewpoint, or in slow motion. The player may be able to save the recording to disk or memory card, or even transmit it to other players.

When faced with the task of implementing this feature, two different approaches become quickly apparent. The first solution is to store absolute information about all of the objects in the game world (including the player) on a frame by frame basis, or at fixed frequency. This would include data such as position, orientation, and so on. The replayed sequence is then constructed by streaming this information back into the game engine, and interpolating where necessary.

A second and much more elegant approach is to make use of the inherent predictability of computer software: The same sequence of operations performed on the same set of data will always produce the same result. It is reasonable to conclude that a sequence of game play may be precisely reproduced by recording only the initial state of the game, along with the player's inputs. The initial state can then be restored, and the recorded inputs reapplied, to produce the same sequence of play. This solution is instantly more appealing because the amount of data that needs to be stored is much smaller. It is also simpler to implement and maintain, as all necessary coding can take place at the 'player input' level, and remain independent of the underlying game engine.

If you've ever tried to implement a replay feature using this method, then you will know that life is not quite so simple. Despite the fact that the same program is running, with same inputs, things just don't happen exactly the same way the second time around. This is because game software does not run in isolation. It makes extensive use of externally generated run-time data which is somewhat less predictable, and undermines the natural reproducibility of the game engine itself. Just one small difference in this external data will cause the replayed sequence to diverge from the original, potentially resulting in completely different game events. This is often an insurmountable problem. Game software is highly complex, and the task of isolating and dealing with externally generated data can be overwhelming.

However, if external data is isolated at the start of development it can be done effectively and with minimal effort. Moreover, the ability to precisely reproduce a sequence of game play in this way offers many other benefits to development beyond implementing a replay feature. If you have reproducibility, then you have a way of reproducing even the most obscure bug discovered by your testers. More significantly, you also have the foundation of a low bandwidth networking solution.

This article is about building such a game engine. An engine which can record a player's inputs, reapply to them to the same initial state, and precisely reproduce minutes or even hours of game play just as reliably as a video recording. Reproducibility effects most of the components of a game engine, and is characterized by behavioral criteria rather than particular algorithms. For this reason, the focus of this article is to present ideas and design techniques which can be integrated into any game engine, rather than specific code. Most of these ideas have evolved over time through discussions with other experienced game programmers, and proved beneficial to many projects in different ways, so I will discuss how reproducibility can be used to implement other game features, such as networking. I will also highlight the main implementation problems that both I and others have encountered, along with practical solutions.

The Benefits

Maybe you are not convinced yet. Maybe you don't think that you need or even really want reproducible behavior in your software. So before continuing I shall review the benefits that reproducibility has to offer.

  1. Action Replays. If you spent more time watching the action replays in "Gran Turismo" than actually playing the game, then you will need no more convincing. This game feature represents exactly what we are trying to achieve.
  2. Debugging. Being able to reproduce bugs quickly and reliably offers big time savings to programmers. Almost every game programmer has at sometime spent hours or even days just trying to reproduce an obscure bug. Being able to automatically reproduce these kind of bugs saves time, your sanity, and your company's money.
  3. A Low-Bandwidth Networking solution. This is potentially the biggest plus-point to reproducibility and so merits a more in-depth discussion later in this article.

Towards Reproducibility : Inputs & Outputs
I have already mentioned that programs are inherently "deterministic". The same inputs applied to the the same starting state will produce the same final state. So exactly what is it that makes a game engine not deterministic? To answer this question, we first need to define and identify the inputs, outputs, and 'state' of our game engine.

An input is a datum passed to the game engine at run time from an external source, which is used in some way to modify internal data. An obvious example is a player's input. Conversely, outputs are data which are generated by the engine and passed to an external target, but which are not used internally by the game. An example output might be vertex data passed to the GPU for rendering. All remaining persistent internal data we shall term as "game state". Examples of a game state data are the speed of a car, or the player's position in the world. Having defined these terms, we can set about identifying them in a game engine. Figure 1.1 represents a simple input/output/state arrangement for a typical game engine.

Inputs, Outputs, and Game State

Some items in figure 1.1 are fairly obvious. Others less so, and warrant a more detailed discussion. For the time being we can begin to recognize the sources of non-reproducible effects in our game, and develop a strategy for dealing with them. All inputs (apart from the player's inputs) need to be isolated and either made to be predictable, or de-coupled from the game state. The player's input must, of course, change the game state in some way, and so these inputs are to be recorded and reapplied in the correct sequence and at precisely the same time that they were originally applied.

If this strategy is successful then during play the game state will undergo numerous state changes, progressing from state to another through the passage of time. When we view a replay of this sequence, the game will start in the same initial state and progress through exactly the same transitions (with the same player inputs reapplied), passing through exactly the same sequence of states, and arriving at exactly the same final state. Let's be clear about what this means: at the end of the replay, every single item of data in the game state will finish in exactly the same state as it did when the game sequence was originally played. We cannot afford for even one thing to be different at any point in the replay, or else it will diverge from the original. But so long as we isolate all of our inputs, then this state transition will occur correctly, and predictably, every time that the sequence is replayed.

It should be noted that outputs are of no concern to reproducibility. They neither effect the game state, nor the inputs. Once they have been identified we need not consider them any further. However, there are some traps to look out for. For instance, rendering data may actually used by other parts of the game, such as the collision detection system. If this is the case, then this rendering data must be considered to be part of the game state.

All that remains, then, is to deal with each of the game inputs in turn. The first input listed in figure 1.1 is that of the player, and we already have a have a good idea of what we are going to do with that. The next item in the list that requires our attention is 'time'.


Time as an Input

Most games run in real time. The animation produced by the game engine is scaled to readings made from the hardware system timer. Typically, each rendered frame takes a different amount of time to render, and so to maintain smooth, frame-rate independent animation the game engine is parameterised on these timer readings.

A typical way of implementing frame rate independence is to update the game state once for each rendered frame. At the start of each update the system timer is read, and the time that has elapsed since the last update is calculated. This "last frame time" is used to calculate the next game state, and then the next frame is then rendered, and so on. For example, if a car is moving with some velocity 'V' is the game world, and its position at the last rendered frame is 'P1', then its position for the next frame, P2, may be calculated using the simple integration:

P2 = P1 + V * last_frame_time

You may not have thought of time as being an input into your game engine before, but looking at the above equation you can see that the result of each game state transition is determined by the last measured frame time. The sequence of game states that the engine passes through is directly dependent on time readings from the system timer. Even though the animation is independent of frame rate, this dependency will undermine the reproducibility of the game engine because we cannot rely on getting the same frame times during the replay.

For example, imagine that our engine renders two frames, and that the frame-time for each update is 30 milliseconds. This will result in two state changes, each representing a time update of 30ms. Now imagine that we wish to replay this sequence, as seen from a different viewpoint in the world. From this new viewpoint, we can see fewer world objects, and consequently our engine renders at a higher frame rate. Perhaps a single frame takes only 20ms to render. Instead of generating 2 frames of 30ms, our frame rate independent engine now generates 3 frames of 20ms. The result of this is that our car's position is now slightly different. Even though it has been moving for exactly the same amount of time, inaccuracies in the integration of its velocity mean that its position is marginally different. In the replay, the car is not passing through the same set of states as it did the original sequence. Aside from this, we also have a problem reapplying the player inputs. If the inputs are read once per frame, then we cannot apply them at exactly the same time in the replay as we did when they were originally applied.

Dealing with system time readings is thus critical to reproducibility. Even when running a replay on the same computer that it was generated, there is no guarantee that the same timer readings will be generated. Even making a small change to the view point will result in different readings, and therefore a different sequence of game states. This problem is exacerbated when running a PC based game, because the OS may be executing other external processes, or the player may, perhaps, decide to upgrade his or her graphics card to achieve a higher frame rate!

Unfortunately, timer readings cannot be eliminated from the game, or even made to be predictable, if we are to retain frame rate independent animation. However, the game state may easily be de-coupled from the system time. Perhaps the simplest way to do this is to quantise the state execution time, by updating the game state at a fixed frequency. Game updates can then be controlled with a small time control loop. This loop reads the system time, and decides how many fixed-time state updates to execute. When the required number of updates have been applied, the frame is rendered. This of course means that several state updates may occur to produce a single rendered frame of animation.

The C-like pseudo code for such a system is shown below. In this case, the game state is updated at a frequency of 100Hz (10 millisecond increments).

static int execution_time; //time to be executed this frame
static int num_updates = 0;
static const int update_time = 10; //miliseconds = 100hz;
static bool playing = true;
while(playing) //frame update
{

//see how much time we need to execute for the next rendered //frame : read system timer & find elapsed time
execution_time += LastFrameTime();
while(execution_time > update_time)
{
//read & save player inputs
ReadPlayerInputs();
SavePlayerInputs(num_updates);

//update game state
UpdatePlayer(update_time);
UpdatePhysics(update_time);
UpdateAI(update_time);
//etc…

execution_time -= update_time;
num_updates++;
}
RenderFrame();


}

Notice that any unexecuted time remaining in execution_time is carried over to the next frame update. This is important, because when the game is achieving a high frame rate, no time is 'lost' and frame rate independence is maintained. This schema also gives us a perfect way to record and reapply player inputs. The inputs are read once per state update, and are recorded against the total number of state updates that have been executed at that point. This means that they can be reapplied against exactly the same game state that they were originally applied.

Reproducing a sequence of game play generated in this way thus becomes a simple task. Starting from the same initial state we can execute the replay using the same process of updating the game state at a fixed frequency. Stored inputs can be reapplied by checking the number of updates that have been executed, to see if they match the recorded number of updates of the next input in the list.

even if the frame rate is different during in the replay, the game state will pass through exactly the same sequence of states, and that the player inputs will applied at exactly the same time (and to the same game state) that they were originally. The code needed to execute a replay is shown below.

static bool replaying = true;
while(replaying) //frame update
{

//see how much time we need to execute for the next rendered //frame, as before
execution_time += LastFrameTime();
while(execution_time> update_time)
{
//reapply recorded inputs
if(TimeToApplyNextSavedPlayerInput(num_updates))
ApplyNextSavedPlayerInput();

//update game state
UpdatePlayer(update_time);
UpdatePhysics(update_time);
UpdateAI(update_time);
//etc…

execution_time -= update_time;
num_updates++;
}
RenderFrame();

}

Thus, the replayed sequence will always generate identical game behavior, even if the viewpoint is changed, or the replay is run on a different hardware configuration. It is important to note that all components of the game which modify the game state should be parameterised by the 'game time' only. Under no circumstances should a programmer be tempted to read the system time directly, as to do so and use it to change the game state will result in non-reproducible behavior

In order to reproduce a sequence of game play it is necessary to ensure that the replayed sequence starts in the same initial state as the original. It is almost certain that we will want to record sequences mid-game, and so it necessary to be able to save and reload the game state. In practice this is not difficult to achieve, and it is probable that we will require a load and save game feature anyway. However, it is important to ensure that the loaded game state is identical to the original. If it is not, the subsequent behavior may not identically match that of the original. We have now progressed a good way towards a generic design for a reproducible game engine. However there are still some remaining points to be dealt with.

Other External Data Sources
The input list in the beginning of this feature includes the somewhat ambiguous item 'External Data', which essentially means "any other source of externally generated run-time data", and may include hardware or operating system sources, or software libraries used by the game. These are specific to each individual game, (for example, some games make use of software libraries licensed written by other developers) and so need to be considered on a case by case basis. However, the important thing is that they do not damage the reproducibility of the engine. Any external data source used by the game needs to be reproducible. It is also important that it can be restored by the game when the game state is loaded.

Generally there actually few such sources, but one which is used in most games is the C/C++ random number generator provided by the standard library. This random number generator is in fact reproducible. It will always generate the same sequence given the same seed. However there is a problem with loading and saving the game state.

When a game is reloaded it is clearly necessary to restore the random number generator so that it will generate the same sequence from that point as it did after the game was saved. A tempting solution would be to re-seed the random number generator whenever a save occurs, and save the seed as part of the saved game state. However, this is not a good approach, because it means that the act of saving the game will effectively change the game state, and we wish to be able to save and reload the game freely.

The easiest way to circumnavigate this problem is to avoid using the standard library random number function altogether, and instead use our own. Fortunately there are many pseudo-random number algorithms in existence, which can be coded and included in the game engine; it is then only necessary to save the generator's data as part of the game state.


Floating Point Mathematics

In some cases, the use of floating point math can also cause problems with reproducibility. The problem arises where we wish to reproduce a sequence on a different platform from the one on which it was generated; or at least, a platform with a different hardware floating point unit. On a fixed hardware platform, such as console, this presents no problems. However if we are developing a PC based game then we must be aware that a PC is not a fixed-platform.

Different PC systems utilize different CPUs, and different CPUs have different FPUs. It may seem surprising, but different FPUs can produce marginally different results for some calculations, even though all units are IEEE compliant. This can be attributed to different internal representations within the FPU. In addition, different software builds can exhibit different behavior on the same hardware due to changes in the storage of floating point values. This means, for example, that a sequence replayed through a 'debug' build of a game may exhibit different behavior when played through a 'release' build.

In any case, this leaves us with a problem if we wish to reproduce a game sequence on different PC to the one on which it was recorded. In this respect we wish must consider the results of PC based floating point calculations to be an external data source. The obvious solution to this is to use integer (fixed point) data to represent our game state. This may well be desirable for other reasons, for example if we are planning to port code to another platform which does not have a hardware FPU. Of course, output data (such as rendering data) is independent from the game state, and so free from this restriction.

Using Reproducibility for Networking
Implementing a successful networked game can be extremely problematic. In order to maintain responsiveness it is generally necessary to run a local version of the game world on each machine, and transmit locally generated data to other participants. The inherent latency of any transmission system means that de-synchronisation is unavoidable. Eventually this de-synchronisation will result in an event occurring on one machine that does not occur on another, so it is necessary to implement some system of re-synchronisation.

Many techniques exist to deal with this problem, and almost all rely on the transmission of game state data. Once two different versions of the same world are different, it is generally only possible to re-synchronise them by sending absolute state data. Moreover, this must be an ongoing process, as de-synchronisation can occur at any time. The result is that most network games need to continuously transmit game state data in order to maintain the integrity of the shared game world. This necessarily incurs a large bandwidth penalty, seriously impeding performance across a low bandwidth connection such as a modem.

Using a reproducible game engine offers some big benefits to networked games. Firstly, since the same sequence of inputs always produces the same result, participants can re-synchronise simply and reliably by transmitting and processing player inputs: If two game engines have started from the same state, and processed the same inputs, they will reach identical states. This reduces required bandwidth considerably, as no absolute game state information is involved. Secondly, a network model based on the transmission of player inputs is much simpler to implement. The data structure of all required messages is already clearly defined by the player input structures, and the game engine already has functions for processing received messages. This approach is sublimely simple, and in principal scales to any level of software complexity.

For further reading on this subject, Peter Lindcroft presents an interesting account of a network game based on a reproducible game engine architecture in his article "The Internet Sucks: Or, What I Learned Coding X-Wing vs. TIE Fighter", including some of the problems encountered by his team.

Using Reproducibility for Debugging
From a development perspective, one of the most beneficial uses of reproducibility is as an aid for debugging. Often, a particular bug will occur under very specific conditions and therefore be difficult to reproduce. In fact it is common for some bugs to occur only once in a number of hours of game play. Reproducibility can be used to quickly recreate such problems. This is best achieved by periodically saving the game state, and each subsequent input. It is then straightforward to reload the game state prior to a bug being observed, and then reapply the sequence of inputs to exactly reproduce the problem. This may also be achieved on different machines, of course, which is useful if your game is being tested at a different location.

Some Final Words

Despite some limitations, I have found reproducibility to be immensely useful in my own work, particularly as an aid to debugging. As a design-driven approach it is a classic case of working "smarter" rather then harder: a little planning at the start of development brings big benefits later on. It allows teams to easily add features that would otherwise have been problematic, and to consider other features that would not otherwise be possible.

The key thing to bear in mind is that a game engine either exhibits reproducible behavior, or it doesn't. This means that reproducibility is easiest to implement at the start of development, but becomes increasingly difficult to add later on. If you would like to enjoy the benefits it has to offer in your own projects then it is important to apply some thought and preplanning at the start in order to reap the rewards later on. But then again, isn't that always the way with game development?

References

Peter Lindcroft, "The Internet Sucks: Or, What I Learned Coding X-Wing vs. TIE Fighter"

______________________________________________________

 

 

Return to the full version of this article
Copyright © UBM Tech, All rights reserved