[In this technical article, originally printed in Game Developer magazine late last year, veteran game programmer Llopis looks at data baking - that is to say, the steps that 'take raw data and transform it into something that is ready to be consumed by the game'.]
Who doesn't like a warm cookie, straight out of the oven? Cookies are created from their raw ingredients (flour, butter, sugar, eggs), which are mixed together, portioned into bite-sized pieces, and baked in the oven until everything comes together just so.
The baking process transforms a set of ingredients that are rather unappealing by themselves into a delicious and irresistible treat.
Data baking (also called conditioning) is very similar. It's a process that takes raw data and transforms it into something that is ready to be consumed by the game.
Data baking can range from being a complex and involved process that totally transforms the data, to a lightweight process that leaves the data in almost its original format.
Since it's an internal process, totally invisible to the player, data baking rarely gets the attention it deserves. As it turns out, the way data is baked and loaded in the game can have a profound impact not just on your game architecture, but also on the development process and even the player experience.
The main goal of data baking is to achieve very fast loading times. The player will have a much smoother experience by being able to start the game right away, and team members will be able to iterate and try different options far more easily if the game loads in two seconds rather than a full minute.
There are also some secondary goals achieved by good data baking: data validation, minimizing memory fragmentation, fast level restarts, and simpler data streaming.
Loading times for a game are determined by two things: disk I/O time and processing time. Inefficient disk I/O patterns can dominate loading times, taking close to 100 percent of the full load time.
It's important to make sure your disk I/O operations are efficient and streamlined before you start thinking about gaining performance from baking data. Minimize seeks, avoid unnecessary blocking, lay out data sequentially, and follow the rest of the usual best practices for efficient disk I/O.
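As a rough illustration of sequential layout, here is a minimal sketch that packs several assets into one file so the loader can pull everything in with a single read and no seeks. The pack layout and the names `pack_assets` and `load_pack` are made up for this example; they are not taken from any particular engine.

```python
import io
import struct

# Hypothetical pack layout (an assumption for this sketch):
#   u32 asset count
#   per asset: u16 name length, UTF-8 name, u32 payload size
#   then all payloads back to back, in the same order as the index

def pack_assets(assets):
    """Offline: write the index first, then every payload sequentially."""
    out = io.BytesIO()
    out.write(struct.pack("<I", len(assets)))
    for name, payload in assets.items():
        encoded = name.encode("utf-8")
        out.write(struct.pack("<H", len(encoded)))
        out.write(encoded)
        out.write(struct.pack("<I", len(payload)))
    for payload in assets.values():
        out.write(payload)
    return out.getvalue()

def load_pack(stream):
    """Runtime: one sequential read, then slice the buffer without copying."""
    blob = memoryview(stream.read())
    (count,) = struct.unpack_from("<I", blob, 0)
    offset = 4
    entries = []
    for _ in range(count):
        (name_len,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        name = bytes(blob[offset:offset + name_len]).decode("utf-8")
        offset += name_len
        (size,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        entries.append((name, size))
    assets = {}
    for name, size in entries:
        assets[name] = blob[offset:offset + size]  # a view, not a copy
        offset += size
    return assets
```

In a real game the stream would be a file opened in binary mode. The point is only that the order of bytes on disk matches the order in which the loader consumes them.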
What follows is a summary of an ideal baking process from a high level.
These steps happen offline:
Which leaves only the following steps at runtime:
Notice that we've done all the heavy lifting during the baking process offline, and the steps performed at runtime are very simple and very fast.
This illustrates what I used to call the Fundamental Rule of Data Loading: Don't do anything at runtime that you can do offline. Can you generate mipmaps offline? Can you generate pathfinding information offline? Can you fix up data references offline? You know the drill. This rule reflects the fact that it is often much faster to load data than it is to do processing on it.
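To make the rule concrete, here is a small sketch of fixing up data references offline. The raw data refers to parents by name; the bake step validates those names and turns them into indices, so the runtime loader only has to unpack fixed-size records. The record format, the `ENT0` tag, and the function names are all assumptions invented for this example.

```python
import struct

# Hypothetical baked format (an assumption for this sketch):
HEADER = struct.Struct("<4sI")  # magic tag, record count
RECORD = struct.Struct("<3fi")  # x, y, z, parent index (-1 means no parent)

def bake_entities(raw_entities):
    """Offline: validate and resolve name references, emit a flat blob."""
    index_of = {e["name"]: i for i, e in enumerate(raw_entities)}
    out = bytearray(HEADER.pack(b"ENT0", len(raw_entities)))
    for e in raw_entities:
        parent = e.get("parent")
        if parent is not None and parent not in index_of:
            # Bad data is caught here, long before the game ever runs.
            raise ValueError(f"{e['name']}: unknown parent {parent!r}")
        parent_index = -1 if parent is None else index_of[parent]
        out += RECORD.pack(*e["position"], parent_index)
    return bytes(out)

def load_entities(blob):
    """Runtime: no string parsing and no name lookups, just unpack records."""
    magic, count = HEADER.unpack_from(blob, 0)
    if magic != b"ENT0":
        raise ValueError("not a baked entity file")
    body = memoryview(blob)[HEADER.size:HEADER.size + count * RECORD.size]
    return [(rec[:3], rec[3]) for rec in RECORD.iter_unpack(body)]
```

All the expensive and error-prone work (name lookups, validation) lives in `bake_entities`; `load_entities` does nothing the data doesn't strictly require.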
However, that has changed to a certain extent with the current generation of hardware. Disk bandwidth hasn't increased much, but the amount of memory and the available CPU power have gone up significantly.
So we can amend the previous rule to allow for the possibility of trading some CPU computations for a reduction in data size when possible. For example, you might want to decompress data at load time, or even generate some data procedurally while other data is being loaded.
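One way to picture that trade is to compress at bake time and decompress at load time. The sketch below uses Python's standard `zlib` module; the four-byte size prefix and the function names are assumptions made up for this example, and a shipping game would pick whatever codec suits its platform.

```python
import zlib

def bake_compressed(payload, level=9):
    """Offline: spend as much CPU as we like to shrink the file."""
    # Prefix with the uncompressed size so the loader can sanity-check it.
    return len(payload).to_bytes(4, "little") + zlib.compress(payload, level)

def load_compressed(blob):
    """Runtime: read fewer bytes from disk, pay a little CPU to inflate them."""
    expected_size = int.from_bytes(blob[:4], "little")
    payload = zlib.decompress(blob[4:])
    if len(payload) != expected_size:
        raise ValueError("baked file is corrupt or truncated")
    return payload
```

Whether this is a net win depends on how the time saved on disk reads compares with the time spent decompressing, so it is worth measuring on the target hardware.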
This process is an ideal one to shoot for, but it's not always possible. On a PC, for example, we might not know the exact memory layout of our data or what the compiled version of our graphics shaders will look like, so we might have to do some processing at load time.
Also, with an established code base, it might be impractical to take the rule to an extreme because it could take a large amount of manpower to precompute everything possible. In that case, we're trading some amount of optimal data loading for faster development.
Just don't overdo it: if the hit on loading times is significant, your whole development will suffer from slow iteration times.