Just like in Gulliver's Travels, both formats make perfect sense depending on how you visualize memory and data, and there isn't any advantage to using one over the other.
Some architectures use one and some use the other one; just be aware of which format is used in each of your target platforms and format your data accordingly.
As an example, Intel and AMD-based CPUs use the little-endian format, whereas the PowerPC-based CPUs in the Microsoft Xbox 360, Sony PlayStation 3, and Nintendo Wii use the big-endian format. Some platforms go as far as being able to switch between the two memory formats.
It's not just the CPU that needs to be little-endian or big-endian. Any hardware that fetches multi-byte data types from memory needs to be aware of the format of that data. Most GPUs can work in either mode for that reason and are customarily set to match the CPU format to keep programmers from going insane.
Data endianness is something that programmers only have to be aware of when sharing binary data between different platforms. You might never have to think about data endianness if you're only developing for a single platform. You really don't care in what order those bytes are stored in memory; you just load them into a register and the CPU takes care of fetching them in the correct order.
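To see the order that the CPU normally hides from you, you can copy a value's raw bytes out of memory and look at the first one. The following is only a minimal sketch; the helper name HostIsLittleEndian is made up for illustration and is not part of any library.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns true when the host stores the least
// significant byte of a multi-byte value at the lowest address.
bool HostIsLittleEndian()
{
    const std::uint32_t value = 0x11223344;
    unsigned char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value)); // raw in-memory bytes

    // Little-endian memory holds 44 33 22 11; big-endian holds 11 22 33 44.
    return bytes[0] == 0x44;
}
```

Code that only loads the value into a register and uses it never needs a check like this; it matters only once the raw bytes leave the machine.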
An example of a common situation in which data endianness is crucial is network communication. Binary data is transmitted over network packets and might be received by very different platforms.
Fortunately, to allow different machines to communicate with each other and interpret the data in the same way, everybody agreed on a standard network format for binary data: big-endian.
The network sockets API provides a set of standard functions to convert long and short data types between the host format and the network format (htonl, htons, ntohl, and ntohs). They do nothing on hosts with a native big-endian format, and swap bytes around on little-endian platforms.
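As a rough illustration of how those functions are typically used, the sketch below writes a 32-bit value into a packet buffer in network order and reads it back. It assumes a POSIX system, where the functions are declared in <arpa/inet.h>; the helper names WriteNetworkU32 and ReadNetworkU32 are hypothetical.

```cpp
#include <arpa/inet.h> // POSIX header; assumed available on the host
#include <cstdint>
#include <cstring>

// Hypothetical helper: store 'value' in the packet in network (big-endian) order.
void WriteNetworkU32(unsigned char* packet, std::uint32_t value)
{
    const std::uint32_t wire = htonl(value); // no change on big-endian hosts
    std::memcpy(packet, &wire, sizeof(wire));
}

// Hypothetical helper: read a network-order value back into host order.
std::uint32_t ReadNetworkU32(const unsigned char* packet)
{
    std::uint32_t wire;
    std::memcpy(&wire, packet, sizeof(wire));
    return ntohl(wire);
}
```

Whatever the host format is, the packet bytes for 0x11223344 come out as 11 22 33 44, and the receiver recovers the original value.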
Saving Data
As game developers, the most common situation in which we have to deal with byte-endianness is saving and loading data across multiple platforms.
Whether it's because we're baking data on a little-endian PC and loading it on a big-endian console, or because we want save games to work across a variety of platforms, we need to be very careful about how we arrange those bytes.
We could take the same approach as network data and just pick one format and transform the data into that format before saving it. Then, if the target platform uses a different byte-endianness, we could swap the bytes around at load time.
That approach would work, but it would add an extra operation at load time that we could have done ahead of time. So we fold that operation into the data baking process.
When we create the memory image for the data we're baking, we need to compare the byte-endianness of the target platform and the building platform. If they're both the same, we don't need to do anything extra, and we continue baking as usual.
If they're different, we need to rearrange the bytes of every data type larger than one byte. Listing 2 shows a function that swaps the endianness of a piece of data for any data type.
Listing 2: Function for Swapping Endianness
template <typename T>
void SwapEndianness(T& out, const T& in)
{
    // Copy the bytes of 'in' into 'out' in reverse order.
    // 'out' and 'in' must be different objects.
    const unsigned char* src = reinterpret_cast<const unsigned char*>(&in);
    unsigned char* dst = reinterpret_cast<unsigned char*>(&out);
    for (int i = 0; i < (int)sizeof(T); ++i)
        dst[i] = src[sizeof(T) - 1 - i];
}
[Edit: Eric Bernard pointed out that a floating-point number returned by value after its bytes have been swapped can be loaded into a floating-point register, which can cause it to be renormalized, changing its value slightly (and sometimes not so slightly). To avoid that problem, it's important that the result of SwapEndianness is passed back through a reference (or pointer) and not returned by value.]
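The comparison between build and target byte order described above might be folded into the baking step along these lines. This is a sketch under assumptions, not the article's actual pipeline: Endian, HostEndianness, and AppendBaked are hypothetical names, and the memory image is assumed to be a plain byte vector. The bytes are reversed straight into the image, so a swapped floating-point value never exists as a float.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

enum class Endian { Little, Big }; // hypothetical type for this sketch

// Hypothetical helper: byte order of the machine doing the baking.
Endian HostEndianness()
{
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1 ? Endian::Little : Endian::Big;
}

// Hypothetical helper: append 'value' to the image in the target's byte order.
template <typename T>
void AppendBaked(std::vector<unsigned char>& image, const T& value, Endian target)
{
    const unsigned char* src = reinterpret_cast<const unsigned char*>(&value);
    if (target == HostEndianness())
    {
        // Same format on both sides: copy the bytes as they are.
        image.insert(image.end(), src, src + sizeof(T));
    }
    else
    {
        // Different format: emit the bytes in reverse order.
        for (std::size_t i = 0; i < sizeof(T); ++i)
            image.push_back(src[sizeof(T) - 1 - i]);
    }
}
```

Calling AppendBaked(image, value, Endian::Big) for every multi-byte field would then produce an image that a big-endian target can load without any work at load time.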
It's useful to note that data endianness is a completely orthogonal concept to the way the data is represented. Both a 32-bit integer and a 64-bit floating point number are going to be stored most-significant-byte first in a big-endian format.
This will make our job a lot easier when converting data for specific platforms because we can first convert the data to the correct representation, then convert it to the right data endianness, and finally apply any padding rules.
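The three steps can be sketched for a tiny record made of an 8-bit field followed by a 32-bit field. Everything here is an assumption made for illustration: the record layout, the 4-byte alignment rule for the target, and the helper names PadTo, AppendBytes, and BakeRecord. For brevity the sketch emits bytes with shifts rather than with a swap function, which gives the same image regardless of the format of the build machine.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<unsigned char>;

// Hypothetical helper: zero-pad until the image size is a multiple of 'alignment'.
void PadTo(Image& image, std::size_t alignment)
{
    while (image.size() % alignment != 0)
        image.push_back(0);
}

// Hypothetical helper: emit the low 'size' bytes of 'value' in the target order.
void AppendBytes(Image& image, std::uint64_t value, std::size_t size, bool targetIsBigEndian)
{
    for (std::size_t i = 0; i < size; ++i)
    {
        const std::size_t byteIndex = targetIsBigEndian ? size - 1 - i : i;
        image.push_back(static_cast<unsigned char>((value >> (8 * byteIndex)) & 0xFF));
    }
}

// Bakes { 8-bit kind; 32-bit count } for a target assumed to align
// 32-bit values on 4-byte boundaries.
Image BakeRecord(int kind, long count, bool targetIsBigEndian)
{
    Image image;
    // 1. Convert to the representation the target expects.
    const std::uint8_t kind8 = static_cast<std::uint8_t>(kind);
    const std::uint32_t count32 = static_cast<std::uint32_t>(count);
    // 2. Emit each field in the target's byte order.
    AppendBytes(image, kind8, 1, targetIsBigEndian);
    // 3. Apply the target's padding rule before the next field.
    PadTo(image, 4);
    AppendBytes(image, count32, 4, targetIsBigEndian);
    return image;
}
```

Keeping the steps separate means a change in one platform's padding rules or byte order touches only one helper.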
With these new tools in hand, we can now deal with different data sizes, padding, and byte-endianness and create perfect data memory images for just about any platform. Happy baking!
Resources
Msinttypes
http://code.google.com/p/msinttypes
C++ standard
Required reading for low-level C++ issues
www.open-std.org/jtc1/sc22/wg21/docs/projects