You can only deal with alignment in low-level languages such as assembly or C. Higher-level languages often don't expose exact memory locations or provide any means to change them, making all sorts of opportunities for optimization impossible.
In C/C++, finding the alignment of some data is very easy. Just examine the least significant bits of the memory address where the data is located.
Memory alignment restrictions are almost always based on powers of two, so to detect whether something is aligned on a particular power of two, we can "&" the memory address with a particular bit mask and check for any non-zero bits.
For example, the following code checks if some data is aligned on a 16-byte boundary: ((uintptr_t)&data & 0x0F) == 0. Notice the uintptr_t cast to be able to perform a bit operation on a memory address.
Casting the pointer to an int instead is almost guaranteed to cause problems in the future because it assumes that pointers and ints are the same size, which is not the case in most 64-bit architectures.
Generalizing that technique, Listing 1 (found in this ZIP file) shows a function that verifies the alignment on a particular power of two boundary. We'll be using a lot of bit tricks when dealing with memory addresses at this level, so make sure you're nice and comfy with all that bit twiddling.
Setting the alignment for a piece of data, unfortunately, is not nearly as easy. The C/C++ language falls short again, without a standard way of dealing with alignment.
Not only that, but we'll need to use different strategies to align data depending on how it is allocated: statically, on the heap, or on the stack. We're left at the mercy of compiler-specific extensions and our own home-brewed solutions.
Static variables are those not allocated on the heap or the stack. That includes global variables and class static variables.
The two most common compilers we have to deal with as game developers (Visual Studio and gcc) offer solutions to allow us to align data at a particular memory boundary. In both these cases, the alignment has to be a power of two.
Visual Studio provides the __declspec(align(#)) extension. Any variable or type tagged with that extension will be correctly allocated on the specified alignment. gcc also provides its own extension, but of course, with different syntax: __attribute__ ((aligned (#))).
We can create a macro that allows us to specify alignment attributes in a platform-independent way. Unfortunately, __declspec(align()) precedes the symbol it modifies, and __attribute__((aligned())) comes afterwards, making the macro slightly uglier because it needs to wrap the symbol (see Listing 2 in this ZIP file). With this macro we can now write DATA_ALIGN(int buffer, 128); in any platform.
Sometimes we want to align all instances of a particular type on a certain boundary.
For example, a 4x4 matrix class that we're going to manipulate with SSE instructions always needs to be aligned on a 16-byte boundary. In that case, we can apply the alignment attribute to the type, not just to specific instances.
__declspec(align(16)) struct Matrix4x4
Specifying the alignment of a member variable forces the compiler to align it relative to the start of the struct (or class) it belongs to.
Going back to our previous example, if we add a Matrix4x4 structure to a Waypoint structure, the compiler will pad the Waypoint data so that our matrix is a multiple of 16 bytes from the beginning of the structure (see Figure 1, right).
But what if the Waypoint structure is placed at a random memory location? Does that mean that all bets are off as far as the absolute alignment of the matrix variable?
Fortunately, the C standard comes to the rescue here because it states the following: the alignment for a struct has to be a multiple of the least common multiple of the alignments for all of its members.
That's quite a mouthful, so a simpler way to look at it is that, if alignments are always powers of two, a struct will be aligned the same way as the member with the largest alignment.
That is really good news because now the Waypoint struct is guaranteed to be aligned on a 16-byte boundary, which means that the matrix member variable will also be correctly aligned on a 16-byte boundary.