Data Alignment, Part 1
March 6, 2009 Page 3 of 3
You would hope that tagging a type with __declspec(align(#)) (or the gcc equivalent) would also align the object correctly when it's created on the heap. No such luck.
Alignment attributes are only used at compile-time, and the C runtime knows nothing about them when you request to allocate a particular type from the heap. Fortunately, we can take control of heap allocations and force the alignment ourselves.
In most cases, malloc will return a 4-byte aligned block of memory. We can create our own version of malloc, aligned_malloc, that allocates memory with any alignment by building on top of standard malloc and pad the front of the memory block until it's aligned correctly.
The only catch is that we need to remember to call our version of free, aligned_free, on that memory when we're done with it.
aligned_malloc will allocate a block of memory large enough, find the first address within that block that has the correct alignment, and return that address. aligned_free needs to call standard free with the originally allocated memory block.
So we need a way to go from our aligned pointer to the original pointer. To accomplish this, we'll store a pointer to the memory returned by malloc immediately preceding the start of our aligned memory block.
The implementation for aligned_malloc and aligned_free is shown in Listing 3 (found here). There are a couple of interesting implementation details:
- We don't know what alignment malloc will return, so we need to account for the worst case. That means that our block has to be large enough to hold the requested allocation size, one pointer, and up to the full alignment size (minus one).
- More bit fun: To find the next address with a particular power of two alignment, we can add the full alignment minus one, and then round it down to the nearest alignment (by masking off the bits corresponding to the power of two of the alignment).
Some platforms implement their own version of aligned_malloc. So if you don't need cross-platform compatibility, you should look at what your platform provides. For example, Microsoft has a function called _aligned_malloc, and other platforms have the semi-standard function memalign, both of which work in a very similar way to the version we just implemented.
Our aligned malloc and free are perfect for allocating plain memory blocks, but what about creating objects with a particular alignment? Next article, we'll cover that and we'll explore the mysteries of stack alignment.
Page 3 of 3