As a framework designer, you have a lot of things to worry about. Calling conventions, size compatibility, structure layout, etc. I’d like to briefly talk about another thing to worry about: memory management. I’m not just talking about “please don’t leak memory.” I’m talking about keeping a proper division of labor, and why it’s crucial for frameworks.
We’re all used to the idea of division of labor when it comes to memory management for our applications (though it’s called by many different names). The basic idea behind it is: whatever entity allocates the memory should be responsible for freeing the memory. Of course, there are times when we violate this rule by assigning ownership to other entities. But by and large, you try to avoid code that looks like this:
    #include <stdlib.h>

    void Two( void *mem );

    void One( void )
    {
        void *mem = ::malloc( 32 );
        Two( mem );
    }

    void Two( void *mem )
    {
        // Do something interesting with mem
        // Then free it
        ::free( mem );
    }
We avoid this pattern because it makes it very difficult to quickly understand who is responsible for what, and that can lead to memory leaks. However, when it comes to designing a framework, we have to be much more strict about adhering to this rule.
In frameworks, memory ownership has to be strictly controlled at the call boundaries between the library and the application. Whichever side allocates the memory must be the one to free it, or else you’ve got a nasty bug on your hands. This is because each side has its own memory allocator, and if the two don’t agree, problems happen. Imagine this scenario:
    // In the library code
    static void *LibraryAllocator( size_t size )
    {
        // sMyHeap is a heap handle created elsewhere in the library with
        // a call to ::HeapCreate (these are Win32 heap management APIs)
        return ::HeapAlloc( sMyHeap, 0, size );
    }

    char *LibraryCall( void )
    {
        char *mem = (char *)LibraryAllocator( 32 );
        ::strcpy( mem, "Hello world" );
        return mem;
    }

    // In the application code
    static void DoSomething( void )
    {
        char *mem = LibraryCall();
        ::printf( "%s\n", mem );
        ::free( mem );  // Wrong: mem came from sMyHeap, not from the CRT
    }
This is pretty much guaranteed to crash on you because the CRT’s free function hands the block back to the CRT’s own heap, while the block was actually allocated from the library’s private heap (sMyHeap) by HeapAlloc.
However, the problem can also be trickier to track down. Imagine the situation where the library calls malloc and the application calls free. If they’re both linked against the same CRT, life will be fine (for instance, when both modules use the same shared library version of the CRT). However, nothing obligates the application to use the same version of the CRT as the library. For instance, the library could be using the shared library version, and the application could use the static library version. Or both could use their own static library versions, etc.
If the CRTs are mismatched, versioning issues become possible. Imagine if CRT v2’s malloc put a magic value after allocations to catch heap corruption when calling free, but CRT v1 didn’t have that feature. If v2 allocates and v1 frees, there may not be an issue. But if v1 allocates and v2 frees, then v2 will think every allocation is corrupted because there’s no magic value present.
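To make the v1/v2 asymmetry concrete, here is a toy model of it. Nothing in it is real CRT code: V1Malloc, V1Free, V2Malloc, V2Free, the block layout and the magic value are all invented for illustration. One simplification is labeled in the comments: the toy v1 reserves zero-filled trailer bytes, which a real v1 would not do, purely so the toy v2 never reads outside the block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Toy model only: V1 and V2 are hypothetical stand-ins for two CRT versions.
static const std::uint32_t kMagic = 0xFDFDFDFDu;
static const std::size_t kHeaderSize = alignof( std::max_align_t );
static_assert( kHeaderSize >= sizeof( std::size_t ), "header must hold the size" );

// Shared block layout: [size header][payload][4 trailer bytes].
// A real v1 would not reserve the trailer at all; the toy zero-fills it so
// that V2Free never reads outside the block.
static unsigned char *RawAlloc( std::size_t size )
{
    unsigned char *raw = static_cast<unsigned char *>(
        std::calloc( 1, kHeaderSize + size + sizeof( kMagic ) ) );
    if ( raw != nullptr )
        std::memcpy( raw, &size, sizeof( size ) );
    return raw;
}

// v1: no magic value is written after the payload.
void *V1Malloc( std::size_t size )
{
    unsigned char *raw = RawAlloc( size );
    return raw != nullptr ? raw + kHeaderSize : nullptr;
}

// v1: frees without checking anything.
void V1Free( void *mem )
{
    if ( mem != nullptr )
        std::free( static_cast<unsigned char *>( mem ) - kHeaderSize );
}

// v2: writes the magic value right after the payload.
void *V2Malloc( std::size_t size )
{
    unsigned char *raw = RawAlloc( size );
    if ( raw == nullptr )
        return nullptr;
    std::memcpy( raw + kHeaderSize + size, &kMagic, sizeof( kMagic ) );
    return raw + kHeaderSize;
}

// v2: checks the magic value on free. Returns false when the check fails,
// which is where a real debug heap would report corruption.
bool V2Free( void *mem )
{
    if ( mem == nullptr )
        return true;
    unsigned char *raw = static_cast<unsigned char *>( mem ) - kHeaderSize;
    std::size_t size = 0;
    std::memcpy( &size, raw, sizeof( size ) );
    std::uint32_t trailer = 0;
    std::memcpy( &trailer, raw + kHeaderSize + size, sizeof( trailer ) );
    std::free( raw );
    return trailer == kMagic;
}
```

In this model, V1Free( V2Malloc( 32 ) ) goes unnoticed because v1 never looks at the trailer, while V2Free( V1Malloc( 32 ) ) returns false even though the caller never touched memory it didn’t own.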
The best practice for memory management in frameworks is to have the allocating side also be the deallocating side. So either have the caller pass in a memory buffer (and size!) that the library fills, or have the library provide a “free” function for the caller to use to free library-side allocations. Personally, I prefer the former whenever possible as it reduces the chance for problems. For instance, with a library-provided free function, there’s still the chance that the caller accidentally passes in memory it allocated itself, and you’re back to the same problem. Pushing memory management onto the caller does make their life slightly harder, since they have to manage the buffers themselves, but it also makes their life easier by removing the possibility of mismatch problems.
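Here is a minimal sketch of both options, reusing the “Hello world” example from earlier. The function names (LibraryGetGreeting, LibraryCreateGreeting, LibraryFreeGreeting) are made up for illustration, and std::malloc simply stands in for whatever allocator the library really uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// ---- In the library code (hypothetical names) ----

// Option 1: the caller owns the buffer and the library only fills it.
// Returns the number of bytes the full string needs, terminator included,
// so the caller can detect truncation and retry with a larger buffer.
std::size_t LibraryGetGreeting( char *buffer, std::size_t bufferSize )
{
    static const char kGreeting[] = "Hello world";
    if ( buffer != nullptr && bufferSize > 0 )
    {
        std::size_t toCopy = sizeof( kGreeting ) - 1;
        if ( toCopy > bufferSize - 1 )
            toCopy = bufferSize - 1;
        std::memcpy( buffer, kGreeting, toCopy );
        buffer[toCopy] = '\0';
    }
    return sizeof( kGreeting );
}

// Option 2: the library allocates and also exports the matching free.
char *LibraryCreateGreeting( void )
{
    static const char kGreeting[] = "Hello world";
    char *mem = static_cast<char *>( std::malloc( sizeof( kGreeting ) ) );
    if ( mem != nullptr )
        std::memcpy( mem, kGreeting, sizeof( kGreeting ) );
    return mem;
}

void LibraryFreeGreeting( char *mem )
{
    // Runs inside the library, so it uses the same allocator that
    // LibraryCreateGreeting used.
    std::free( mem );
}

// ---- In the application code ----
void DoSomething( void )
{
    // Option 1: the application never frees anything the library allocated.
    char local[32];
    std::size_t needed = LibraryGetGreeting( local, sizeof( local ) );
    assert( needed <= sizeof( local ) );
    std::printf( "%s\n", local );

    // Option 2: the application hands the block back to the library.
    char *mem = LibraryCreateGreeting();
    if ( mem != nullptr )
    {
        std::printf( "%s\n", mem );
        LibraryFreeGreeting( mem );
    }
}
```

With the first option there is no library-side allocation for the application to free incorrectly. With the second, everything depends on the application calling LibraryFreeGreeting rather than its own free.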