One of the lesser-known features of Visual Studio’s C/C++ compiler is the pair of pointer type attributes __ptr32 and __ptr64. More information about them can be found on MSDN. These pointer type attributes control the visible size and behavior of pointers in 32- and 64-bit applications. Their usage is a bit strange, but if you need to do interop between 32- and 64-bit mode, they can be handy features to have. Additionally, there are the __sptr and __uptr qualifiers, which allow you to specify how pointer values are extended when they are widened. __sptr denotes sign extension, and __uptr denotes zero extension. Information about these qualifiers can also be found on MSDN.
The behavior of these type attributes is strange, to say the least. I wanted to take a moment to document what I’ve observed the behavior to be, as of Visual Studio 2012.
For starters, the documentation is a bit misleading for __ptr32 and __ptr64. Specifically, the documentation states:
On a 32-bit system, a pointer declared with __ptr64 is truncated to a 32-bit pointer. On a 64-bit system, a pointer declared with __ptr32 is coerced to a 64-bit pointer.
However, this isn’t strictly accurate. When you use a type with a specific pointer size, the resulting pointer is sized accordingly, regardless of the architecture you are targeting. For instance, if you use __ptr64 on an x86 build, you will get a 64-bit pointer value. If you pass it to functions, use it in structures, call sizeof on it, etc., it will occupy 8 bytes of space, even though you are on a 32-bit system. However, when you dereference the pointer, only the low 32 bits of the pointer are used. For instance:
    #include <stdlib.h>

    void func( int * __ptr64 p ) {
      *p = 12;
    }

    int main( void ) {
      int * __ptr64 p = (int * __ptr64)::malloc( sizeof( int * __ptr64 ) );
      func( p );
      return 0;
    }
If you compile this code and take a look at the resulting disassembly, you will notice some interesting behaviors:
    mov esi, esp
    ; sizeof( int * __ptr64 ) yields 8
    push 8
    ; call malloc
    call DWORD PTR __imp__malloc
    add esp, 4
    cmp esi, esp
    call __RTC_CheckEsp
    ; convert the 32-bit pointer to a 64-bit value
    cdq
    ; store both parts of the 64-bit value into a 64-bit variable
    mov DWORD PTR _p$[ebp], eax
    mov DWORD PTR _p$[ebp+4], edx
    ; push both halves of the pointer onto the stack
    mov eax, DWORD PTR _p$[ebp+4]
    push eax
    mov ecx, DWORD PTR _p$[ebp]
    push ecx
    ; call func and then cleanup the 8 bytes on the stack when done
    call ?func@@YAXPEAH@Z
    add esp, 8

    ; this is the implementation of func itself
    ; notice that it only uses the low 32-bits of the pointer
    ; when dereferencing and assigning in the value 12
    mov eax, DWORD PTR _p$[ebp]
    mov DWORD PTR [eax], 12
So while you are asking for a 64-bit pointer, and actually receive a 64-bit value in response, you can really only use 32 bits of it, which is what the MSDN documentation is alluding to.
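To make the "64 bits of storage, 32 bits of use" distinction concrete, here is a small sketch in portable C++ that models what the disassembly above does. It does not use __ptr64 at all; the names WidePointer, widen_like_cdq, and address_used_for_dereference are hypothetical and exist only for illustration.

```cpp
#include <cstdint>

// Hypothetical model of how the x86 build stores an `int * __ptr64`:
// two 32-bit halves, matching the _p$[ebp] and _p$[ebp+4] stores above.
struct WidePointer {
    std::uint32_t low;   // the half stored at _p$[ebp]
    std::uint32_t high;  // the half stored at _p$[ebp+4]
};

// Models `cdq`: EDX is filled with copies of the sign bit of EAX.
constexpr WidePointer widen_like_cdq(std::uint32_t eax) {
    return WidePointer{eax, (eax & 0x80000000u) ? 0xFFFFFFFFu : 0u};
}

// Models the dereference inside func: only the low half is loaded
// and used as the address, so the high half is ignored.
constexpr std::uint32_t address_used_for_dereference(WidePointer p) {
    return p.low;
}

// The model occupies 8 bytes, like sizeof( int * __ptr64 ) in the example.
static_assert(sizeof(WidePointer) == 8, "two 32-bit halves");
```

In this model, two WidePointer values that differ only in their high half refer to the same address once dereferenced, which is the behavior the disassembly shows.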
Considering that __ptr32 and __ptr64 yield types of different sizes, you might expect the types to differ such that you can overload on them. For instance, you can overload based on long and long long types. Additionally, if you view them as qualifiers (like const or volatile), you can also overload. So what about __ptr32 and __ptr64? It turns out that the compiler treats these qualifiers as sugar instead of as part of the canonical type, so you cannot overload based on them.
    void f( int * __ptr64 i ) {}
    void f( int * __ptr32 i ) {}
    void f( int * i ) {}
This will yield two errors stating:
function 'void f(int * __ptr64)' already has a body
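For comparison, this is roughly what the overloads mentioned earlier look like in standard C++, with no MSVC extensions involved. The function names which and which_ptr are hypothetical. Note that the const case only works because the qualifier is on the pointed-to type; a qualifier on the pointer itself would not create a distinct overload.

```cpp
// Distinct integer types: these are two different overloads.
inline int which(long)      { return 1; }
inline int which(long long) { return 2; }

// Qualification of the pointed-to type is part of the parameter type,
// so these are also two different overloads.
inline int which_ptr(int *)       { return 1; }
inline int which_ptr(const int *) { return 2; }

// By contrast, a top-level qualifier on the pointer itself is ignored
// when forming the function type, so both lines declare the same function.
void same_function(int *p);
void same_function(int *const p);
```

Calling which(0L) selects the first overload and which(0LL) the second; likewise, passing the address of a const int to which_ptr selects the const int * overload.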
Okay, so the types aren’t different canonically as far as function overloading is concerned. Yet they are considered different types as far as conversions go:
    int * __ptr64 p1 = 0;
    int * __ptr32 p2 = p1;
Compiling this code yields the warning:
warning C4244: 'initializing' : conversion from 'int * __ptr64' to 'int * __ptr32__ptr32 ', possible loss of data
As if this isn’t confusing enough, the name mangler implies the types are different!
    void f( int * __ptr64 i ) {}
    void f2( int * __ptr32 i ) {}

    PUBLIC ?f@@YAXPEAH@Z ; f
    PUBLIC ?f2@@YAXPAH@Z ; f2
So as far as the name mangler is concerned, they are different types!
__sptr and __uptr fit into the equation as pointer type qualifiers that affect how pointers are extended. If you specify __sptr, the pointer will be sign extended. If you specify __uptr, it will be zero extended. Modifying our x86 example slightly:
    void func( int * __ptr64 p ) {
      *p = 12;
    }

    int main( void ) {
      int * __ptr32 __uptr p = (int * __ptr32 __uptr)::malloc( sizeof( int * __ptr32 __uptr ) );
      func( p );
      return 0;
    }
We’ve added the __uptr qualifier to the variable p in main. This means that when p is converted to a 64-bit pointer as an argument to func, it will be zero extended instead of sign extended. Looking at the disassembly shows this to be true:
    ; Allocate a 32-bit pointer by calling malloc
    push 4
    call DWORD PTR __imp__malloc
    add esp, 4
    ; Store the 32-bit value into p
    mov DWORD PTR _p$[ebp], eax
    ; Convert p into a 64-bit pointer by zero extending it
    ; when passing it to func
    mov eax, DWORD PTR _p$[ebp]
    xor ecx, ecx
    push ecx
    push eax
    ; Call func, and then clean up the 8 bytes on the stack
    call ?func@@YAXPEAH@Z ; func
    add esp, 8
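If the difference between the two kinds of extension isn't obvious from the assembly, here is a minimal sketch of the arithmetic in portable C++. It works on plain integers rather than on __sptr or __uptr pointers, and the function names zero_extend and sign_extend are hypothetical.

```cpp
#include <cstdint>

// Zero extension: the upper 32 bits become 0, which is what the
// `xor ecx, ecx` / `push ecx` pair produces in the listing above.
constexpr std::uint64_t zero_extend(std::uint32_t value) {
    return static_cast<std::uint64_t>(value);
}

// Sign extension: the upper 32 bits are copies of bit 31 of the value.
constexpr std::uint64_t sign_extend(std::uint32_t value) {
    return (value & 0x80000000u)
               ? (0xFFFFFFFF00000000ull | value)
               : static_cast<std::uint64_t>(value);
}
```

The two only differ when bit 31 is set: for 0xDEADBEEF, zero_extend gives 0x00000000DEADBEEF and sign_extend gives 0xFFFFFFFFDEADBEEF, while for a value such as 0x00401000 both give the same result.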
The semantics for __sptr and __uptr are a bit more consistent than __ptr32 and __ptr64 as they don’t affect warning semantics, overload semantics, or the name mangler. However, that leaves them with their own peculiar quirks.
    int * __sptr p1 = 0;
    int * __uptr p2 = p1;
This code will compile without warning, because as far as the compiler is concerned, __sptr and __uptr only affect code generation, and not semantics. Additionally, template specialization treats them as equivalent, so when a template is instantiated, it gets whatever pointer qualifiers are on the first instantiation. For instance:
    #include <stdio.h>

    template<typename T>
    void f(void **p, T *q) {
      *p = *q;
    }

    void *g(int *__ptr32 __sptr a) {
      void *result;
      f(&result, &a);
      return result;
    }

    void *h(int *__ptr32 __uptr a) {
      void *result;
      f(&result, &a);
      return result;
    }

    void *i(char * __ptr32 __uptr a ) {
      void *result;
      f(&result, &a);
      return result;
    }

    int main() {
      printf("%p\n", g((int *__ptr32 __sptr)0xdeadbeef));
      printf("%p\n", h((int *__ptr32 __uptr)0xdeadbeef));
      printf("%p\n", i((char *__ptr32 __uptr)0xdeadbeef));
    }
If you run this code, you will see the first two pointers being sign extended, and the last one being zero extended. This happens because the call in g instantiates the template with the __sptr qualifier attached, the call in h reuses that instantiation instead of generating a new one, and the call in i, which uses a different pointee type, does get an instantiation of its own.
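The mechanism is easier to see with something that standard C++ also treats as sugar: type aliases. The sketch below is only an analogy and does not reproduce the MSVC behavior; FirstSpelling, SecondSpelling, call_count, and record are hypothetical names. Two spellings of the same canonical type share one instantiation, while a different canonical type gets its own.

```cpp
#include <type_traits>

// Two aliases that are different spellings of the same canonical type.
using FirstSpelling  = int *;
using SecondSpelling = int *;

static_assert(std::is_same<FirstSpelling, SecondSpelling>::value,
              "aliases do not create new types");

// One static counter exists per distinct instantiation of this template.
template <typename T>
int &call_count() {
    static int count = 0;
    return count;
}

// Records a call against the instantiation selected by the argument type.
template <typename T>
void record(T) {
    ++call_count<T>();
}
```

Calling record with a FirstSpelling and then with a SecondSpelling bumps the same counter twice, because both calls resolve to record<int *>. Calling it with a char * bumps a separate counter. In the MSVC example above, the __sptr and __uptr spellings play the role of the two aliases, except that there the first spelling also decides the code that gets generated.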
Consequently, you can’t really trust the MSDN documentation on __sptr and __uptr when it comes to what’s allowed and what isn’t allowed. I’ve already filed these Connect reports on the topic:
Apply __sptr and __uptr to pointer to members
Specify __sptr and __uptr on the same type
Allowed to specify __sptr or __uptr on a non-pointer type
All this being said, __ptr32, __ptr64, __sptr and __uptr aren’t things you are likely to ever use unless you happen to be doing some highly specific work. But hopefully this sheds a bit of light on the weird behavior of these quirky features!