Variable argument lists are very arcane in the world of C. You’ll see them expressed in function signatures as … at the end of the parameter list, but you may not understand how they work or what they do.
There are two sides to every function at runtime: the caller and the callee, and they both need to agree on the semantics of how the function call works (how do parameters get passed, return values returned, etc). The caller’s side is pretty easy with variable arguments because the compiler will continue to pass parameters like it normally would, right up until the terminating parenthesis. For the caller, it’s no different than any other function call except for the fact that the compiler will allow any number of arguments (including zero) after the initial parameters. They all get pushed onto the stack (depending on architecture and calling conventions) in the proper order, and the callee is responsible for figuring them out.
The callee side, on the other hand, is quite interesting. Let’s take a look at a typical function:
void foo( int bar, ... )
{
    va_list args;
    va_start( args, bar );
    for (;;) {
        some_type t = va_arg( args, some_type );
        /* Do something with t */
    }
    va_end( args );
}
On the callee side, you must use the “va” macros (va_start, va_arg and va_end, declared in <stdarg.h>) to query the parameter list. The first thing you’ll notice is that the only way to call va_start is by passing in the last named parameter (the one preceding the …). This is because variable argument lists only work if the callee knows where the function’s parameters live, and it can only know that if there is at least one formal parameter. Because va_start needs the exact location of that parameter, it is a macro instead of a function; this lets it get at the address of the parameter in the current function’s stack frame. Once you have the start of the va_list, you get each individual argument with the va_arg macro. This macro takes the list as well as a type, yields the actual argument value, and advances the list to the next argument. va_end is called when you are finished, to clean up. On a simple stack-based calling convention there’s no compiler magic required for this; everything can be done with vanilla C functionality.
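To make the three macros concrete, here is a minimal sketch of a complete variadic function. The name sum_ints and the convention that the first parameter carries the argument count are assumptions made for illustration; they are not part of the standard library.

```c
#include <stdarg.h>

/* Hypothetical helper: adds up the `count` int arguments that follow `count`. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);            /* begin just past the last named parameter */
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);   /* read one int and advance the list */
    }
    va_end(args);                     /* clean up before returning */

    return total;
}
```

With this sketch, sum_ints(3, 10, 20, 12) returns 42 and sum_ints(0) returns 0. The callee trusts count completely, so passing a count larger than the number of real arguments is undefined behavior.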
This has some interesting ramifications. For instance, you must know where to stop looking for arguments, or else you will walk off the end of the argument list. Also, on x86 compilers that offer several calling conventions, a function taking a variable argument list must use a caller-cleans-up convention such as __cdecl, because the caller is the only one who knows how to properly clean up the call stack (the callee doesn’t know how many parameters were passed to it, remember?). Finally, in order to walk the list, you must know exactly what types are expected and in what order. If you don’t, then when va_arg attempts to read a parameter, it may give you garbage by reading too much or too little memory. And when it advances to the next argument, it may land in the wrong place because the previous argument’s size information was wrong.
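One way to give the callee the type and order information is a small descriptor string, in the spirit of a printf format. The sketch below is only an illustration: the name sum_mixed and the letters 'i' (int) and 'd' (double) are made up for this example.

```c
#include <stdarg.h>

/* Hypothetical helper: `types` holds one letter per extra argument,
   'i' for int and 'd' for double. Returns the sum as a double. */
double sum_mixed(const char *types, ...)
{
    va_list args;
    double total = 0.0;

    va_start(args, types);
    for (const char *p = types; *p != '\0'; p++) {
        if (*p == 'i') {
            total += va_arg(args, int);     /* caller promised an int here */
        } else if (*p == 'd') {
            total += va_arg(args, double);  /* caller promised a double here */
        } else {
            break;  /* unknown letter: stop rather than guess a type */
        }
    }
    va_end(args);

    return total;
}
```

A call such as sum_mixed("id", 2, 0.5) returns 2.5. If the descriptor and the actual arguments disagree, for example sum_mixed("d", 2), the behavior is undefined, which is exactly the hazard described above.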
To better understand what I am talking about, this is a hint as to how these macros might be implemented on x86:
typedef unsigned char *va_list;
#define va_start(list, param) (list = (((va_list)&param) + sizeof(param)))
#define va_arg(list, type) (*(type *)((list += sizeof(type)) - sizeof(type)))
They may look dense, but they’re rather simplistic in their implementation. The va_list is nothing more than a byte pointer. va_start assigns the address of the first parameter *after* the one passed in. va_arg advances the byte pointer by the parameter size and stores it, then backs up a bit to grab the actual value.
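The pointer arithmetic can be tried out safely without touching a real call stack. The sketch below is a simulation under stated assumptions: an ordinary int array stands in for the argument area, and the names toy_list, TOY_ARG and toy_walk_sum are hypothetical. Real ABIs add alignment, promotion and register rules that this toy ignores.

```c
#include <stddef.h>

/* Hypothetical stand-in for va_list: a plain byte pointer. */
typedef unsigned char *toy_list;

/* Same shape as the expository va_arg: advance the cursor by sizeof(type),
   then step back by the same amount to read the value it just skipped. */
#define TOY_ARG(list, type) (*(type *)(((list) += sizeof(type)) - sizeof(type)))

/* Walks a pretend argument area holding three ints and returns their sum. */
int toy_walk_sum(void)
{
    int packed[3] = { 23, 10, 9 };       /* pretend these were pushed by a caller */
    toy_list cursor = (toy_list)packed;  /* points at the first "argument" */
    int total = 0;

    for (size_t i = 0; i < 3; i++) {
        total += TOY_ARG(cursor, int);   /* yields packed[i], moves to packed[i + 1] */
    }
    return total;
}
```

Here toy_walk_sum() returns 42. The add-then-subtract in TOY_ARG is not a no-op: the += permanently moves the cursor to the next slot, while the subtraction only affects the temporary address used for the read.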
So variable argument lists aren’t as powerful or easy to use as they are in other languages like C# or Python. They basically are at the mercy of the programmer. But now you probably understand why you have to put all the type information into a printf format specification, or why you need to specify sentinel values to terminate lists.
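As a sketch of the sentinel approach, the function below walks a list of strings until it meets a null pointer. The name total_length and the choice of a null pointer as the terminator are assumptions for this example.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: sums the lengths of the strings passed in.
   The caller must end the list with (const char *)NULL. */
size_t total_length(const char *first, ...)
{
    va_list args;
    size_t total = 0;

    va_start(args, first);
    /* The sentinel, not a count, tells the loop where the list ends. */
    for (const char *s = first; s != NULL; s = va_arg(args, const char *)) {
        total += strlen(s);
    }
    va_end(args);

    return total;
}
```

For instance, total_length("ab", "cde", (const char *)NULL) returns 5. The cast on the sentinel matters because a bare NULL may be passed as an int on some implementations, which would not match the const char * that va_arg reads.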
Thanks for this Aaron. This is one of those things that always tickled the back of my brain, and now I have a much better understanding of it.
:D
Nice article, Thanks :)
Thank you. I’ve always wondered how the variable notation … works. I never knew it was a macro!
This won’t work because you’re incrementing the pointer (char*) by the size before typecasting. Say the list has address 3000 and the args are “Hello, world”, 23, 10.44, ‘A’. The list is a “char*”, and if you increment it by sizeof(int) then it will be
list+=4 ==> list = list+4 ==> 3000+4 = 3004
Is that the address of the next variable? Again, “Hello, world” is stored outside the program and will not be on the stack frame, so how would it get the address of the next argument, 23?
Also, why did you add “sizeof(type)” and then subtract the same amount from “list”? It has no effect. Also, there is one extra “(” before “(*(type *)((list += sizeof( type )) - sizeof( type ))” (the first one).
@Vikas: You are correct, there are some typos in there. For starters, va_list should really be an unsigned char pointer (to avoid possible undefined behavior, depending on the implementation’s representation of char). The typecast is in the wrong location for va_start. And va_arg is definitely missing a right paren. That’s a whole lot of typos, which I’ve corrected; thank you for pointing them out!
As for va_arg, that implementation is correct (sans typo). The list variable points to the current item to obtain (so that it can compare equal to va_end eventually), which means va_arg needs to advance list. So the list variable needs to be updated to the next argument, which is why += sizeof(type), but the value of the current argument needs to be taken, which is why we subtract sizeof(type) before dereferencing.
But keep in mind that this is all totally expository. Don’t do this in real code. Ever. The standard library usually has a far more efficient implementation for a given architecture.
For CPUs on which arguments are passed via CPU’s registers to the callee, like PowerPC, MIPS, how does va_list work?
@xiaokaoy: it depends on the architecture’s ABI. From a quick look at how we do it in LLVM, register-based architectures either use the underlying stack to pass the arguments (like PPC32), or pass the arguments in registers but use the stack to implement va_start and va_arg as compiler intrinsics.
More on MIPS at http://math-atlas.sourceforge.net/devel/assembly/mipsabi32.pdf (See 3-46)
More on PPC at http://math-atlas.sourceforge.net/devel/assembly/elfspec_ppc.pdf (See 6-6)
Thank you very much for this article. I have a question:
I didn’t understand exactly why it is better to implement va_start and the other “va”s as macros instead of functions.
@melika: the quick and dirty answer is that the C standard requires them to be implemented as macros. See section 7.16 of the C11 standard (<stdarg.h>). The slightly more involved answer is that they cannot be implemented any other way. va_start has to be a macro because you need to pass in the parameter just before the ellipsis. If va_start were a function instead of a macro, then that parameter’s value would be *copied* into the va_start function call instead of being used directly by the macro. That parameter can only make sense if it is used directly because it specifies “this is where the extra arguments start”, so its location in memory is what is important. va_arg is similarly constrained in that you specify a type as the second argument, and there’s no way to do that in a function call expression.
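The copying point can be checked directly with a tiny sketch. The function name param_shares_storage is hypothetical; it only compares the address of its own parameter with the address of the caller’s variable.

```c
/* Hypothetical check: does the callee's parameter live in the caller's storage?
   `copy` receives the value of the caller's variable, and `original` points
   at that variable itself. */
int param_shares_storage(int copy, const int *original)
{
    /* `copy` is a separate object, so its address never equals `original`. */
    return &copy == original;
}
```

Given int x = 5;, the call param_shares_storage(x, &x) returns 0. A function only ever sees a copy at a different address, whereas a macro expands in place and can take the address of the caller’s own parameter.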
The hints given for the implementation on x86 are very helpful in understanding the real thing !
Thank You so much Aaron :)
Hi Aaron,
Maybe you can help with the following.
I am using my own bare-metal code with a Cortex-A9.
Recently I started using the same package with a Cortex-A8.
But I noticed that calling a function with a variable number of arguments fails somewhere inside (not sure where yet, as there is no printing from the OS).
Int32 BSP_DbgPrintf(CONST CHAR *fmt, ...)
{
    int int_ret_val;
    Int32 Channel = BSP_DEBUG;
    uint32_t ret_val = 0;
    va_list args;

    va_start(args, fmt);
    int_ret_val = vsprintf(print_buf, fmt, args);
    if (int_ret_val >= 0)
    {
        BSP_Print(Channel, print_buf);
        ret_val = 0;
    } else
    {
        ret_val = -1;
    }
    va_end(args);
    return ret_val;
}
The cross compiler is:
arm-eabi-gcc.exe (GCC) 4.8.2
I would like to ask if you have some idea what can cause this function not to work properly ? Is it related to startup code (configuring stack pointer) ?
Thanks,
Ranchu