Returning Stack-Based Values

QString GetSomeString( void )
{
	return QString( "foobar" );
}

The code looks innocuous enough, but something as simple as this can be the source of hard to track down bugs. I want to talk a bit about the dangers of returning stack-based values.

The above example is a distilled instance of a bug that I found while doing a code review on some source for my day job. While there’s nothing wrong with the code above, the signature of the method makes it trivial for people to abuse. Eventually, that code was being used like this:

// void SomeFunction( const char *str );
const char *cstr = GetSomeString().toStdString().c_str();
SomeFunction( cstr );

Can you spot the bug? The problem is one of scope lifetime. GetSomeString returns a stack-based temporary rvalue, and then toStdString is called on that value. This, in turn, creates a stack-based temporary rvalue and then c_str is called on that value. This gets a pointer to the data held within the std::string — except that pointer is only valid for the lifetime of the temporary rvalue! So once the call to c_str returns and the expression finishes executing, the std::string::~string destructor is called, as is the QString::~QString destructor. So cstr is pointing to free’d memory, and then passed into SomeFunction.

When designing method signatures, you should keep in mind all of the ways the function may be used. This sort of anticipation can be difficult at first, but I’ve found that doing a bit of test-driven development allows you to experiment with all the crazy ways people might call your function. In this case, it would help to notice that users can call GetSomeString and apply methods directly on the return value. This brings up an issue with temporary data lifetimes. The same is true of the QString::toStdString method, of course.

The way I would solve this problem is by forcing the user to “own” the memory in a more explicit way, instead of relying on temporary rvalues. Pass by reference semantics allows you to have output parameters to accomplish this.

void GetSomeString( QString& outStr )
{
	outStr = QString( "Foobar" );
}

This forces the caller to become:

QString q;
GetSomeString( q );
const char *cstr = q.toStdString().c_str();

Unfortunately, it doesn’t solve the problem with the QString method signature! There’s really nothing you can do in this case, because that’s Qt’s issue.

tl;dr: be careful about stack-based value lifetime. If there’s heap memory being allocated and exposed, you’re engaging in risky behavior.

This entry was posted in Framework Design and tagged , . Bookmark the permalink.

4 Responses to Returning Stack-Based Values

  1. Hello Aaron,

    Thanks for your articles: I enjoy them a lot!

    I have a question of your most recent article about returning stack-based values. In it you argued that the Qt library was more-or-less the source of the error. This made me wonder: does the STL not have this problem? Because returning a std::string is IMHO very common….

    Thank again for your work, Richel Bilderbeek

  2. Aaron Ballman says:

    I’m glad to hear you like the blog postings! :-)

    I don’t believe the STL has this issue per-se, because very few things return a stack-based class item (I can’t think of any, actually). Everything generally returns a reference to an item stored at the class level itself. So, for instance, a std::vector< std::string > has defined operator[], which certainly *could* suffer from this issue. But if you look at the declaration for the operator, you’ll see that it returns a reference to the string (a std::string& or a const std::string&), which means there’s a clear ownership of the data (the vector still owns it).

    So, in this case, when you use myVector[ 0 ].c_str(), the vector retains ownership of the std::string, which means that the underlying const char * from c_str will remain valid for at the lifetime of the vector (which, likely isn’t going to be temporary).

    Does this make sense?

    You *could* contrive a case where this falls apart with the new C++0x initializer lists, where you gin up a temporary std::vector and use operator[] all in the same expression… but that’s not really a framework design problem since the goal would be to write the worst code possible in that case. ;-)

  3. hlide says:

    Are you sure about that ?

    // void SomeFunction( const char *str );
    const char *cstr = GetSomeString().toStdString().c_str();
    SomeFunction( cstr );

    while I want to agree with your warning about such usage and discourage that usage strongly, I noticed (at least with vs2008) that temporary values constructed by a function were destructed at the end of a scope “{}” making this example still working. I don’t know whether this behavior is a C++ rule or specific to vs2008.

  4. Aaron Ballman says:

    hlide: yes, I am sure about this. Here is a simple example that demonstrates the basic underlying problem:

    #include

    class Temp {
    public:
    ~Temp() { std::cout << "Temp destroyed" << std::endl; } void *dangle() const { return nullptr; } }; class Outer { public: Temp get() const { return Temp(); } }; Outer get_outer() { return Outer(); } int main() { std::cout << "1" << std::endl; void *dangling = get_outer().get().dangle(); std::cout << "2" << std::endl; return 0; }

    In MSVC 2013, this prints:

    1
    Temp destroyed
    Temp destroyed
    2
    

    With gcc 4.8.1, this prints:

    1
    Temp destroyed
    2
    

    (I believe the difference is because gcc implicitly generates move constructors and assignment operators, whereas MSVC does not. It's immaterial to the underlying problem though.)

    Temporaries are destroyed at the end of the full expression, as per [class.temporary]p3. Specifically,

    Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created.

Leave a Reply

Your email address will not be published. Required fields are marked *