The Subtle Dangers of Temporaries in for Loops
Convenient as they are for writing concise code, temporaries are an endless source of bugs in C++.
Are we allowed to use a temporary in a range-based for loop? Consider the following code:
    #include <iostream>
    #include <vector>

    std::vector<int> create_range() { return {1, 2, 3, 4, 5}; }

    int main() {
        for (auto const& value : create_range()) {
            std::cout << value << ' ';
        }
    }
Is the temporary object returned by create_range kept alive during the for loop?
The answer is yes, and the code above prints this:
1 2 3 4 5
But if we do anything more with the temporary, even something as simple as returning a reference to it:
    std::vector<int> create_range() { return {1, 2, 3, 4, 5}; }

    std::vector<int> const& f(std::vector<int> const& v) { return v; }

    int main() {
        for (auto const& value : f(create_range())) {
            std::cout << value << ' ';
        }
    }
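Then the code has undefined behaviour (at least under the rules that applied before C++23, which extended the lifetime of temporaries created in the range expression). On a certain implementation, the output is this: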
0 0 3 4 5
This is surprising. Temporaries are usually destroyed at the end of the statement that creates them, so how we transform them within that statement should not influence the moment they’re destroyed.
To understand what we can and cannot do with temporaries in for loops in C++, and how to fix the last case, let’s look at what’s going on in both of those pieces of code.
The code of a range-based for loop
When we write the nice looking range-based for loop, the compiler expands it into several lines of less nice looking code.
For example, the following loop:
    for (auto const& value : myRange) {
        // code using value
    }
…gets expanded into this:
    {
        auto&& range = myRange;
        auto begin = begin(range);
        auto end = end(range);
        for ( ; begin != end; ++begin) {
            auto const& value = *begin;
            // code using value
        }
    }
For all the details about this expansion, check out the [stmt.ranged] section in the C++ standard (which you can download on this page).
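To make the expansion concrete, here is a minimal sketch of a loop written both ways. The expansion shown above is schematic: taken literally, a variable named begin would hide the function begin. The sketch therefore assumes the illustrative names range_, begin_ and end_, and uses std::begin and std::end. The functions join_expanded, join_range_for and check_expansion are hypothetical helpers, not part of the original code.

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hand-written equivalent of:
//     for (auto const& value : myRange) { out += ...; }
// range_, begin_ and end_ are illustrative names chosen so the code compiles.
std::string join_expanded(std::vector<int> const& myRange) {
    std::string out;
    {
        auto&& range_ = myRange;            // the range expression is evaluated once
        auto begin_ = std::begin(range_);
        auto end_ = std::end(range_);
        for (; begin_ != end_; ++begin_) {
            auto const& value = *begin_;
            out += std::to_string(value) + ' ';
        }
    }
    return out;
}

// The same traversal written with the range-based syntax, for comparison.
std::string join_range_for(std::vector<int> const& myRange) {
    std::string out;
    for (auto const& value : myRange) {
        out += std::to_string(value) + ' ';
    }
    return out;
}

// Usage: both forms visit the same elements in the same order.
void check_expansion() {
    std::vector<int> const v{1, 2, 3, 4, 5};
    assert(join_expanded(v) == "1 2 3 4 5 ");
    assert(join_expanded(v) == join_range_for(v));
}
```

Calling check_expansion() from main is enough to try it out. The point to retain is that the range expression is evaluated exactly once, in the initialisation of the auto&& variable.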
Let’s now understand how this code supports temporary objects.
Using temporary objects
Let’s go back to our initial example using temporaries:
    std::vector<int> create_range() { return {1, 2, 3, 4, 5}; }

    int main() {
        for (auto const& value : create_range()) {
            std::cout << value << ' ';
        }
    }
Here is what the expanded for loop looks like in this case:
    {
        auto&& range = create_range();
        auto begin = begin(range);
        auto end = end(range);
        for ( ; begin != end; ++begin) {
            auto const& value = *begin;
            // code using value
        }
    }
As we can see, the temporary is not created on the line of the for, contrary to what the syntax of the range-based for loop suggests. This already hints that the mechanisms handling temporaries in for loops are more complex than meets the eye.
How can the above code work? What prevents the temporary from being destroyed at the end of the statement that creates it, namely the declaration of range in the above code?
This is one of the properties of auto&&. Like const&, a reference declared with auto&& and bound directly to a temporary keeps that temporary alive until the reference itself goes out of scope. This is why the temporary object returned by create_range() is still alive and valid when execution reaches the statements using its values inside the for loop.
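The lifetime extension can be observed with a small experiment. The sketch below assumes a hypothetical Tracer type that writes to a log when it is constructed and destroyed. It also uses the C++17 attribute [[maybe_unused]] to silence warnings about the otherwise unused references. None of these names come from the original code.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: records its construction and destruction in a log.
struct Tracer {
    explicit Tracer(std::string& log) : log_(&log) { *log_ += "ctor "; }
    ~Tracer() { *log_ += "dtor "; }
    std::string* log_;
};

// Binds a temporary directly to auto&& and records when it is destroyed.
std::string log_with_auto_ref() {
    std::string log;
    {
        [[maybe_unused]] auto&& ref = Tracer{log};  // lifetime extended by ref
        log += "body ";                             // the temporary is still alive
    }                                               // destroyed here, along with ref
    log += "after";
    return log;
}

// Same experiment with a const&.
std::string log_with_const_ref() {
    std::string log;
    {
        [[maybe_unused]] Tracer const& ref = Tracer{log};
        log += "body ";
    }
    log += "after";
    return log;
}

// Usage: in both cases the destructor runs after the body of the scope.
void check_lifetime_extension() {
    assert(log_with_auto_ref() == "ctor body dtor after");
    assert(log_with_const_ref() == "ctor body dtor after");
}
```

In both functions "dtor" is logged after "body", which shows that the temporary lives as long as the reference bound to it.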
Transformations of temporary objects
Now let’s go back to the initial example that had undefined behaviour:
    std::vector<int> create_range() { return {1, 2, 3, 4, 5}; }

    std::vector<int> const& f(std::vector<int> const& v) { return v; }

    int main() {
        for (auto const& value : f(create_range())) {
            std::cout << value << ' ';
        }
    }
Let’s expand the loop again:
    {
        auto&& range = f(create_range());
        auto begin = begin(range);
        auto end = end(range);
        for ( ; begin != end; ++begin) {
            auto const& value = *begin;
            // code using value
        }
    }
Can you see what’s wrong with this code now?
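Unlike in the previous case, auto&& doesn’t bind to the expression create_range(). It binds to the reference to that object returned by f. And that is not enough to keep the temporary object alive: the temporary is bound to the parameter v of f, so it is destroyed at the end of the declaration of range, before the loop starts.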
It is interesting to note that range is declared with an auto&& binding to a const&, which is defined (in the implementation of f) to be equal to a const& on the temporary. So we have a chain of auto&& and const& which, individually, can keep a temporary alive. But unless one of them binds directly to the temporary in a simple expression, they do not keep it alive.
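The same kind of experiment shows the temporary dying early when it goes through a function. This is a minimal sketch that again assumes a hypothetical Tracer type and a hypothetical pass_through function with the same shape as f. The dangling reference is deliberately never read, so the sketch itself stays well defined. The [[maybe_unused]] attribute assumes C++17.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: records its construction and destruction in a log.
struct Tracer {
    explicit Tracer(std::string& log) : log_(&log) { *log_ += "ctor "; }
    ~Tracer() { *log_ += "dtor "; }
    std::string* log_;
};

// Same shape as f: takes a reference and returns it.
Tracer const& pass_through(Tracer const& t) { return t; }

std::string log_through_function() {
    std::string log;
    {
        // The temporary binds to the parameter t, not to ref, so it is
        // destroyed at the end of this declaration.
        [[maybe_unused]] auto&& ref = pass_through(Tracer{log});
        log += "body ";   // ref dangles here and must not be used
    }
    log += "after";
    return log;
}

// Usage: the destructor now runs before the body of the scope.
void check_no_extension() {
    assert(log_through_function() == "ctor dtor body after");
}
```

This time "dtor" is logged before "body": by the time a loop body would run, the object that range refers to is already gone.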
How to fix the code
If you have to use f to make a transformation on your temporary, then you can store the result of this transformation in a separate object, defined on a separate line:
    auto transformedRange = f(create_range());
    for (auto const& value : transformedRange) {
        std::cout << value << ' ';
    }
This is less nice because it adds code without adding meaning, and it copies the transformed range, since auto deduces a value type here. If f performs a real transformation, it can instead return by value, which enables return value optimisation, or move semantics if the type is movable. Either way, the code gets less concise.
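One possible way to let f return by value only when it receives a temporary is to add an overload taking an rvalue reference. The sketch below is an assumption for illustration: the second overload of f and the helper functions sum_from_temporary, sum_from_named_object and check_overloads are hypothetical and do not appear in the original code.

```cpp
#include <cassert>
#include <utility>
#include <vector>

std::vector<int> create_range() { return {1, 2, 3, 4, 5}; }

// Lvalue argument: the caller owns the vector, so returning a reference is safe.
std::vector<int> const& f(std::vector<int> const& v) { return v; }

// Hypothetical extra overload for temporaries: steal the contents and
// return a new object by value, which auto&& can then keep alive.
std::vector<int> f(std::vector<int>&& v) { return std::move(v); }

int sum_from_temporary() {
    int sum = 0;
    for (auto const& value : f(create_range())) {  // picks the && overload
        sum += value;
    }
    return sum;
}

int sum_from_named_object() {
    auto const range = create_range();             // named object outlives the loop
    int sum = 0;
    for (auto const& value : f(range)) {           // picks the const& overload
        sum += value;
    }
    return sum;
}

// Usage: both loops are well defined and see all five elements.
void check_overloads() {
    assert(sum_from_temporary() == 15);
    assert(sum_from_named_object() == 15);
}
```

The trade-off is an extra overload to maintain, in exchange for keeping the call site on one line.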
The case of member functions
So far, all our examples used free functions. But the problem is the same with member functions called on the temporary. To illustrate, consider the following class:
    class X {
    public:
        explicit X(std::string s) : s_(s) {}
        std::string const& getString() { return s_; }
    private:
        std::string s_;
    };
This function instantiates an X and returns a temporary object:
X createX() { return X{"hello"}; }
This range-based for loop uses a reference pointing into a destroyed temporary and therefore has undefined behaviour:
    for (auto const& x : createX().getString()) {
        std::cout << x << ' ';
    }
As with free functions, we can declare the object in a separate statement. But, as suggested in this SO question, member functions offer another way to fix this code, if we can modify the implementation of X:
    class X {
    public:
        explicit X(std::string s) : s_(s) {}
        std::string const& getString() & { return s_; }
        std::string getString() && { return std::move(s_); }
    private:
        std::string s_;
    };
Note the trailing & and && after the prototypes of getString. The first one gets called on an lvalue, and the second one on an rvalue. createX() is an rvalue, so createX().getString() calls the second overload.
This second overload itself returns a temporary object. This allows the auto&& in the expansion of the range-based for loop to keep it alive, even if the object returned by createX() dies:
    {
        auto&& range = createX().getString();
        auto begin = begin(range);
        auto end = end(range);
        for ( ; begin != end; ++begin) {
            auto const& value = *begin;
            // code using value
        }
    }
The following code then becomes correct:
    for (auto const& x : createX().getString()) {
        std::cout << x << ' ';
    }
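Put together, the fix can be tried with the following minimal sketch. It reuses X and createX from above. The functions spell_from_temporary, spell_from_lvalue and check_ref_qualifiers are hypothetical helpers added for the demonstration, and the constructor moves its argument, a small variation on the original.

```cpp
#include <cassert>
#include <string>
#include <utility>

class X {
public:
    explicit X(std::string s) : s_(std::move(s)) {}
    // Called on lvalues: the X outlives the call, a reference is safe.
    std::string const& getString() & { return s_; }
    // Called on rvalues: hand the string over by value.
    std::string getString() && { return std::move(s_); }
private:
    std::string s_;
};

X createX() { return X{"hello"}; }

// Iterates over the string of a temporary X: uses the && overload.
std::string spell_from_temporary() {
    std::string out;
    for (auto const& c : createX().getString()) {
        out += c;
        out += ' ';
    }
    return out;
}

// Iterates over the string of a named X: uses the & overload.
std::string spell_from_lvalue() {
    X x{"hi"};
    std::string out;
    for (auto const& c : x.getString()) {
        out += c;
        out += ' ';
    }
    return out;
}

// Usage: both loops are well defined.
void check_ref_qualifiers() {
    assert(spell_from_temporary() == "h e l l o ");
    assert(spell_from_lvalue() == "h i ");
}
```

The lvalue overload still avoids a copy for named objects, while the rvalue overload pays for one move to make the temporary case safe.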
Temporaries are an endless source of bugs... sorry, fun, right?