How to Store an lvalue or an rvalue in the Same Object
There is a problem that comes up every so often in C++ code: how can an object keep track of a value, given that this value can come from either an lvalue or an rvalue?
In short, if we keep the value as a reference then we can’t bind to temporary objects. And if we keep it as a value, we incur unnecessary copies when it is initialized from an lvalue.
What’s a C++ programmer to do?
There are several ways to cope with this situation. I find that using std::variant offers a good trade-off to have expressive code.
Keeping track of a value
Here is a more detailed explanation of the problem.
Consider a class MyClass. We would like to give MyClass access to a certain std::string. How do we represent the string inside of MyClass?
We have two options:
- storing it as a reference,
- storing it as a value.
Storing a reference
If we store it as a reference, for example a const reference:
#include <iostream>
#include <string>
class MyClass
{
public:
    explicit MyClass(std::string const& s) : s_(s) {}
    void print() const
    {
        std::cout << s_ << '\n';
    }
private:
    std::string const& s_;
};
Then we can initialize our reference with an lvalue:
std::string s = "hello";
MyClass myObject{s};
myObject.print();
This code prints out:
hello
All good. But what if we want to initialize our object with an rvalue? For example with this code:
MyClass myObject{std::string{"hello"}};
myObject.print();
Or with this code:
std::string getString(); // function declaration returning by value
MyClass myObject{getString()};
myObject.print();
Then the code has undefined behaviour. Indeed, the temporary string object is destroyed at the end of the statement in which it is created. When we call print, the string has already been destroyed, so using it is illegal and leads to undefined behaviour.
Really?
To illustrate this, let’s replace std::string with a type X that logs in its destructor:
struct X
{
    ~X() { std::cout << "X destroyed" << '\n'; }
};
class MyClass
{
public:
    explicit MyClass(X const& x) : x_(x) {}
    void print() const
    {
        // using x_;
    }
private:
    X const& x_;
};
Let’s also add logging to the call site:
MyClass myObject(X{});
std::cout << "before print" << '\n';
myObject.print();
This program then prints:
X destroyed
before print
We can see that the object is destroyed before we attempt to use it.
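The same observation can be turned into a check that never touches the dangling reference. Here is a minimal sketch, assuming a hypothetical variant of X that records its destruction in an external flag (the constructor taking a bool& and the names RefHolder and temporaryDiesBeforeUse are mine, not part of the code above):

```cpp
// Hypothetical variant of X: reports its destruction through an external flag.
struct X
{
    explicit X(bool& destroyed) : destroyed_(destroyed) {}
    ~X() { destroyed_ = true; }
    bool& destroyed_;
};

// Same shape as the reference-storing MyClass.
class RefHolder
{
public:
    explicit RefHolder(X const& x) : x_(x) {}
    X const& get() const { return x_; } // would dangle if x was a temporary
private:
    X const& x_;
};

// Returns true if the temporary is already gone right after construction.
bool temporaryDiesBeforeUse()
{
    bool destroyed = false;
    RefHolder holder(X{destroyed}); // the temporary X dies at the end of this statement
    (void)holder;                   // deliberately not calling get(): that would be UB
    return destroyed;
}
```

The flag lives outside of the temporary, so reading it after the statement is well defined even though the reference held by the holder is not usable any more.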
Storing a value
The other option we have is to store a value. This allows us to use move semantics to move the incoming temporary into the stored value:
class MyClass
{
public:
    explicit MyClass(std::string s) : s_(std::move(s)) {}
    void print() const
    {
        std::cout << s_ << '\n';
    }
private:
    std::string s_;
};
Now with this call site:
MyClass myObject{std::string{"hello"}};
myObject.print();
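We incur at most two moves (one to construct s and one to construct s_) and we don’t have undefined behaviour. Since C++17, the first of these is elided when the argument is a prvalue, as it is here, because the parameter s is initialized directly from it. Even if the temporary is destroyed, print uses the instance inside of the class.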
Unfortunately, if we go back to our first call site, with an lvalue:
std::string s = "hello";
MyClass myObject{s};
myObject.print();
Then we’re no longer only moving: we’re making one copy (to construct s) and one move (to construct s_).
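To check this kind of accounting rather than take it on faith, we can count the operations. The following is a small sketch, assuming a hypothetical instrumented type Tracked standing in for std::string, and a ByValueHolder that has the same constructor shape as the by-value MyClass (all names here are made up for the illustration):

```cpp
#include <utility>

// Hypothetical instrumented type: counts its copy and move constructions.
struct Tracked
{
    static inline int copies = 0;
    static inline int moves = 0;
    Tracked() = default;
    Tracked(Tracked const&) { ++copies; }
    Tracked(Tracked&&) noexcept { ++moves; }
    static void reset() { copies = 0; moves = 0; }
};

// Same shape as the by-value MyClass: take by value, then move into the member.
class ByValueHolder
{
public:
    explicit ByValueHolder(Tracked t) : t_(std::move(t)) {}
private:
    Tracked t_;
};

struct Counts { int copies; int moves; };

// Cost of constructing the holder from an lvalue.
Counts costFromLvalue()
{
    Tracked::reset();
    Tracked source;
    ByValueHolder holder{source}; // copy into the parameter, move into the member
    (void)holder;
    return {Tracked::copies, Tracked::moves};
}

// Cost of constructing the holder from an xvalue (std::move of a named object).
Counts costFromXvalue()
{
    Tracked::reset();
    Tracked source;
    ByValueHolder holder{std::move(source)}; // move into the parameter, move into the member
    (void)holder;
    return {Tracked::copies, Tracked::moves};
}
```

With an lvalue the counts come out as one copy and one move, and with std::move as two moves. A prvalue argument can cost less than that, since C++17 constructs the parameter directly from it.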
What’s more, our purpose was to give MyClass access to the string, and if we make a copy we have a different instance than the one that came in. So they won’t be in sync.
With the temporary object it wasn’t a problem because it was to be destroyed anyway and we moved it in just before, so we still had access to “that” string. But by making a copy we no longer give MyClass access to the incoming string.
So using a value is not a good solution either.
Storing a variant
Storing a reference is not a good solution, and storing a value is not a good solution either. What we would like to do is to store a reference if the object is initialised from an lvalue, and store a value if it is initialised from an rvalue.
But a data member can only be of one type: value or reference, right?
Well, with a std::variant, it can be either one.
However, if we try to store a reference in a variant, like this:
std::variant<std::string, std::string const&>
We get a compilation error, reported as a failed static assert:
variant must have no reference alternative
To achieve our purpose we need to put our reference inside of another type.
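One type from the standard library that can play this role is std::reference_wrapper. The sketch below is a different technique from the custom wrappers defined in the rest of this article: it only shows that a variant accepts a wrapped reference where it rejects a bare one. The names ConstStringRef, StringOrRef and view are assumptions made for the example.

```cpp
#include <functional>
#include <string>
#include <variant>

// A variant cannot hold std::string const&, but it can hold a wrapper around it.
using ConstStringRef = std::reference_wrapper<std::string const>;
using StringOrRef = std::variant<std::string, ConstStringRef>;

// Hypothetical accessor: a const reference to the string, whichever alternative is held.
std::string const& view(StringOrRef const& v)
{
    if (auto const* owned = std::get_if<std::string>(&v))
        return *owned;                        // the variant owns the string
    return std::get<ConstStringRef>(v).get(); // the variant refers to an outside string
}
```

Initialized with std::cref(s), such a variant refers to s itself. Initialized with a temporary std::string, it owns its own string.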
This means that we have to write specific code to handle our data member. If we write such code for std::string we won’t be able to use it for another type.
At this point it would be good to write the code in a generic way.
A generic storage class
The storage of our motivating case needed to be either a value or a reference. Since we’re writing this code for a general purpose now, we may as well allow non-const references too.
Since the variant cannot hold references directly, let’s store them into wrappers:
template<typename T>
struct NonConstReference
{
    T& value_;
    explicit NonConstReference(T& value) : value_(value) {}
};
template<typename T>
struct ConstReference
{
    T const& value_;
    explicit ConstReference(T const& value) : value_(value) {}
};
template<typename T>
struct Value
{
    T value_;
    explicit Value(T&& value) : value_(std::move(value)) {}
};
And let’s define our storage to be either one of those cases:
template<typename T>
using Storage = std::variant<Value<T>, ConstReference<T>, NonConstReference<T>>;
Now we need to give access to the underlying value of our variant, by providing a reference. We create two types of access: one const and one not const.
Defining const access
To define const access, we need to make each of the three possible types inside of the variant produce a const reference.
To access data inside the variant, we’ll use std::visit and the canonical overload pattern, which can be implemented in C++17 the following way:
template<typename... Functions>
struct overload : Functions...
{
    using Functions::operator()...;
    overload(Functions... functions) : Functions(functions)... {}
};
To get our const reference, we can just create one for each case of the variant:
template<typename T>
T const& getConstReference(Storage<T> const& storage)
{
    return std::visit(
        overload(
            [](Value<T> const& value) -> T const& { return value.value_; },
            [](NonConstReference<T> const& value) -> T const& { return value.value_; },
            [](ConstReference<T> const& value) -> T const& { return value.value_; }
        ),
        storage
    );
}
Defining non-const access
Creating a non-const reference uses the same technique, except that if the variant holds a ConstReference, it can’t produce a non-const reference. However, when we std::visit a variant, we have to write code for each of its possible types:
template<typename T>
T& getReference(Storage<T>& storage)
{
    return std::visit(
        overload(
            [](Value<T>& value) -> T& { return value.value_; },
            [](NonConstReference<T>& value) -> T& { return value.value_; },
            [](ConstReference<T>&) -> T& { /* code handling the error! */ }
        ),
        storage
    );
}
We should never end up in that situation, but we still have to write some code for it. The first idea that comes to (my) mind is to throw an exception:
struct NonConstReferenceFromReference : public std::runtime_error
{
    explicit NonConstReferenceFromReference(std::string const& what) : std::runtime_error{what} {}
};
template<typename T>
T& getReference(Storage<T>& storage)
{
    return std::visit(
        overload(
            [](Value<T>& value) -> T& { return value.value_; },
            [](NonConstReference<T>& value) -> T& { return value.value_; },
            [](ConstReference<T>&) -> T& {
                throw NonConstReferenceFromReference{"Cannot get a non const reference from a const reference"};
            }
        ),
        storage
    );
}
If you have other suggestions, I’d love to hear them!
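One possible alternative, offered as a sketch rather than a recommendation, is to report the impossible case through the return type instead of an exception, by returning a pointer that is null when the storage holds a ConstReference. The function tryGetReference below is hypothetical. The wrappers and Storage are restated so that the example compiles on its own, and std::get_if is used instead of std::visit to keep it short.

```cpp
#include <string>
#include <utility>
#include <variant>

// Wrappers and Storage restated from above so that the sketch is self-contained.
template<typename T>
struct NonConstReference
{
    T& value_;
    explicit NonConstReference(T& value) : value_(value) {}
};
template<typename T>
struct ConstReference
{
    T const& value_;
    explicit ConstReference(T const& value) : value_(value) {}
};
template<typename T>
struct Value
{
    T value_;
    explicit Value(T&& value) : value_(std::move(value)) {}
};
template<typename T>
using Storage = std::variant<Value<T>, ConstReference<T>, NonConstReference<T>>;

// Hypothetical alternative to getReference: nullptr instead of an exception.
template<typename T>
T* tryGetReference(Storage<T>& storage)
{
    if (auto* owned = std::get_if<Value<T>>(&storage))
        return &owned->value_;
    if (auto* ref = std::get_if<NonConstReference<T>>(&storage))
        return &ref->value_;
    return nullptr; // ConstReference: no mutable access is available
}
```

The trade-off is that every caller now has to check the pointer, whereas the exception keeps the happy path clean. Which one reads better probably depends on how exceptional that case really is in your code.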
Creating the storage
Now that we have defined our storage class, let’s use it in our motivating case to give access to the incoming std::string regardless of its value category:
class MyClass
{
public:
    explicit MyClass(std::string& value) : storage_(NonConstReference(value)) {}
    explicit MyClass(std::string const& value) : storage_(ConstReference(value)) {}
    explicit MyClass(std::string&& value) : storage_(Value(std::move(value))) {}
    void print() const
    {
        std::cout << getConstReference(storage_) << '\n';
    }
private:
    Storage<std::string> storage_;
};
Consider the first call site, with an lvalue:
std::string s = "hello";
MyClass myObject{s};
myObject.print();
It matches the first constructor, and creates a NonConstReference inside of the storage member. The non-const reference is converted into a const reference when the print function calls getConstReference.
Now consider the second call site, with the temporary value:
MyClass myObject{std::string{"hello"}};
myObject.print();
This one matches the third constructor, and moves the value inside of the storage. getConstReference then returns a const reference to that value to the print function.
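To see both behaviours side by side, here is a self-contained sketch that puts the pieces together. It assumes a hypothetical class Holder that mirrors MyClass but exposes the string through a text() accessor instead of printing it, so that the result can be inspected. The template arguments are also spelled out rather than deduced.

```cpp
#include <string>
#include <utility>
#include <variant>

template<typename T>
struct NonConstReference
{
    T& value_;
    explicit NonConstReference(T& value) : value_(value) {}
};
template<typename T>
struct ConstReference
{
    T const& value_;
    explicit ConstReference(T const& value) : value_(value) {}
};
template<typename T>
struct Value
{
    T value_;
    explicit Value(T&& value) : value_(std::move(value)) {}
};
template<typename T>
using Storage = std::variant<Value<T>, ConstReference<T>, NonConstReference<T>>;

template<typename... Functions>
struct overload : Functions...
{
    using Functions::operator()...;
    overload(Functions... functions) : Functions(functions)... {}
};

template<typename T>
T const& getConstReference(Storage<T> const& storage)
{
    return std::visit(
        overload(
            [](Value<T> const& value) -> T const& { return value.value_; },
            [](NonConstReference<T> const& value) -> T const& { return value.value_; },
            [](ConstReference<T> const& value) -> T const& { return value.value_; }
        ),
        storage
    );
}

// Hypothetical stand-in for MyClass, with an accessor instead of print().
class Holder
{
public:
    explicit Holder(std::string& value) : storage_(NonConstReference<std::string>(value)) {}
    explicit Holder(std::string const& value) : storage_(ConstReference<std::string>(value)) {}
    explicit Holder(std::string&& value) : storage_(Value<std::string>(std::move(value))) {}
    std::string const& text() const { return getConstReference(storage_); }
private:
    Storage<std::string> storage_;
};
```

A Holder built from an lvalue s refers to s itself, so a later change to s is visible through text(). A Holder built from a temporary owns its string and stays valid after the temporary is gone.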
The evolution of the standard library
std::variant offers a well-suited solution to the classical problem of keeping track of either an lvalue or an rvalue in C++.
The code of this technique is expressive because std::variant lets us express something that is very close to our intention: “depending on the context, the object could be either this or that”. In our case, “this” and “that” are a “reference” or a “value”.
Before C++17 and std::variant, solving that problem was tricky and led to code that was difficult to write correctly. With the language evolving, the standard library gets more powerful and lets us express our intentions with more and more expressive code.
We will see other ways in which the evolution of the standard library helps us write more expressive code in a future article. Stay tuned!