A Recap on string_view
The string capabilities of C++ evolved little after C++98, until C++17 brought a major addition: std::string_view.
Let’s look at what string_view is about and what it can bring to your code, by making it more expressive and making it run faster.
std::string_view
As its name suggests, std::string_view is a view on a string. But let’s define view and let’s define string.
A view…
A view is a light object that can be constructed, copied, moved and assigned to in constant time, and that references another object.
We can draw a parallel with C++20’s range views that model the concept std::ranges::view. This concept requires that views can be moved (and copied and assigned, when they support it) in constant time, and views typically reference other ranges.
C++17 didn’t have concepts and ranges, but std::string_view already had the semantics of a view. Note that std::string_view is a read-only view. It cannot modify the characters in the string that it references.
Also, note that you don’t have to wait for C++17 to use string_view. There are some C++11 compliant implementations, such as the one in Abseil.
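To make these view semantics concrete, here is a minimal sketch. The helper withoutPrefix is a hypothetical name invented for this illustration, not part of any library. It adjusts its own copy of the view and leaves the referenced characters untouched:

```cpp
#include <string>
#include <string_view>

// Hypothetical helper: drops a leading prefix from a view, if present.
// `text` is a cheap copy of the caller's view (a pointer and a size).
// Only that copy is adjusted; the underlying characters are never modified.
std::string_view withoutPrefix(std::string_view text, std::string_view prefix)
{
    if (text.substr(0, prefix.size()) == prefix)
    {
        text.remove_prefix(prefix.size());
    }
    return text;
}
```

Calling withoutPrefix on a view of a std::string holding "std::string_view", with the prefix "std::", should yield a view equal to "string_view" that points into the same buffer, while the std::string itself stays unchanged.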
… on a string
A view references something, and here std::string_view references a string. This “string” denomination includes three things:
- a std::string,
- a null-terminated char*,
- a char* and a size.
These are the three inputs you can pass in to build a string_view. The first one is defined in the std::string class as an implicit conversion operator, and the last two correspond to std::string_view’s constructors.
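The three construction paths can be sketched as follows. The function names are made up for this example; each one only wraps the corresponding conversion or constructor:

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// From a std::string: uses std::string's implicit conversion operator.
// The returned view is valid only as long as `s` is alive and unmodified.
std::string_view fromStdString(std::string const& s)
{
    return s;
}

// From a null-terminated char*: the constructor computes the length
// by scanning for the terminating '\0'.
std::string_view fromCString(char const* s)
{
    return s;
}

// From a char* and a size: no null terminator is required.
std::string_view fromPointerAndSize(char const* s, std::size_t size)
{
    return {s, size};
}
```

In all three cases the view stores a pointer to the existing characters and a length. No characters are copied.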
In summary, std::string_view is a lightweight object that references a C or C++ string. Now let’s see how that can be useful to your code.
A rich API for cheap
Let’s go back to the history of strings in C++.
The roots of std::string
Before C++, in C, there was no string class. C forced us to carry around char* pointers, which has two drawbacks:
- there is no clear ownership of the array of characters,
- the API to operate on them is very limited.
As Scott Meyers recounts towards the end of More Effective C++: “As Chair of the working group for the C++ standard library, Mike Vilot was told: ‘If there isn’t a standard string type, there will be blood in the streets!’” And so C++ got the std::string class.
std::string solves the above two problems of char*: it owns its characters and deals with the associated memory, and it has a very rich interface that can do many, many things (it is so big that Herb Sutter describes its “monolith” aspect in the last 4 chapters of Exceptional C++ Style).
The price of ownership
Ownership and memory management of the array of characters is a big advantage, one that we can’t imagine living without today. But it comes at a price: each time we construct a string, it has to allocate memory on the heap (unless its characters fit in the small string optimisation buffer). And each time we destroy it, it has to hand this heap memory back.
These operations can involve the OS and take time. Most of the time they go unnoticed though, because most code is not critical for performance. But in the code that happens to be performance sensitive (and only your profiler can tell you what code this is), repeatedly building and destroying std::string objects can be unacceptable for performance.
Consider the following example to illustrate. Imagine we’re building a logging API that uses std::string, because it’s the most natural thing to do: it makes the implementation expressive by taking advantage of its rich API. It wouldn’t even cross our minds to use char*:
void log(std::string const& information);
We make sure to take the string by reference to const, so as to avoid copies that would take time.
Now we’re calling our API:
log("The system is currently computing the results...");
Note that we’re passing a const char*, and not a std::string. But log expects a std::string. This code compiles, because const char* is implicitly convertible to std::string… but despite the const&, this code constructs and destructs a std::string!
Indeed, the std::string is a temporary object built for the purpose of the log function, and is destructed at the end of the statement calling the function.
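One way to observe this hidden temporary is to count allocations. The sketch below is an illustration under assumptions: CountingAllocator, CountingString and logLength are names invented here, and CountingString is a stand-in for std::string that differs only by its allocator.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Number of heap allocations requested through CountingAllocator.
std::size_t allocationCount = 0;

// Hypothetical allocator for this illustration: it forwards to
// std::allocator and counts the calls to allocate().
template <typename T>
struct CountingAllocator
{
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(CountingAllocator<U> const&) noexcept {}

    T* allocate(std::size_t n)
    {
        ++allocationCount;
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept
    {
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <typename T, typename U>
bool operator==(CountingAllocator<T> const&, CountingAllocator<U> const&) noexcept { return true; }
template <typename T, typename U>
bool operator!=(CountingAllocator<T> const&, CountingAllocator<U> const&) noexcept { return false; }

// Stand-in for std::string whose allocations we can observe.
using CountingString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Same shape as log(std::string const&); returns the length instead of logging.
std::size_t logLength(CountingString const& information)
{
    return information.size();
}

// Returns how many allocations a single call with a string literal triggers.
std::size_t allocationsForLiteralCall()
{
    auto const before = allocationCount;
    logLength("The system is currently computing the results...");
    return allocationCount - before;
}
```

The literal is longer than the small string buffers of the mainstream standard library implementations, so allocationsForLiteralCall() is expected to report at least one allocation there, even though the parameter is taken by reference to const.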
char* can come from string literals as in the above example, but also from legacy code that doesn’t use std::string.
If this is happening in a performance sensitive part of the codebase, it may be too big of a performance hit.
What to do then? Before string_view, we had to go back to char* and forgo the expressiveness of the implementation of log:
void log(const char* information); // crying emoji
Using std::string_view
With std::string_view we can get the best of both worlds:
void log(std::string_view information);
This does not construct a std::string, but merely a lightweight view over the const char*. So the allocation cost is gone. But we still get the read-only part of std::string’s API, which lets us write expressive code in the implementation of log.
Note that we pass string_view by copy, as it has the semantics of a reference.
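As a rough sketch of what such an implementation could look like: the name logMessage, the output stream parameter, the "[log] " prefix and the trimming of trailing dots are all assumptions made for this example, not the actual logging API discussed here.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>

// Sketch of a log implementation taking a view by copy.
void logMessage(std::string_view information, std::ostream& out)
{
    // Trim trailing dots with the std::string-like API of string_view.
    // Only the local view shrinks; the caller's characters are untouched.
    auto const lastKept = information.find_last_not_of('.');
    if (lastKept != std::string_view::npos)
    {
        information = information.substr(0, lastKept + 1);
    }
    out << "[log] " << information << '\n';
}

// Usage: the three kinds of "string" are accepted by the same function.
std::string logExamples()
{
    std::ostringstream out;
    std::string const owned = "Results ready.";
    char const buffer[] = {'o', 'k', '.', '.'}; // not null-terminated

    logMessage("The system is currently computing the results...", out);
    logMessage(owned, out);
    logMessage({buffer, sizeof buffer}, out);
    return out.str();
}
```

None of the three calls builds a std::string for the argument. The only std::string created by logExamples, apart from owned, is the one returned by the stream at the end.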
Pitfall: memory management
Since a std::string_view references a string and doesn’t own it, we have to make sure that the referenced string outlives the string_view. In the above code it looked OK, but if we’re not careful we could get into memory issues.
For example consider this code, simplified for illustration purposes:
std::string_view getName() { auto const name = std::string{"Arthur"}; return name; }
This leads to undefined behaviour: the function returns a std::string_view pointing to a std::string that has been destroyed at the end of the function.
This issue is neither new nor specific to std::string_view. It exists with pointers, references, and more generally with any object that references another one:
int const& getValue() { int const value = 42; return value; } // value is destroyed when the function returns!
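For contrast, here is a sketch of a few alternatives that do not dangle. The function names are hypothetical; the point is only that the referenced characters must outlive the view:

```cpp
#include <string>
#include <string_view>

// Safe: the caller receives its own std::string, which owns the characters.
std::string getNameOwned()
{
    auto name = std::string{"Arthur"};
    return name;
}

// Safe: a string literal has static storage duration, so the view stays
// valid for the whole program.
std::string_view getNameLiteral()
{
    return "Arthur";
}

// Safe as long as the caller keeps the string behind `fullName` alive
// while it uses the result.
std::string_view firstWord(std::string_view fullName)
{
    // find returns npos when there is no space; substr then keeps everything.
    return fullName.substr(0, fullName.find(' '));
}
```

The third function shows the usual contract of a view-returning function: the result refers to the argument’s characters, so its lifetime is tied to whatever the caller passed in.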
More and more views in C++
As mentioned earlier, C++20 introduces the formal concept of view for ranges, and brings a lot more views into the standard. These include transform, filter and the other range adaptors, which are some of the selling points of the ranges library.
Like string_view, they are lightweight objects with a rich interface, that let you write expressive code and pay for little more than what you use.