A Minimal Interface: Both Expressive And Fast Code
Have you ever used std::inserter
to insert the outputs of an STL algorithm into a sorted container such as an std::set
?
And if you have, weren’t you annoyed by how its interface forces you to specify the position to insert the elements in the set?
I find this very annoying, because most of the time we have no idea where they should go inside the set at the time we write code. We don’t even know their values in advance. That’s the set
‘s job to figure out where to put the new elements and keep a sorted order.
So we end up sticking the begin
or the end
of the set as an argument to std::inserter
, and this useless piece of information sits like an uninvited guest in the middle of the elegant STL party:
std::vector<int> v = {1, 3, -4, 2, 7, 10, 8}; std::set<int> results; std::copy(begin(v), end(v), std::inserter(results, end(results)));
We’ve previously come across sorted_inserter
, that does the same thing as std::inserter
except you don’t have to specify where to insert the elements. You can specify it, if you happen to know, and it will save time to the set
instead of searching its location for you in this case. But otherwise the set
takes care of it (just like when we call its .insert
method):
std::vector<int> v = {1, 3, -4, 2, 7, 10, 8}; std::set<int> results; std::copy(begin(v), end(v), sorted_inserter(results));
By removing the call to the end iterator, sorted_inserter
makes for more direct code. But does this have an impact on performance? The point of this post is to compare the performance of sorted_inserter
with the standard std::inserter
.
For the sake of the example we’ll use std::copy
because it is the simplest STL algorithm, but sorted_inserter
could be used with other algorithms too. And as Reddit user FbF_ noted, in particular this doesn’t mean that we should use std::copy
to add data to a container, as there are better ways of inserting several elements into an STL container efficiently.
Measure, measure, measure… fine let’s do it!
For this benchmark I’ll use Fred Tingaud’s increasingly popular tool, Quick-Bench.
The test case we use here is this:
- construct a
vector<int>
containing 100 values randomly generated between -100 and +100, - copy the contents of this vector into a
set<int>
, by usingstd::copy
andstd::inserter(results, end(results))
- repeat 2) a large number of times, and measure the average time
- divide it by the time taken by an empty benchmark, in order to have a no-op reference
These are the results in blue below.
Maybe passing begin(results)
is better than end(results)
? I’ve thrown in a new test case (it’s very easy to do with quick-bench) to measure this. These are the results in pink below.
Finally, I’ve included a test case that uses sorted_inserter
instead of std::inserter
, represented by the results in yellow below.
Here are the visual results:
This results allow us to interpret two things:
- if you’re unsure what to put as a location in
std::inserter
,begin
andend
seem equivalent in terms of performance, sorted_inserter
is faster thanstd::inserter
. The above show a performance increase of 44%. This benchmark was done in O3 (for the other levels of optimisation the increase was closer to 20%).
Here is the quick-bench run for this test if you’d like to play around with it.
A minimal interface
Why does sorted_inserter
outperform the STL? It’s certainly not coming from a more efficient implementation, because the STL one is surely much better implemented.
I think the problem of std::inserter
is its interface: it’s doing too many things at the same time.
Indeed, it makes sense to specify a position for a vector
, because it can’t find it on its own. So std::inserter
‘s interface does make sense for vector. But it’s also trying to work for a set. It’s trying to fit all containers at the same time.
And std::inserter
is sending the set on the wrong track, by consistently providing a hint that is not the right one. That’s more work for the set than not giving a hint at all, because the set tries out the hint before realizing it was wrong, and then it still has to insert the element.
sorted_inserter
rather provides a minimal interface (just a container, no position), but it is specific to sorted containers, and doesn’t make sense on vectors. And it also provides the more elaborate interface that lets its user provide a hint, even if it is a less common use case.
I think a lesson to draw out of this analysis is that it is useful to provide at least one minimal interface, that perfectly satisfies the most basic need. Here this interface would consist in inserting into a sorted container without preliminary information about the final location of the inserted components. This is particularly important if this need occurs often, as it is the case withstd::inserter
on std::set
.
This way, we’ll have better chances of designing interfaces allowing both expressive and fast code.
Don't want to miss out ? Follow:   Share this post!