Performance benchmark: Ranges VS STL algorithms VS Smart output iterators
Ranges, STL algorithms and smart output iterators are three libraries that perform operations on collections and make code more expressive.
Even if they have some specificities, like zip
for ranges and unzip
for smart output iterators for example, as we saw when combining ranges with output iterators, they also share features in common, such as transform
and filter
.
On those shared features, which library is the fastest in terms of execution time? Ranges, STL algorithms or smart output iterators?
The accurate answer is “it depends on your exact test case, measure on your code and on your platform”, but the accurate answer is a tad terse, isn’t it. We’ll go for a ballpark answer, to get a feeling if one of them seems to be much faster or slower than the others, or if they seem to be in the same ballpark.
As we’ll see (spoiler alert!) it turns out that on our tested used cases, ranges and smart output iterators are in the same ballpark.
transform
Let’s start with a simple test case: applying a function to each element of the input collection. The component to do that has the same name for all three libraries: transform
.
We take a vector of int
s called numbers
, and apply the function times2
to each of its elements:
int times2(int x) { return x * 2; }
For ranges, our tested code is this:
ranges::push_back(results, numbers | ranges::view::transform(times2));
For STL algorithms, our tested code is this:
std::transform(begin(numbers), end(numbers), back_inserter(results), times2);
For smart output iterators, our tested code is this:
numbers >>= fluent::to_output >>= fluent::output::transform(times2) >>= back_inserter(results);
To run our benchmarks we use Fred Tingaud’s popular Quick-Bench.com.
Here are the results for clang with various levels of optimization flags:
And for gcc:
Here is the benchmark, for reference.
Those results show that, on this use case, ranges and smart output iterators tend to be in the same ballpark, and with clang the STL algorithm seems to have an edge over both of them.
filter
then transform
Let’s try a more elaborate case, by chaining two operations, filter
then transform
.
For this we introduce a predicate to filter on:
bool isEven(int x) { return x % 2 == 0; }
For ranges, our tested code is this:
ranges::push_back(results, numbers | ranges::view::filter(isEven) | ranges::view::transform(times2));
For STL algorithms, our tested code is this:
std::copy_if(begin(numbers), end(numbers), back_inserter(filteredNumbers), isEven); std::transform(begin(filteredNumbers), end(filteredNumbers), back_inserter(results), times2); }
For smart output iterators, our tested code is this:
numbers >>= fluent::to_output >>= fluent::output::filter(isEven) >>= fluent::output::transform(times2) >>= back_inserter(results);
Here are the results for clang:
And for gcc:
This gives consistent observations with the previous use case with transform
only.
Here is the complete code for this benchmark.
transform
then filter
Finally, let’s swap filter
and transform
in order to apply transform
first and filter
after it.
We have to change our predicate because all numbers that has been multiplied by 2 is even. So we take the following predicate:
bool isMultiple4(int x) { return x % 4 == 0; }
For ranges, our tested code is this:
ranges::push_back(results, numbers | ranges::view::transform(times2) | ranges::view::filter(isMultiple4));
For STL algorithms, our tested code is this:
std::transform(begin(numbers), end(numbers), back_inserter(transformedNumbers), times2); std::copy_if(begin(transformedNumbers), end(transformedNumbers), back_inserter(results), isMultiple4);
For smart output iterators, our tested code is this:
numbers >>= fluent::to_output >>= fluent::output::transform(times2) >>= fluent::output::filter(isMultiple4) >>= back_inserter(results);
Here are the results for clang:
And for gcc:
This also gives consistent observations compared to the previous use cases.
Output iterators are in the ballpark
Those simple benchmarks suggest that smart output iterators can compare with ranges, in terms of performance. In some cases they went a bit faster, in some others a bit slower.
As always with performance, write the code with the best design possible, and if the application gets slow then identify the bottleneck(s) by running it through a profiler and act on those specifically.
This analysis was for the common features between both, such as transform
and filter
. That said, ranges and smart output iterators each have their specificities such as zip
and unzip
, that don’t exist in the other. In those cases, the choice between the libraries is already made.
You will also like
- The smart output iterators library
- Combining C++ ranges with smart output iterators
- How Smart Output Iterators Avoid the Terrible Problem Of Incrementing A Smart Iterator
- Smart Output Iterators: A Symmetrical Approach to Range Adaptors
Share this post!