Performance benchmark: Ranges VS STL algorithms VS Smart output iterators

Published March 15, 2019

Ranges, STL algorithms and smart output iterators are three libraries that perform operations on collections and make code more expressive.

Even if they have some specificities, like zip for ranges and unzip for smart output iterators for example, as we saw when combining ranges with output iterators, they also share features in common, such as transform and filter.

On those shared features, which library is the fastest in terms of execution time? Ranges, STL algorithms or smart output iterators?

The accurate answer is “it depends on your exact test case, measure on your code and on your platform”, but the accurate answer is a tad terse, isn’t it. We’ll go for a ballpark answer, to get a feeling if one of them seems to be much faster or slower than the others, or if they seem to be in the same ballpark.

As we’ll see (spoiler alert!) it turns out that on our tested used cases, ranges and smart output iterators are in the same ballpark.

`transform`

Let’s start with a simple test case: applying a function to each element of the input collection. The component to do that has the same name for all three libraries: transform.

We take a vector of ints called numbers, and apply the function times2 to each of its elements:

int times2(int x)
{
    return x * 2;
}

For ranges, our tested code is this:

ranges::push_back(results, numbers | ranges::view::transform(times2));

For STL algorithms, our tested code is this:

std::transform(begin(numbers), end(numbers), back_inserter(results), times2);

For smart output iterators, our tested code is this:

numbers >>= fluent::to_output >>= fluent::output::transform(times2) >>= back_inserter(results);

To run our benchmarks we use Fred Tingaud’s popular Quick-Bench.com.

Here are the results for clang with various levels of optimization flags:

And for gcc:

Here is the benchmark, for reference.

Those results show that, on this use case, ranges and smart output iterators tend to be in the same ballpark, and with clang the STL algorithm seems to have an edge over both of them.

`filter` then `transform`

Let’s try a more elaborate case, by chaining two operations, filter then transform.

For this we introduce a predicate to filter on:

bool isEven(int x)
{
    return x % 2 == 0;
}

For ranges, our tested code is this:

ranges::push_back(results, numbers | ranges::view::filter(isEven) | ranges::view::transform(times2));

For STL algorithms, our tested code is this:

std::copy_if(begin(numbers), end(numbers), back_inserter(filteredNumbers), isEven);
std::transform(begin(filteredNumbers), end(filteredNumbers), back_inserter(results), times2);  }

For smart output iterators, our tested code is this:

numbers >>= fluent::to_output >>= fluent::output::filter(isEven) >>= fluent::output::transform(times2) >>= back_inserter(results);

Here are the results for clang:

And for gcc:

This gives consistent observations with the previous use case with transform only.

Here is the complete code for this benchmark.

`transform` then `filter`

Finally, let’s swap filter and transform in order to apply transform first and filter after it.

We have to change our predicate because all numbers that has been multiplied by 2 is even. So we take the following predicate:

bool isMultiple4(int x)
{
    return x % 4 == 0;
}

For ranges, our tested code is this:

ranges::push_back(results, numbers | ranges::view::transform(times2) | ranges::view::filter(isMultiple4));

For STL algorithms, our tested code is this:

std::transform(begin(numbers), end(numbers), back_inserter(transformedNumbers), times2);
std::copy_if(begin(transformedNumbers), end(transformedNumbers), back_inserter(results), isMultiple4);

For smart output iterators, our tested code is this:

numbers >>= fluent::to_output >>= fluent::output::transform(times2) >>= fluent::output::filter(isMultiple4) >>= back_inserter(results);

Here are the results for clang:

And for gcc:

This also gives consistent observations compared to the previous use cases.

Output iterators are in the ballpark

Those simple benchmarks suggest that smart output iterators can compare with ranges, in terms of performance. In some cases they went a bit faster, in some others a bit slower.

As always with performance, write the code with the best design possible, and if the application gets slow then identify the bottleneck(s) by running it through a profiler and act on those specifically.

This analysis was for the common features between both, such as transform and filter. That said, ranges and smart output iterators each have their specificities such as zip and unzip, that don’t exist in the other. In those cases, the choice between the libraries is already made.

About Jonathan Boccara

Performance benchmark: Ranges VS STL algorithms VS Smart output iterators

`transform`

`filter` then `transform`

`transform` then `filter`

Output iterators are in the ballpark

You will also like

About Jonathan Boccara

Performance benchmark: Ranges VS STL algorithms VS Smart output iterators

transform

filter then transform

transform then filter

Output iterators are in the ballpark

You will also like

`transform`

`filter` then `transform`

`transform` then `filter`