How Smart Output Iterators Avoid the TPOIASI
In the last post we saw the TPOIASI, or Terrible Problem Of Incrementing A Smart Iterator, that could incur a performance cost in code that uses range adaptors. Today, we’ll see how smart output iterators fare with the TPOIASI (spoiler: they have a way to avoid the problem).
Now if you’re wondering what smart iterators, smart output iterators or the Terrible Problem Of Incrementing Them is, here is a little refresher.
The TPOIASI
The TPOIASI occurs when an iterator that embeds logic in its operator++
(for instance, advancing to the next element that satisfies a predicate), is plugged onto another iterator, for example one that applies a function in its operator*
.
In a range-style code, the situation looks like this:
// Input vector std::vector<int> numbers = {1, 2, 3, 4, 5}; // Output vector std::vector<int> results; //Apply transform and filter ranges::push_back(results, numbers | ranges::view::transform(times2) | ranges::view::filter(isMultipleOf4)); // Display results for (auto result : results) { std::cout << result << ' '; }
with times2
and isMultipleOf4
being:
int times2(int n) { std::cout << "transform " << n << '\n'; return n * 2; } bool isMultipleOf4(int n) { return n % 4 == 0; }
(note the trace in times2
).
The code outputs:
transform 1 transform 2 transform 2 transform 3 transform 4 transform 4 transform 5 4 8
For some elements, 2
and 4
, the function is called more than once. This is a problem. And a terrible one because it is – in my opinion – structural to this range adaptor.
We had seen that the source of the problem is that the operator++
of filter
that has to peek ahead to know where to stop, and then its operator*
calls up the transform
function again.
If you’d like to read more about the Terrible Problem Of Incrementing A Smart Iterator, you can have a look at its dedicated article.
Smart output iterators
Smart output iterators are a symmetrical approach to range adaptors, to manipulate collections in C++. This means that while range adaptors operate on input iterators and can funnel data into an STL algorithm, smart output iterators put some logic inside of the output iterators of an algorithm.
Take std::back_inserter
for instance. It is an output iterator that embeds a push_back
to a container. Smart output iterators generalise this idea by allowing output iterators to apply functions, filter on predicates, and a lot of other fancy treatments, to the data coming out of STL algorithms.
For instance, the equivalent code to the one above that used range adaptors would be, with smart output iterators:
// Input vector std::vector<int> numbers = {1, 2, 3, 4, 5}; // Output vector std::vector<int> results; //Apply transform and filter auto oIsMultiple4 = make_output_filter(isMultiple4); auto oTimes2 = make_output_transformer(times2); copy(numbers, oTimes2(oIsMultiple4(back_inserter(results)))); // Display results for (auto result : results) { std::cout << result << ' '; }
Now do smart output iterators suffer from the TPOIASI? Do they call the function in transform
multiple times?
When we look at the implementation of the output iterator that filters, its operator++
and operator*
implementations are pretty ascetic (like for all output iterators):
template<typename Iterator, typename Predicate> class output_filter_iterator { public: explicit output_filter_iterator(Iterator iterator, Predicate predicate) : iterator_(iterator), predicate_(predicate) {} output_filter_iterator& operator++(){ ++iterator_; return *this; } output_filter_iterator& operator*(){ return *this; } template<typename T> output_filter_iterator& operator=(T const& value) { if (predicate_(value)) { *iterator_ = value; } return *this; } private: Iterator iterator_; Predicate predicate_; };
No checking of the predicate, no reading from the underlying iterator.
Will this be enough to make them immune to the Terrible Problem?
Let’s run that code to find out.
Smart output iterators and the TPOIASI
Running the code with the same trace:
int times2(int n) { std::cout << "transform " << n << '\n'; return n * 2; } bool isMultipleOf4(int n) { return n % 4 == 0; }
gives this output:
transform 1 transform 2 transform 3 transform 4 transform 5 4 8
No multiple calls to the function!
Does that mean that smart output iterators are immune to the Terrible Problem?
It’s not that simple. The above case appends data to an empty vector
, with the help of a back_inserter
. But if we change the use case a little, by writing into the vector rather than appending to it:
// Input vector std::vector<int> numbers = {1, 2, 3, 4, 5}; // Output vector std::vector<int> results = {0, 0, 0, 0, 0}; //Apply transform and filter auto oIsMultiple4 = make_output_filter(isMultiple4); auto oTimes2 = make_output_transformer(times2); copy(numbers, oTimes2(oIsMultiple4(begin(results)))); // Display results for (auto result : results) { std::cout << result << ' '; }
We would expect this:
4 8 0 0 0
But the result we get is in fact that:
0 4 0 8 0
This is a bug. It comes from the operator++
that increments the underlying iterator even if the smart output iterator ends up not writing to it (in the case where the value it is passed doesn’t satisfy the predicate).
Let’s attempt to fix this by changing the implementation of operator++
from this:
output_filter_iterator& operator++(){ ++iterator_; return *this; }
as it was above, to that:
output_filter_iterator& operator++(){ return *this; }
By not incrementing the underlying iterator.
The the result we get is now this:
8 0 0 0 0
This is still not good, because we’re never incrementing the underlying iterator, therefore we’re constantly writing at the same position.
No, we’d need to increment the filter iterator only if is has sent something to its underlying iterator. Let’s just write it then:
template<typename Iterator, typename Predicate> class output_filter_iterator { public: explicit output_filter_iterator(Iterator iterator, Predicate predicate) : iterator_(iterator), predicate_(predicate) {} output_filter_iterator& operator++(){ return *this; } output_filter_iterator& operator*(){ return *this; } template<typename T> output_filter_iterator& operator=(T const& value) { if (predicate_(value)) { *iterator_ = value; ++iterator_; } return *this; } private: Iterator iterator_; Predicate predicate_; };
Now when we run the code we get:
4 8 0 0 0
And does the case with back_inserter
still work? Let’s run it:
4 8
It does still work.
It all looks good except there is a nagging question left:
Is this OK?
Implementing the operator++
by incrementing the underlying sounded natural. Indeed, imagine that an algorithm decided to increment the output iterator twice before assigning it. A std::vector
iterator would skip an element, but our smart output iterator would completely be oblivious to that double-increment.
It turns out that it’s ok, because algorithms are not allowed to increment an output iterator twice without calling operator=
in between. Indeed, as we can read on cppreference.com, “Assignment through an output iterator is expected to alternate with incrementing. Double-increment is undefined behavior”.
I may well miss something, but this makes this implementation look ok to me, and smart output iterators have avoided the TPOIASI, which looks like a good sign for their design.
If you’d like to see the code of the smart output iterators library, it’s up on GitHub.
You may also like
- Smart output iterators
- The Terrible Problem Of Incrementing A Smart Iterator
Share this post!