Unzipping a Collection of Tuples with the “unzip” Smart Output Iterator
Smart output iterators are output iterators that do more than just sending a piece of data from an STL algorithm to a container. They can embed logic that relieves the algorithm of some of its responsibilities.
We have already seen examples of smart output iterators that apply a function or filter on a predicate.
Now let’s see an example of smart output iterator that breaks down pairs and tuples, so that all the first elements go to one direction, all the second elements to another direction, and so on.
Two motivating cases: separating key from values, and transposing a collection a tuples
Let’s see two motivating examples for breaking down collections of pairs and tuples into specific containers.
Pairs
A std::map
is a sorted collection of std::pair
s, whose first
s are keys and second
s are values. We want to send the keys and the values of the map to two distinct containers. And to leverage on the power of smart output iterators, let’s say that we also want to apply a function only on values.
To illustrate, let’s create a map that associates strings to numbers:
std::map<int, std::string> entries = { {1, "one"}, {2, "two"}, {3, "three"}, {4, "four"}, {5, "five"} };
We would like to:
- send the keys to
keys
, - send the values in upper case to
values
with keys
and values
starting as empty containers:
std::vector<int> keys; std::vector<std::string> values;
For this we need to implement the unzip
output iterator. We will also use the transform
iterator (formerly called output_transformer
) to apply a function to the output of the unzip
iterator:
auto const toUpper = fluent::output::transform(toUpperString); std::copy(begin(entries), end(entries), unzip(back_inserter(keys), toUpper(back_inserter(values))));
toUpperString
is a function that takes a std::string
and returns a std::string
that is the former one in upper case. It can be implemented like this:
std::string toUpperString(std::string const& s) { std::string upperString; std::transform(begin(s), end(s), std::back_inserter(upperString), [](char c){ return std::toupper(c); }); return upperString; }
And we would like keys
to contain {1, 2, 3, 4, 5}
, and values to contain {"ONE", "TWO", "THREE", "FOUR", "FIVE"}
.
Tuples
A more generic use case would use tuples instead of pairs. Here is a collection of tuples:
std::vector<std::tuple<int, int, int>> lines = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} };
In our example, this collection represents the lines of a table: the first line is 1 2 3, the second line is 4 5 6, and so on.
Let’s extract the columns of the table. To do this, we need to extract the first elements of each line and put them into a column1
container, then the second elements of each line and put them into a column2
container, and so on.
So our target code will be:
std::vector<int> column1, column2, column3; std::copy(begin(lines), end(lines), unzip(back_inserter(column1), back_inserter(column2), back_inserter(column3)));
And we expect column1
to hold {1, 4, 7, 10}
, column2
to hold {2, 5, 8, 11}
, and column3
to hold {3, 6, 9, 12}
.
Now that we have those two use cases for it, let’s implement the unzip
output iterator.
The unzip
output iterator
unzip
will follow the typical implementation of smart output iterators:
- the constructor keeps track of the underlying iterators to send data to,
operator*
returns the object itself, so that…- …
operator=
is called by the user (e.g. STL algorithm) and can perform the action of sending data off to the underlying iterators, operator++
forwards the increment to the underlying iterators.
So let’s start with the constructor:
template<typename... Iterators> class output_unzip_iterator { public: explicit output_unzip_iterator(Iterators... iterators) : iterators_(std::make_tuple(iterators...)) {} private: std::tuple<Iterators...> iterators_; };
We keep all the underlying iterators into a tuple
. Indeed, there could be any number of underlying iterators.
The operator*
does its job of allowing our smart output iterator to stay in the game when dereferenced:
output_unzip_iterator& operator*(){ return *this; }
The action then happens in operator=
, when the STL algorithms assigns to what is returned by dereferencing the iterator (so here, the iterator itself). Let’s start with the simpler case of sending an std::pair
to our iterator:
template<typename First, typename Second> output_unzip_iterator& operator=(std::pair<First, Second> const& values) { *std::get<0>(iterators_) = values.first; *std::get<1>(iterators_) = values.second; return *this; }
We forward the first (resp. second) of the incoming pair to the first (resp. second) underlying iterator.
The overload of operator=
that receives a std::tuple
is less straightforward to implement. Its prototype looks like this:
template<typename... Ts> output_unzip_iterator& operator=(std::tuple<Ts...> const& values) {
And in this function, we need to send each element of the incoming tuple to its corresponding element in our tuple of underlying iterators.
One way of formulating this is to apply to each pair of respective elements of those tuples a function that takes a value and an iterator, and that sends that value to that iterator.
So the problem comes down to applying a function taking two parameters to respective elements coming from two tuples.
Applying a function to the elements of two tuples
Note: We’re going to delve into template metaprogramming and variadic templates here. I’m not an expert, and if you know how to improve what follows, I’m happy to hear your feedback!
To apply a function to the elements of one tuple, C++17 offers std::apply
. But before C++17, there was a way to emulate std::apply
. We are going to look into this implementation, and adapt it for elements coming from two tuples.
To apply a function to the elements of a tuple, we can 1) unwrap the tuple into a variadic pack and 2) pass the contents of the variadic pack as arguments to a function.
Unwrapping the tuple into a variadic pack
To do this, we use C++14 index_sequence
:
template <class F, class Tuple1, class Tuple2> constexpr decltype(auto) apply2(F&& f, Tuple1&& t1, Tuple2&& t2) { return apply2_impl(std::forward<F>(f), std::forward<Tuple1>(t1), std::forward<Tuple2>(t2), std::make_index_sequence<std::tuple_size<std::remove_reference_t<Tuple1>>::value>{}); }
Passing the contents of a variadic pack as arguments to a function
apply2_impl
is a function that unwraps the contents of the tuples and pass them as parameters to f
:
template <class F, class Tuple1, class Tuple2, std::size_t... I> F apply2_impl(F&& f, Tuple1&& t1, Tuple2&& t2, std::index_sequence<I...>) { return (void)std::initializer_list<int>{(std::forward<F>(f)(std::get<I>(std::forward<Tuple1>(t1)), std::get<I>(std::forward<Tuple2>(t2))),0)...}, f; }
I reckon it is Sean Parent who came up with the technique of passing the contents of a variadic pack as arguments to a function without C++17. The above adapts that technique to a function that takes two parameters.
If you’re not familiar with variadic templates, I realize the above code must look not very different from this:
And it’s OK. You don’t need to understand those details to get the general meaning of the unzip
iterator, and to use it. However, this manipulation of compile time collections is an interesting topic, and we’ll dive into it in a later post with more explanations.
Anyway, the body of operator=
for our unzip
iterator is now:
output_unzip_iterator& operator=(std::tuple<Ts...> const& values) { apply2([](auto&& value, auto&& iterator){ *iterator = value; }, values, iterators_); return *this; }
One last thing to implement is the increment operator: operator++
. Here we make it forward the increment to its underlying iterators. So we need to apply a function that calls ++ on each element of the tuple. We could use std::apply
in C++17, and in C++14 we can resort to an implementation with the technique we saw before:
template <class F, class Tuple, std::size_t... I> F apply_impl(F&& f, Tuple&& t, std::index_sequence<I...>) { return (void)std::initializer_list<int>{(std::forward<F>(f)(std::get<I>(std::forward<Tuple>(t))),0)...}, f; } template <class F, class Tuple> constexpr decltype(auto) apply(F&& f, Tuple&& t) { return apply_impl(std::forward<F>(f), std::forward<Tuple>(t), std::make_index_sequence<std::tuple_size<std::remove_reference_t<Tuple>>::value>{}); }
And we use it this way:
output_unzip_iterator& operator++() { detail::apply([](auto&& iterator){ ++iterator; }, iterators_); return *this; } output_unzip_iterator& operator++(int){ ++*this; return *this; }
Finally let’s not forget the aliases for iterators:
using iterator_category = std::output_iterator_tag; using value_type = void; using difference_type = void; using pointer = void; using reference = void;
And the actual unzip
function that instantiates the iterator:
template<typename... Iterators> output_unzip_iterator<Iterators...> unzip(Iterators... iterators) { return output_unzip_iterator<Iterators...>(iterators...); }
And we’re good to go.
Unzipping pairs and tuples
Let’s now test our new iterator!
Our first use case was breaking down a collection of pairs into a collection of keys and a collection of values, and apply a function on values:
std::map<int, std::string> entries = { {1, "one"}, {2, "two"}, {3, "three"}, {4, "four"}, {5, "five"} }; std::vector<int> keys; std::vector<std::string> values; auto const toUpper = fluent::output::transform(toUpperString); std::copy(begin(entries), end(entries), unzip(back_inserter(keys), toUpper(back_inserter(values))));
When we output the contents of keys
we now get:
1 2 3 4 5
And when we output the contents of values
we get:
ONE TWO THREE FOUR FIVE
And our second case was using tuples, to breaks a collection of lines into a collection of columns:
std::vector<std::tuple<int, int, int>> lines = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12} }; std::vector<int> column1, column2, column3; std::copy(begin(lines), end(lines), unzip(back_inserter(column1), back_inserter(column2), back_inserter(column3)));
When we output the contents of column1
we get:
1 4 7 10
The outputs of column2
give:
2 5 8 11
And those of column3
are:
3 6 9 12
If you want to have a closer look at the code, you can check out the smart output iterators library, the implementation of the unzip
iterator, and the tests associated to it.
Related articles
Don't want to miss out ? Follow:   Share this post!