Mux: Zip Without Tuples
C++ offers many ways to operate on the elements of a collection.
But what about operating on the elements of two collections?
There is an STL algorithm that can take two collections: std::transform
. For example, if we want to multiply the respective elements of two collections we can use std::transform
like this:
auto const inputs1 = std::vector<int>{1, 2, 3, 4, 5}; auto const inputs2 = std::set<int>{10, 20, 30, 40, 50}; auto results = std::vector<int>{}; std::transform(begin(inputs1), end(inputs1), begin(inputs2), back_inserter(results), std::multiplies{});
And since C++17, std::transform
can also take 3 input collections.
But if we want to compose several steps in the algorithms, for example by multiplying only the elements whose sum is smaller than 42, then we can no longer use STL algorithms conveniently.
Back to the good old for loop:
auto const inputs1 = std::vector<int>{1, 2, 3, 4, 5}; auto const inputs2 = std::set<int>{10, 20, 30, 40, 50}; auto results = std::vector<int>{}; auto input1 = begin(inputs1); auto input2 = begin(inputs2); for (; input1 != end(inputs1) && input2 != end(inputs2); ++input1, ++input2) { if (*input1 + *input2 < 41) { results.push_back(*input1 * *input2); } }
Note that this code performs a check that we don’t access elements beyond the end of inputs2
(which std::transform
doesn’t). But apart from this advantage, it’s pretty ugly towards the beginning of the loop.
We need a better way.
zip
The logical next step after STL algorithms is range views. What do ranges have to offer when it comes to manipulating several collections?
One view that range-v3 offers (but that is not planned for C++20) is view::zip
. view::zip
takes any number of collections, and presents a view of std::tuple
s containing the elements of this collection.
We can then combine view::zip
with any other view. In our case we’ll use view::filter
and view::transform
:
auto const inputs1 = std::vector<int>{1, 2, 3, 4, 5}; auto const inputs2 = std::set<int>{10, 20, 30, 40, 50}; std::vector<int> results = ranges::view::zip(inputs1, inputs2) | ranges::view::filter([](std::tuple<int, int> const& values){ return std::get<0>(values) + std::get<1>(values) < 41; }) | ranges::view::transform([](std::tuple<int, int> const& values){ return std::get<0>(values) * std::get<1>(values); });
I’ve written out the tuple types to make it clear that tuples are being passed around, but we could hide them with auto
:
std::vector<int> results = ranges::view::zip(inputs1, inputs2) | ranges::view::filter([](auto&& values){ return std::get<0>(values) + std::get<1>(values) < 41; }) | ranges::view::transform([](auto&& values){ return std::get<0>(values) * std::get<1>(values); });
This using of auto
in lambdas is in C++14, but the ranges library requires C++14 anyway.
In C++17, we can also use structured bindings instead of std::get
. This adds a statement in the lambda but that could look nicer:
auto const inputs1 = std::vector<int>{1, 2, 3, 4, 5}; auto const inputs2 = std::set<int>{10, 20, 30, 40, 50}; std::vector<int> results = ranges::view::zip(inputs1, inputs2) | ranges::view::filter([](auto&& values){ auto const& [a,b] = values; return a + b < 41; }) | ranges::view::transform([](auto&& values){ auto const& [a,b] = values; return a * b; });
Why do ranges require tuples, to begin with?
Correct me if I’m wrong, but my understanding is that it’s because zip
simulates a range of assembled elements from the two input collections. And in that range, the assembled element can’t be floating about in the air, they need to be stored in something. They’re represented as tuples.
Still, it would be nice not to have to use tuples at all. mux
allows that.
mux
mux
is a new component of the pipes library. It takes several collections, traverses them and sends their respective elements to the next pipe in the pipeline.
It can be represented like this:
With the corresponding code:
auto const input1 = std::vector<int>{1, 2, 3, 4, 5}; auto const input2 = std::vector<int>{10, 20, 30, 40, 50}; auto results = std::vector<int>{}; pipes::mux(input1, input2) >>= pipes::filter([](int a, int b){ return a + b < 41; }) >>= pipes::transform(std::multiplies{}) >>= pipes::push_back(results);
As you can see no tuples are used.
Why doesn’t mux
have to use tuples? It comes from the design of the pipes library. Contrary to ranges, pipes don’t simulate ranges of assembled elements. They send data on to the next pipe. Therefore mux
sends the respective elements to the next pipe, as in a function call. No need for a tuple.
Ranges and pipes have different designs. It’s not that one is better or worse, they’re just different. This allows them to do well different things.
How mux
works
mux
it itself is a pretty dumb function: it merely assembles several ranges together:
template<typename... Ranges> struct mux_ranges { std::tuple<Ranges const&...> inputs; explicit mux_ranges(Ranges const&... inputs) : inputs(inputs...) {} }; template<typename... Ranges> auto mux(Ranges&&... ranges) { static_assert(sizeof...(Ranges) > 0, "There should be at least one range in mux."); return mux_ranges<std::decay_t<Ranges>...>(FWD(ranges)...); }
The part containing the logic is operator>>=
. Here is its implementation:
template<typename... Ranges, typename Pipeline, detail::IsAPipeline<Pipeline> = true> void operator>>= (mux_ranges<Ranges...> muxRanges, Pipeline&& pipeline) { auto const beginIterators = detail::transform(muxRanges.ranges, [](auto&& range){ return begin(range); }); auto const endIterators = detail::transform(muxRanges.ranges, [](auto&& range){ return end(range); }); for(auto iterators = beginIterators; !detail::match_on_any(iterators, endIterators); detail::increment(iterators)) { sendTupleValues(detail::dereference(iterators), pipeline); } }
Let’s analyse this code line by line:
auto const beginIterators = detail::transform(muxRanges.ranges, [](auto&& range){ return begin(range); }); auto const endIterators = detail::transform(muxRanges.ranges, [](auto&& range){ return end(range); });
We use the algorithm on tuples transform
to create a tuple of begin
and a tuple of end
iterators out of the incoming tuple of ranges.
for(auto iterators = beginIterators;
We create a tuple of iterators all initialised at the beginning of each of the incoming ranges.
!detail::match_on_any(iterators, endIterators);
We want to stop iterating over the incoming ranges as soon as one of them has reached its end.
Here is the implementation of match_on_any
:
template<typename... Ts> bool match_on_any(std::tuple<Ts...> const& tuple1, std::tuple<Ts...> const& tuple2) { auto matchOnAny = false; detail::for_each2(tuple1, tuple2, [&matchOnAny](auto&& element1, auto&& element2) { if (!matchOnAny && element1 == element2) { matchOnAny = true; } }); return matchOnAny; }
If you know the algorithm on tuple for_each2
, this code is pretty straightforward. It iterates over two tuples and checks whether they have at least one element in common.
Back to the implementation of operator>>=
:
detail::increment(iterators))
We increment each iterator, by using the simple for_each
this time:
template<typename... Ts> void increment(std::tuple<Ts...>& tuple) { for_each(tuple, [](auto&& element){ ++element; }); }
And finally:
{ sendTupleValues(detail::dereference(iterators), pipeline); }
There are two function at play here. The first one is dereference
, which is just a call to operator*
on each iterator of the tuple:
template<typename... Ts> auto dereference(std::tuple<Ts...> const& tuple) { return transform(tuple, [](auto&& element) -> decltype(auto) { return *element; }); }
And the second one is sendTupleValues
, which sends all the values in a tuple to a pipeline:
namespace detail { template<typename... Ts, typename Pipeline, size_t... Is> void sendTupleValues(std::tuple<Ts...> const& tuple, Pipeline& pipeline, std::index_sequence<Is...>) { send(std::get<Is>(tuple)..., pipeline); } } template<typename... Ts, typename Pipeline> void sendTupleValues(std::tuple<Ts...> const& tuple, Pipeline& pipeline) { detail::sendTupleValues(tuple, pipeline, std::make_index_sequence<sizeof...(Ts)>{}); }
Making pipes accept several values
Before mux
entered the library, the pipes such as filter
and transform
could only accept one value:
template<typename Predicate> class filter_pipe : public pipe_base { public: template<typename Value, typename TailPipeline> void onReceive(Value&& value, TailPipeline&& tailPipeline) { if (predicate_(value)) { send(FWD(value)..., tailPipeline); } } // rest of filter...
To be compatible with mux
, they need now to handle several values, by using variadic templates:
template<typename Predicate> class filter_pipe : public pipe_base { public: template<typename... Values, typename TailPipeline> void onReceive(Values&&... values, TailPipeline&& tailPipeline) { if (predicate_(values...)) { send(FWD(values)..., tailPipeline); } } // rest of filter...
Operating on several collections
mux
allows to work on several collections without using tuples. But it covers the most basic use case: putting together several collections and working on the paired elements.
But we can go further in this direction. For example by generating all possible combinations of elements of the input collections. This is what we’ll see in a future post, with cartesian_product
.
In the meantime, all your feedback is welcome on mux
! What do think about mux
? What would you change?
You will also like
- Introduction to the C++ Ranges Library
- Ranges: the STL to the Next Level
- The Surprising Limitations of C++ Ranges Beyond Trivial Cases
- The Demux Pipe
- Smart Output Iterators >>= become(Pipes)
Share this post!