Dealing with Multiple Paths with the Vector Monad in C++
After having explored how to handle multiple errors with the optional monad in C++, let’s take inspiration again from the functional programming world, and see our familiar std::vector from a very unusual perspective.
Although this is an application of the concept of monads, we will focus on how to write code in C++, and not on how to understand what monads are. Indeed, monads seem to be very difficult to explain. However, by reading this post you may accidentally understand monads better (I certainly did), and I won’t do anything against it 🙂
This particular post shows a rather simple implementation that leads to… not the best code possible. However, the next post will show a more sophisticated approach, leading to very straightforward code.
I chose this structure in order to introduce all the involved concepts gradually. There is a lot to take in, and if you tried to wrap your head around all this at the same time, I am afraid your head would end up tied in a knot. And I wouldn’t want to be responsible for that!
Finally, to render unto David the things that are David’s, let me mention that I came across the ideas in this post by watching this excellent talk from David Sankel. He presents a lot of interesting ideas in it, and I think it is worth delving deeper into some of them, which is our purpose today.
So many outcomes
The technique we explore here applies to functions that return several outputs, in the form of a collection.
For our example let’s use the three following functions:
std::vector<int> f1(int a);
std::vector<int> f2(int b, int c);
std::vector<int> f3(int d);
These functions correspond to several steps in a given computation. The fact that they return a collection may represent the idea that several values can come out of a function, for one given set of parameters. For example, various calibration parameters could be at play within the functions, and each calibration parameter would lead to a different result from the same input.
The purpose here is to take a given input, and work out all the possible outcomes that would be produced by calling these functions successively.
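To make the rest of the post concrete, here is a minimal sketch of what such functions could look like. The bodies are assumptions made up purely for illustration (the post only fixes the signatures), and so is the checkSampleFunctions helper, which merely documents what these sample bodies return.

```cpp
#include <cassert>
#include <vector>

// Hypothetical bodies: only the signatures come from the post.

// Two possible "calibrations" of the same input.
std::vector<int> f1(int a) { return {a, a + 10}; }

// Two ways of combining a pair of values.
std::vector<int> f2(int b, int c) { return {b + c, b * c}; }

// A single outcome, still returned as a collection.
std::vector<int> f3(int d) { return {d * 2}; }

// Hypothetical helper documenting what the sample bodies return.
void checkSampleFunctions()
{
    assert((f1(1) == std::vector<int>{1, 11}));
    assert((f2(1, 2) == std::vector<int>{3, 2}));
    assert((f3(3) == std::vector<int>{6}));
}
```

Any other bodies would do just as well: what matters is that each function takes plain values and returns a vector of possible outcomes.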
Let’s write a first attempt that feeds the collection results with all the results coming out of the functions:
std::vector<int> results;

std::vector<int> b = f1(1);
std::vector<int> c = f1(2);
for (int bElement : b)
{
    for (int cElement : c)
    {
        std::vector<int> d = f2(bElement, cElement);
        for (int dElement : d)
        {
            auto e = f3(dElement);
            std::copy(e.begin(), e.end(), std::back_inserter(results));
        }
    }
}
The above code does the job: every pair of elements coming out of the two calls to f1 is passed to f2, and each of the elements coming out of f2 for each of those pairs is passed to f3.
But this piece of code is bulky, cumbersome, and you can easily imagine that it doesn’t get better when more than three functions are involved in the process.
The vector monad
In fact, the above piece of code would get under some control if we could encapsulate the vector traversals. And this is exactly what the technique of the vector monad aims at doing.
What we need to encapsulate is the passing of the value returned from a function (which is a vector) to the next function, which takes an element and returns a vector. So let’s encapsulate this in a function taking these two things: the vector and the next function. To chain several functions we use an operator rather than a plain function. And we choose operator>>= because it is rarely used in C++ and also because it happens to be the one used in Haskell when dealing with monads.
Once again, this is not the best we can achieve in C++ yet, but let’s start with a simple (kind of) approach to get our feet wet, particularly for those not familiar with functional programming.
Here is the code:
template<typename T, typename TtoVectorU>
auto operator>>=(std::vector<T> const& ts, TtoVectorU f) -> decltype(f(ts.front()))
{
    decltype(f(ts.front())) us;
    for (T const& t : ts)
    {
        auto ft = f(t);
        std::copy(ft.begin(), ft.end(), std::back_inserter(us));
    }
    return us;
}
TtoVectorU represents a callable type (such as a function or function object) that can be passed a T and returns a std::vector<U>, which is just what we have in our example (with T and U both being int).
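Before chaining anything, it may help to see the operator applied to a single step. The following is only a sketch: it repeats the operator so that the snippet compiles on its own, and twoVariants, singleStep and checkSingleStep are hypothetical names invented for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Same operator as in the post, repeated to keep the snippet self-contained.
template<typename T, typename TtoVectorU>
auto operator>>=(std::vector<T> const& ts, TtoVectorU f) -> decltype(f(ts.front()))
{
    decltype(f(ts.front())) us;
    for (T const& t : ts)
    {
        auto ft = f(t);
        std::copy(ft.begin(), ft.end(), std::back_inserter(us));
    }
    return us;
}

// Hypothetical function: maps one value to two outcomes.
std::vector<int> twoVariants(int x) { return {x, x * 10}; }

// Applies twoVariants to every element and concatenates the results.
std::vector<int> singleStep()
{
    std::vector<int> const inputs = {1, 2, 3};
    return inputs >>= twoVariants;
}

// The outcomes of each input end up side by side in one flat vector.
void checkSingleStep()
{
    assert((singleStep() == std::vector<int>{1, 10, 2, 20, 3, 30}));
}
```

The operator hides the loop and the concatenation: the caller only says which function comes next.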
The trick now is not to pass the next function directly, but rather a lambda that does two things:
- calling the next function, and
- pursuing the chain by calling another lambda.
And here is what the resulting code looks like:
std::vector<int> results =
    f1(1) >>= [=](int b) { return
    f1(2) >>= [=](int c) { return
    f2(b, c) >>= [=](int d) { return
    f3(d);
    };};};
This code gives the same result as the previous one, but we see that it scales better as functions are added. While the first attempt indented deeper and deeper, and repeated vector traversals, this one only shows a chain of operations. And this is exactly what the initial problem was: a chain of operations.
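To check that claim, here is a small sketch that puts the two versions side by side. The bodies of f1, f2 and f3 are assumptions invented for the comparison, as are the names allOutcomesWithLoops, allOutcomesWithBind and checkBothVersionsAgree; only the operator and the two calling styles come from the post.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical bodies, made up for this comparison.
std::vector<int> f1(int a) { return {a, a + 10}; }
std::vector<int> f2(int b, int c) { return {b + c, b * c}; }
std::vector<int> f3(int d) { return {d * 2}; }

// The operator from the post.
template<typename T, typename TtoVectorU>
auto operator>>=(std::vector<T> const& ts, TtoVectorU f) -> decltype(f(ts.front()))
{
    decltype(f(ts.front())) us;
    for (T const& t : ts)
    {
        auto ft = f(t);
        std::copy(ft.begin(), ft.end(), std::back_inserter(us));
    }
    return us;
}

// First attempt: explicit nested loops.
std::vector<int> allOutcomesWithLoops()
{
    std::vector<int> results;
    for (int b : f1(1))
        for (int c : f1(2))
            for (int d : f2(b, c))
            {
                auto e = f3(d);
                std::copy(e.begin(), e.end(), std::back_inserter(results));
            }
    return results;
}

// Second attempt: the same computation as a chain of >>=.
std::vector<int> allOutcomesWithBind()
{
    return
        f1(1) >>= [=](int b) { return
        f1(2) >>= [=](int c) { return
        f2(b, c) >>= [=](int d) { return
        f3(d);
        };};};
}

// Both versions visit the combinations in the same order.
void checkBothVersionsAgree()
{
    assert(allOutcomesWithLoops() == allOutcomesWithBind());
}
```

With these sample bodies, both functions produce the same eight values in the same order, since the chain of lambdas explores the combinations exactly as the nested loops do.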
Stay tuned for more on this, with a more sophisticated implementation using our friends the ranges, and leading to a much cleaner calling code.
Related articles:
- Multiple error handling with the optional monad in C++
- The optional monad in C++, without the ugly stuff