Find with Custom Returns
Some STL algorithms have a default behaviour and also accept a custom value in order to have a custom behaviour.
For example, std::sort
orders the elements of a collection based on comparisons with operator<
by default, but it also accepts a custom function to perform comparisons:
std::sort(begin(v), end(v), std::greater{}); // sorts v in descending order
This is the main customisation point of algorithms. In particular, STL algorithms don’t allow to customise their return value or return type.
Arno Schödl from think-cell shared with me a presentation he made, where he talks about iterators, ranges, and the ranges library of his company.
Among the interesting ideas in this presentation, one struck me in particular: flexible algorithms returns. They allow to write more expressive code, and Arno illustrates this technique with the find
algorithm.
The STL find: iterator or end
When you think about it, find
has a strange name. Indeed, find
doesn’t guarantee that it will find what you looking for. The only thing it guarantees is that it will give it a go.
If it does find the value you’re looking for, it returns the iterator pointing to it. Otherwise it returns the end of the range you passed in:
auto position42 = std::find(begin(v), end(v), 42); if (position42 != end(v)) { // code using *position42 ...
find
could have been called try_to_find
, or in better English search
. It happens that search
is another algorithm, but that’s a completely different story.
Inserting a customisation point
Here is the code of find
. This is a modern find
, like the one coming with C++20’s ranges. It doesn’t take a begin and an end, but rather a range. But in essence, all the ideas here could work with a find
that takes a begin and an end:
template<typename InputRange, typename Value> decltype(auto) find(InputRange&& range, Value const& value) { for(auto it = begin(range); it != end(range); ++it) { if (*it == value) return it; } return end(range); }
Note that the above snippets omits it for clarity, but we should declare the end iterator in a separate statement so that we don’t have to recompute it every time in the loop:
template<typename InputRange, typename Value> decltype(auto) find(InputRange&& range, Value const& value) { auto itEnd = end(range); for(auto it = begin(range); it != itEnd; ++it) { if (*it == value) return it; } return itEnd; }
Following Arno’s idea, we introduce a customisation point in find
, so that we can make it return more elaborate return types and values.
To do that let’s introduce an indirection, with a policy in charge of returning a value out of find
:
template<typename ReturnPolicy, typename InputRange, typename Value> decltype(auto) find(InputRange&& range, Value const& value) { for(auto it = begin(range); it != end(range); ++it) { if (*it == value) return ReturnPolicy::onFound(it, range); } return ReturnPolicy::onNotFound(range); }
A policy is essentially one aspect of the function that can be customised. For a lot more about the important topic of policies, check out Andrei Alexandrescu’s famous book Modern C++ Design (my favourite C++ book).
Here we allow the caller of find
to pass in a template parameters containing specific behaviour for the types and values returned. find
passes all the information it has to this policy: the current iterator and the range.
As a first step, let’s pass in a policy that does the same thing as the standard find
: return an iterator if the value is found, return the end otherwise:
struct IteratorOrEnd { template<typename Iterator, typename Range> static auto onFound(Iterator&& iterator, Range&&) { return iterator; } template<typename Range> static auto onNotFound(Range&& range) { return end(range); } };
Now the standard find
is equivalent to calling our find
with IteratorOrEnd
:
auto position42 = find<IteratorOrEnd>(v, 42); if (position42 != end(v)) { // code using *position42 ...
Note that the compiler deduces the template parameters following ReturnPolicy
. We only have to specify the ReturnPolicy
, which is nice.
With this indirection in place, we can now make find
return other results, without changing the code of the algorithm itself.
Checking with optional
Checking against the end of the collection is only one possible way to check if the value was found. A similar approach but with a slightly different interface is to make find
return an optional.
We can achieve that with this policy:
struct OptionalIterator { template<typename Iterator, typename Range> static auto onFound(Iterator&& iterator, Range&&) { return std::make_optional(iterator); } template<typename Range> static auto onNotFound(Range&&) { return std::optional<decltype(begin(std::declval<Range>()))>{std::nullopt}; } };
The reason why we don’t just return std::nullopt
in onNotFound
is that we need to specify the type inside the optional. std::nullopt
in itself is not enough for the compiler to deduce the type of the optional, because all optionals use std::nullopt
.
So we work out the type of the iterator based on the type of the range: it is the type resulting of calling begin
on an instantiation of the Range.
With this policy, we no longer have to compare the return of find
with the end of the collection:
auto position42 = find<OptionalIterator>(v, 42); if (position42) { // code using **position42 ...
Not checking at all
Now if you know for sure that the element is in the collection, you can express this by writing that you expect find
to return a valid iterator.
In the case this doesn’t happen, we can for example use an assert or throw an exception:
struct ValidIterator { template<typename Iterator, typename Range> static auto onFound(Iterator&& iterator, Range&&) { return iterator; } template<typename Range> static auto onNotFound(Range&& range) { assert(false); return end(range); } };
At call site, the code would look like this:
auto position42 = find<ValidIterator>(v, 42); // code using *position42...
Returning more than an iterator
One of the examples in Arno’s presentation is to return more than an iterator. For example a view on the whole range from its first element up to the element corresponding to the found value.
The policy to achieve that looks like this:
struct ReturnHead { template<typename Iterator, typename Range> static auto onFound(Iterator&& iterator, Range&& range) { return tc::take(std::forward<decltype(range)>(range), iterator); } template<typename Range> static auto onNotFound(Range&& range) { return tc::take(std::forward<decltype(range)>(range), ranges::begin(range)); } };
The above code uses think-cell’s ranges library and not the standard ones, I think because it is tricky to deal with forwarding references of ranges with the standard library. The standard ranges adaptors only accept lvalues. think-cell ranges also accept rvalues and can move in the contents of the rvalues.
Other custom policies
In general, policies are a powerful tool to write generic code. What do you think of this kind of return type policies?
Do you see other useful policies for the find
algorithm? For other algorithms?
Boost ranges also offer some customisations on the return types, that would be interesting to explore in a future post.
You will also like
- What Books to Read to Get Better In C++
- RestMyCase: A C++ Library for Formatting String Cases
- On Using Guards in C++
- Variadic CRTP: An Opt-in for Class Features, at Compile Time
- Inserting Values to a Map with Boost.Assign
Share this post!