Find with Custom Returns

Published May 7, 2021

Some STL algorithms have a default behaviour and also accept a custom value in order to have a custom behaviour.

For example, std::sort orders the elements of a collection based on comparisons with operator< by default, but it also accepts a custom function to perform comparisons:

std::sort(begin(v), end(v), std::greater{}); // sorts v in descending order

This is the main customisation point of algorithms. In particular, STL algorithms don’t allow to customise their return value or return type.

Arno Schödl from think-cell shared with me a presentation he made, where he talks about iterators, ranges, and the ranges library of his company.

Among the interesting ideas in this presentation, one struck me in particular: flexible algorithms returns. They allow to write more expressive code, and Arno illustrates this technique with the find algorithm.

The STL find: iterator or end

When you think about it, find has a strange name. Indeed, find doesn’t guarantee that it will find what you looking for. The only thing it guarantees is that it will give it a go.

If it does find the value you’re looking for, it returns the iterator pointing to it. Otherwise it returns the end of the range you passed in:

auto position42 = std::find(begin(v), end(v), 42);
if (position42 != end(v))
{
    // code using *position42 ...

find could have been called try_to_find, or in better English search. It happens that search is another algorithm, but that’s a completely different story.

Inserting a customisation point

Here is the code of find. This is a modern find, like the one coming with C++20’s ranges. It doesn’t take a begin and an end, but rather a range. But in essence, all the ideas here could work with a find that takes a begin and an end:

template<typename InputRange, typename Value>
decltype(auto) find(InputRange&& range, Value const& value)
{
    for(auto it = begin(range); it != end(range); ++it)
    {
        if (*it == value) return it;
    }
    return end(range);
}

Note that the above snippets omits it for clarity, but we should declare the end iterator in a separate statement so that we don’t have to recompute it every time in the loop:

template<typename InputRange, typename Value>
decltype(auto) find(InputRange&& range, Value const& value)
{
    auto itEnd = end(range);
    for(auto it = begin(range); it != itEnd; ++it)
    {
        if (*it == value) return it;
    }
    return itEnd;
}

Following Arno’s idea, we introduce a customisation point in find, so that we can make it return more elaborate return types and values.

To do that let’s introduce an indirection, with a policy in charge of returning a value out of find:

template<typename ReturnPolicy, typename InputRange, typename Value>
decltype(auto) find(InputRange&& range, Value const& value)
{
    for(auto it = begin(range); it != end(range); ++it)
    {
        if (*it == value) return ReturnPolicy::onFound(it, range);
    }
    return ReturnPolicy::onNotFound(range);
}

A policy is essentially one aspect of the function that can be customised. For a lot more about the important topic of policies, check out Andrei Alexandrescu’s famous book Modern C++ Design (my favourite C++ book).

Here we allow the caller of find to pass in a template parameters containing specific behaviour for the types and values returned. find passes all the information it has to this policy: the current iterator and the range.

As a first step, let’s pass in a policy that does the same thing as the standard find: return an iterator if the value is found, return the end otherwise:

struct IteratorOrEnd
{
    template<typename Iterator, typename Range>
    static auto onFound(Iterator&& iterator, Range&&)
    {
        return iterator;
    }

    template<typename Range>
    static auto onNotFound(Range&& range)
    {
        return end(range);
    }
};

Now the standard find is equivalent to calling our find with IteratorOrEnd:

auto position42 = find<IteratorOrEnd>(v, 42);
if (position42 != end(v))
{
    // code using *position42 ...

Note that the compiler deduces the template parameters following ReturnPolicy. We only have to specify the ReturnPolicy, which is nice.

With this indirection in place, we can now make find return other results, without changing the code of the algorithm itself.

Checking with optional

Checking against the end of the collection is only one possible way to check if the value was found. A similar approach but with a slightly different interface is to make find return an optional.

We can achieve that with this policy:

struct OptionalIterator
{
    template<typename Iterator, typename Range>
    static auto onFound(Iterator&& iterator, Range&&)
    {
        return std::make_optional(iterator);
    }

    template<typename Range>
    static auto onNotFound(Range&&)
    {
        return std::optional<decltype(begin(std::declval<Range>()))>{std::nullopt};
    }
};

The reason why we don’t just return std::nullopt in onNotFound is that we need to specify the type inside the optional. std::nullopt in itself is not enough for the compiler to deduce the type of the optional, because all optionals use std::nullopt.

So we work out the type of the iterator based on the type of the range: it is the type resulting of calling begin on an instantiation of the Range.

With this policy, we no longer have to compare the return of find with the end of the collection:

auto position42 = find<OptionalIterator>(v, 42);
if (position42)
{
    // code using **position42 ...

Not checking at all

Now if you know for sure that the element is in the collection, you can express this by writing that you expect find to return a valid iterator.

In the case this doesn’t happen, we can for example use an assert or throw an exception:

struct ValidIterator
{
    template<typename Iterator, typename Range>
    static auto onFound(Iterator&& iterator, Range&&)
    {
        return iterator;
    }

    template<typename Range>
    static auto onNotFound(Range&& range)
    {
        assert(false);
        return end(range);
    }
};

At call site, the code would look like this:

auto position42 = find<ValidIterator>(v, 42);
// code using *position42...

Returning more than an iterator

One of the examples in Arno’s presentation is to return more than an iterator. For example a view on the whole range from its first element up to the element corresponding to the found value.

The policy to achieve that looks like this:

struct ReturnHead
{
    template<typename Iterator, typename Range>
    static auto onFound(Iterator&& iterator, Range&& range)
    {
        return tc::take(std::forward<decltype(range)>(range), iterator);
    }

    template<typename Range>
    static auto onNotFound(Range&& range)
    {
        return tc::take(std::forward<decltype(range)>(range), ranges::begin(range));
    }
};

The above code uses think-cell’s ranges library and not the standard ones, I think because it is tricky to deal with forwarding references of ranges with the standard library. The standard ranges adaptors only accept lvalues. think-cell ranges also accept rvalues and can move in the contents of the rvalues.