How to Check If 2 Sorted Collections Have a Common Element
Ah, the algorithms on sets! Such beautiful algorithms, and so useful too.
The algorithms on sets are basically the algorithms that take sorted collections and compare them in linear time. The STL offers five algorithms on sets: std::set_difference
, std::set_intersection
, std::set_union
, std::set_symmetric_difference
, and std::includes
.
If you’re a C++ developer, you absolutely, positively, unquestionably need to know your algorithms on sets.
You need to know the algorithms on sets of the STL, but it’s also beneficial to understand how they’re implemented. This lets us create new algorithms on sets.
Indeed, what the STL offers is a good start, but there are many more things we could do on sets to make our everyday coding tasks easier, and that is not in the STL.
In particular, if you’d like to know whether two given sorted collections have an element in common, you’re pretty much left stranded. You could perform a set::intersection
and check whether the output is empty or not, but that sounds like a lot of unnecessary work.
To this end, let’s see how to implement share_element
, an algorithm that takes two sorted collection and returns a boolean indicating whether they have an element in common.
Thanks to Fluent C++ subscriber Kai-Moritz Kumkar for bringing up the need for share_element
!
This post is part of the series on sets:
- How to check if 2 sorted collections have a common element
set_match
: Matching up Elements Between Sorted Collections- Implementing
set_match
in One Line of Code - STL algorithms on sets: one algorithm to implement them all
- Algorithms on set returning a boolean: exploring the algorithms
- Algorithms on set returning a boolean: implementing a generic algorithm
- Algorithms on set returning a boolean: a strong template interface
- NWH: Adapting algoritms on sets
A generic algorithm for comparing sets
What we call “sets” here is sorted collections. This includes std::set
s, but also sorted std::vector
s for example.
All algorithms that compare sets have the same type of implementation: iterate in set 1 while encountering elements that are smaller that the first one of set 2. Then iterate on set 2 while encountering elements that are smaller than the one we stopped at in set 1. Then iterate in set 1 again, and so on. And during those iterations, extract the information you need: for set_difference
, that would be the elements are only in set 1 for example.
I made a video to illustrate this kind of algorithm, you can check it out here.
This algorithm takes advantage of the facts that the two collections are sorted, which gives it a linear complexity (size1 + size2). If the collections were not sorted we’d have to check the whole collection 2 for each element of the collection 1, which would give a quadratic complexity (size1 * size2).
Some time ago we saw a generic algorithm on sets: set_segregrate
. set_segregrate
takes two sorted collections, and outputs three: the elements that are only in the first sorted collection, the elements that are only in the second one and the elements that are in both:
To implement set_shared_element
, we can get inspired from the code of set_segregate
. Indeed, for share_element
we’re interested in identifying if there is anything in what set_segregate
would output in the “Both” result.
Here is the implementation of set_segregate
. The line highlighted in blue is the one where the algorithm outputs results in “Both”:
template<class SetA, class SetB, class OutputOnlyA, class OutputBoth, class OutputOnlyB, class Compare, class AddToBoth> void set_segregate_impl(SetA&& setA, SetB&& setB, OutputOnlyA&& onlyA, OutputBoth&& both, OutputOnlyB&& onlyB, Compare comp, AddToBoth addToBoth) { auto xA = setA.begin(); auto xB = setB.begin(); while (xA != setA.end() && xB != setB.end()) { if (comp(*xA, *xB)) { *onlyA++ = *xA++; } else if (comp(*xB, *xA)) { *onlyB++ = *xB++; } else { *both++ = addToBoth(*xA++, *xB++); } } std::copy(xA, end(setA), onlyA); std::copy(xB, end(setB), onlyB); }
share_element
We can adapt this code for our purpose. Indeed, it does much more than what we need for share_element
. We can trim it down by making it return a bool
, replace the place where it fills the “Both” collection with a return true
, and those where it didn’t find anything in common with return false
:
We can then reorder this code to simplify it:
template<class SetA, class SetB, typename Compare> bool share_element(SetA&& setA, SetB&& setB, Compare comp) { auto xA = setA.begin(); auto xB = setB.begin(); while (xA != setA.end() && xB != setB.end()) { if (comp(*xA, *xB)) { ++xA; } else if (comp(*xB, *xA)) { ++xB; } else { return true; } } return false; }
That’s about it for the logic of the algorithm.
Comparing with operator<
by default
In the above code we’ve used a generic comparator, defined by the template parameter Compare
. But often there is a natural way to compare elements: using operator<
. Like STL algorithms, let’s provide a second overload of share_element
, that uses operator<
for comparisons:
template<class LeftRange, class RightRange> bool share_element(LeftRange const& leftRange, RightRange const& rightRange) { return share_element(leftRange, rightRange, std::less<>{}); }
This overload relies on the magic of std::less<>
.
Better than code inspiration, code reuse
Many algorithms on sets, including the STL’s set_difference
, set_union
, set_intersection
and set_symmetric_difference
can be implemented with set_segregate
.
On the other hand, we didn’t implement share_element
with set_segregate
. We only got inspired by its code. Is there an even more generic algorithm than set_segregate
, that both set_segregate
and share_element
could reuse for their implementation?
A first step in this direction is to have a generic algorithm that performs checks on sets, returning a boolean. Indeed, like share_element
, std::includes
also returns a bool
and is not implementable with set_segregate
.
Maybe there is a counterpart to set_segregate
for performing checks on collections, that std::includes
and share_element
could reuse in their implementations, and leading to new algorithms?
This is what we explore in future posts. In the meantime if you have an opinion on this, please let me know in the comments section. And if you’d like to contribute to the research on such topics, consider becoming a Patron of Fluent C++!
Stay tuned!
You will also like
- STL algorithms on sets
- How std::set_difference is implemented (video)
set_aggregate
,set_segregate
: higher-level algorithms on sets
Share this post!