Which One Is Better: Map of Vectors, or Multimap?
While advising on how to make code more expressive on the SFME project, I came across an interesting case of choosing the right data structure, which I’ll share with you with the permission of the authors of the projects.
We had to associate a key with several values, and perform various operations. Should we use a map of vectors, or is a multimap more appropriate? Let’s see the case in more details, and compare the two solutions.
The case: an event mediator
The interface for this event system has three functions:
1- void subscribe(EventReceiver const& receiver, EventID eventID)
This is the method to register a receiver to a certain type of event. When this type of event occurs, the event manager notifies the EventReceiver
on its (virtual) method reactTo
.
2- void emit(Event const& event) const
This method gets called by the sender of an event when an event occurs. The method calls the reactTo
method of all the clients that registered for its event ID.
3- bool isRegistered(EventReceiver const& receiver) const
At any time, we can query the event manager to know whether a given EventReceiver
has subscribed to the it (on any event).
(Note that this is a simplified version of the spec for SFME, so that we can focus on the data structure without spending more time understanding the rest of the components).
Given that specification, what data structure should the event manager use to represent the event IDs and the receivers?
It sounds natural to somehow associate event IDs with receivers, by using a map. But we can’t just use std::map<EventID, Receiver const*>
, because an event ID can have more than one receiver.
We’re going to explore two alternative designs and see which one fits most for our event manager:
- a map of vectors:
std::map<EventID, std::vector<EventReceiver const*>>
- a multimap:
std::multimap<EventID, EventReceiver const*>
Design 1: A map of vectors
This is probably the most natural design: each event ID can have several receivers, so we map an event ID to a collection of receivers:
class EventMediator { public: void subscribe(EventReceiver const& receiver, EventID eventID); void emit(Event const& event) const; bool isRegistered(EventReceiver const& receiver) const; private: std::map<EventID, std::vector<EventReceiver const*>> receiversRegistry_; };
How would the code of the event manager’s methods look like with that representation? Let’s see the implementation of the three methods: subscribe
, emit
and isRegistered
.
subscribe
The subscribe
method finds the entry of the map that corresponds to the event ID, and adds a receiver to the corresponding vector or receivers:
void EventMediator::subscribe(EventReceiver const& receiver, EventID eventID) { receiversRegistry_[eventID].push_back(&receiver); }
Simple enough.
emit
The emit
method picks out the collection of receivers that correspond to the event ID of the event happening, and invoke them all on their reactTo
method:
void EventMediator::emit(Event const& event) const { auto eventID = event.getEventID(); auto const& receivers = receiversRegistry_[eventID]; for (auto const& receiver : receivers) { receiver.reactTo(event); } }
Simple too. But this time, the code doesn’t compile and triggers the following error:
error: no viable overloaded operator[] for type 'const std::map<EventID, std::vector<const EventReceiver *> >'
Behind its rough shell, what this error message is trying to tell us is that we want emit
to be a const
method, but operator[]
is not const
on the map. Indeed, if the map doesn’t have an entry corresponding to the queried event ID, operator[]
will insert it for us and return a reference to it.
The code to fix the method is less pleasant to the eye:
void EventMediator::emit(Event const& event) const { auto eventID = event.getEventID(); auto receiversEntry = receiversRegistry_.find(eventID); if (receiversEntry != end(receiversRegistry_)) { auto const& receivers = receiversEntry->second; for (auto const& receiver : receivers) { receiver->reactTo(event); } } }
It consists of searching for the event ID, and if we find it in the map then we iterate over the corresponding collection. Note that the nestedness of this piece of code reflects the nestedness of a vector inside a map.
isRegistered
The isRegistered
method checks if a receiver is registered somewhere in the event manager. Since the map isn’t sorted by receivers but only by event IDs (because that’s its key), we need to perform a linear search across the whole structure: check the first vector, then the second, and so on:
bool EventMediator::isRegistered(EventReceiver const& searchedReceiver) const { for (auto const& receiversEntry : receiversRegistry_) { auto const& receievers = receiversEntry.second; for (auto const& receiver : receievers) { if (receiver == &searchedReceiver) { return true; } } } return false; }
Here also, the fact that the data structure is nested leads to a nested code.
The implementation of subscribe
is fine, but those of emit
and isRegistered
could use some simplification, in particular by making them less nested and more straightforward.
Let’s flatten out our data structure by using a multimap instead of a map of vectors.
Design 2: a multimap
A multimap?
What is a multimap, to begin with? It’s like a map, except that a map can only have at most one entry for each key, whereas a multimap can have several entries with equivalent keys.
To illustrate, let’s try to add several entries that have the same key to a std::map
:
auto entries = std::map<int, std::string>{}; entries.insert(std::make_pair(1, "one")); entries.insert(std::make_pair(1, "uno")); entries.insert(std::make_pair(2, "two")); entries.insert(std::make_pair(2, "dos")); entries.insert(std::make_pair(3, "three")); entries.insert(std::make_pair(3, "tres"));
If we display what the map contains with the following code:
for (auto const& entry : entries) { std::cout << entry.first << '-' << entry.second << '\n'; }
Here is what the code outputs:
1-one 2-two 3-three
For each of the keys (1, 2, 3) there is one entry in the map. Now if we replace the map by a multimap:
auto entries = std::multimap<int, std::string>{}; ...
Then the code now outputs:
1-one 1-uno 2-two 2-dos 3-three 3-tres
There are several entries with equivalent keys.
Replacing the map of vectors with a multimap
In our case, we can use a multimap to associate event IDs with receivers, because some event IDs can be associated with several receivers:
class EventMediator { public: void subscribe(EventReceiver const& receiver, EventID eventID); void emit(Event const& event) const; bool isRegistered(EventReceiver const& receiver) const; private: std::multimap<EventID, EventReceiver const*> receiversRegistry_; };
Let’s now rewrite our three methods subscribe
, emit
and isRegistered
to see if this new data structure simplifies their implementation.
subscribe
First of all, the standard multimap doesn’t have an operator[]
: indeed, it is possible that more than one value comes out of a lookup into the multimap. So we have to use the insert
method:
void EventMediator::subscribe(EventReceiver const& receiver, EventID eventID) { receiversRegistry_.insert(std::make_pair(eventID, &receiver)); }
Which is arguably not as elegant as the implementation using operator[]
that we had with the map of vectors. Let’s see how emit
and isRegistered
do.
emit
Here is the code for the emit
function to work with the multimap, we’ll go through it line by line:
void EventMediator::emit(Event const& event) const { auto eventID = event.getEventID(); auto receiversEntries = receiversRegistry_.equal_range(eventID); for (auto receiverEntry = receiversEntries.first; receiverEntry != receiversEntries.second; ++receiverEntry) { auto const& receiver = receiverEntry->second; receiver->reactTo(event); } }
EDIT: as observed by Romain Charbit in the comments section, an std::for_each
combined with C++14’s auto
in lambdas makes a more concise version:
void EventMediator::emit(Event const& event) const { auto eventID = event.getEventID(); auto receiversEntries = receiversRegistry_.equal_range(eventID); std::for_each(receiversEntries.first, receiversEntries.second, [&event](auto receiverEntry const&) { auto const& receiver = receiverEntry->second; receiver->reactTo(event); }); }
In case you’re not yet familiar with the interface of multimap, here is a line-by-line explanation of the above of code:
auto receiversEntries = receiversRegistry_.equal_range(eventID);
When we query a multimap for a key, we don’t expect to get a value back. Indeed, since the multimap could hold several entries for that key, we get a range of entries, which is a slice of the data inside of the multimap:
This sliced could be empty if there wasn’t any entry corresponding to the queried key.
for (auto receiverEntry = receiversEntries.first; receiverEntry != receiversEntries.second; ++receiverEntry)
While it makes sense that equal_range
returns a range, the format of the range returned by the STL here is… not as natural. We would have expected a structure that represents a range, that would have a begin
and end
interface, but instead equal_range
returns a pair of iterators. The first represents the beginning of the range and the second the end.
This integrates badly with the for loop (and with everything else for that matter), hence the complicated above line to simply express “iterate over that range”. Anyway, that’s an issue with the STL that we had already come across when discussing equal_range to search in STL containers.
auto const& receiver = receiverEntry->second;
receiverEntry
is an iterator to an entry in the multimap. The multimap contains std::pair
s of event IDs and receivers, so to get the receiver we take the second
of that entry.
receiver->reactTo(event);
We finally notify the receiver with the event.
Even with the glitch with the interface returned by equal_range
, this code is overall more straightforward than the emit
we had with the map of vectors. Indeed, we benefit that the structure is not nested to have code that is not nested either.
isRegistered
As in with the map of vectors, our data structure is still not sorted by receiver. So we have to traverse it linearly and search for a given receiver.
But this time, we only have one layer to traverse, which makes it easier to use an STL algorithm. We could use std::find_if
, but since we don’t need the location of the searched receiver but only whether it is there or not, std::any_of
will let us go straighter to the point:
bool EventMediator::isRegistered(EventReceiver const& queriedReceiver) const { auto hasQueriedReceiver = [&queriedReceiver](auto const& receiverEntry){ return receiverEntry.second == &queriedReceiver; }; return std::any_of(begin(receiversRegistry_), end(receiversRegistry_), hasQueriedReceiver); }
Or, with a range-based for loop:
bool EventMediator::isRegistered(EventReceiver const& queriedReceiver) const { for (auto const& receiverEntry : receiversRegistry_) { if (receiverEntry.second == &queriedReceiver) { return true; } } return false; }
Which are both simpler than the nested version of the map of vectors.
Note that the multimap is probably slightly larger in memory than the map of vectors, because the map of vectors only stores one key for each type of event. But until your memory profiler proved that this extra space is indeed significant (keys are often small, and you may not know the number of values per equivalent key), don’t refrain from writing the simplest code.
Map of vectors or multimap?
Even if the map of vectors is maybe more natural to think of at first, the multimap leads to simpler code as soon as we need to iterate over the data. This advantage of the multimap come from the fact that it is not a nested structure, contrary to the map of vector.
But does a nested structure always has to lead to nested code? Not necessarily. If you can abstract the fact that it’s a nested structure behind a range interface, then the code can behave as if it operated on a flat structure.
One thing that performs this type of abstraction is the join
range adaptor in range-v3. It can view a vectors of vectors as a flat range that features smart iterators that jumps off a vector to the next and carry out a full traversal of the nested collection as if it was flat.
join
works on vectors of vectors. But can it work on maps of vectors? There is an additional level of complexity.
Anyway, until we have that sort of components in production, nested structures produce nested code, and flat structures produce flat code. The apple doesn’t fall far from the tree.
Thanks to Roman for asking my advice to make code more expressive on the SFME project.
You may also like
Don't want to miss out ? Follow:   Share this post!