The SoA Vector – Part 1: Optimizing the Traversal of a Collection
Today’s guest post is the first part of a two-posts series written by Sidney Congard. Sidney is an almost graduated student and an intern at QuasarDB, a company writing it’s own database in C++17. He has been doing C++ in his free time regularly for two years.
Also interested in writing on Fluent C++? Submit your guest post!
I like C++ because it offers a good compromise between writing expressive and fast code. But, I discovered a problem where I didn’t know any way to hide the implementation detail away from its use: The “Structure of Arrays” (SoA) versus the “Array of Structures” (AoS) problem.
This is the first part of a series of two articles:
- what ‘SoA’ is about and what benefits it brings (part 1)
- how to implement an SoA vector in C++ (part 2)
So let’s see what those SoA and AoS are all about.
SoA and AoS
These terms designates two ways to lay out objects contiguously in memory. The AoS is the standard way to dot it. For example, with a class Person
:
struct person { std::string name; int age; };
If we use a standard vector:
std::vector<person> persons;
Then the layout of the objects in memory will look like this:
[name1, age1, name2, age2, ...]
This is the standard way. But there would be another way to store them: first all the names, then all the ages:
[name1, name2, ...], [age1, age2, ...]
This is SoA (Structure of Arrays.) This is no longer the layout of an std::vector
. Rather, it would be the layout of a structure like this:
struct persons { std::vector<std::string> names; std::vector<int> ages; };
The AoS is more conventional and more straightforward than the SoA. So what’s the point of the SoA?
The advantage of SoA
SoA increase performances on in a certain use case: the traversal of a collection that looks at one member of the object. For example, if we want to make every person one year older:
for (auto& person : persons) { ++person.age; }
If we use a traditional std::vector, then what the CPU will load in cache from memory is a chunk of the vector containing the entire objects:
[name1, age1, name2, age2, ...]
The cache line contains data that we won’t use: here, all the Person
s names. Since we only need their ages, this is a waste of the cache.
On the other hand, the SoA allows to load the ages packed together on the cache line:
[age1, age2, ...]
Which is more efficient.
Furthermore, SIMD operations (Single Instruction, Multiple Data) can be performed when we want to apply the same transformations to continuous objects: depending on the CPU properties, he can increment the ages 4 by 4, 8 by 8 or even 16 by 16.
Two questions may come to your mind when seeing this. The first one is: Does this really make a difference on performance?
The answer is Yes it happens to make a difference, for example in the video game industry.
And the second question would be: what would happen for traversals that look at more than one data member of the object, for example:
for (auto& person : persons) { std::cout << person.name << “ is “ << person.age << years old.\n”; }
With a traditional std::vector
, this traversal makes a full use of the loaded cache line:
[name1, age1, name2, age2, ...]
But with an SoA structure, the structure of the cache is not at all optimized for this code hopping back and forth between names and ages.
So which one of AoS or SoA is better for performance? The answer is that it depends on the use case. In the general case an AoS with an std::vector
is ok, but there are cases where SoA is necessary. This is why SoA is a thing.
To work efficiently with different data, an hybrid approach is possible, by using a single array storing the components in little arrays :
struct persons_block { std::array<8, std::string> names; std::array<8, int> ages; }; using persons = std::vector<persons_block>;
The memory layout looks then like that :
[names 1 to 8, ages 1 to 8, names 9 to 16, ages 9 to 16, ...]
With this approach, we can have the best of both world : good memory accesses and SIMD instructions while manipulating differents components at the same time.
Implementation of SoA in C++
But the problem with either form of SoA is that it doesn’t have the interface of a container. SoA or AoS are supposed to accomplish different trade-offs in terms of performance and, ideally, choosing between SoA and AoS should have very limited impact on the look of the code using the collection.
In the next post, we will design a C++ structure that implements SoA in while offering an interface close to the one of std::vector
.
You will also like
- A Summary of the Metaclasses Proposal for C++
- Make Your Containers Follow the Conventions of the STL
Share this post!