Do Understandable If Statements Run Slower?

Published July 25, 2017

Aadam, my esteemed teammate, came over to me right after reading the last post on Fluent C++, How to Make If Statements More Understandable, with a question. In fact, this post made quite a few people think and get back to me with feedback and questions, for which I’m very grateful. If it did just that, then it has already achieved one of its major goals.

Anyway, let’s get to Aadam’s question: “Jonathan,” he said, “I get the idea of rolling out an if statement so that it matches the specifications. But does this have any sort of impact on performance?”

This is a great question, and he wasn’t the only one bringing up this topic.

I had a hunch about the answer, but hunches are worth nothing when it comes to performance, right? So we did the only thing we could do: measure!

To perform all our measurements we used Fred Tingaud’s popular tool: quick-bench.com.

Does the compiler understand understandable if statements?

We’ve selected one particular question for our measurements: we saw in the last post that sometimes, following the specifications leads us to have an if inside an if, as opposed to cramming two conditionals into a logical AND expression:

if (condition1)
{
    if (condition2)
    {
        ...

if (condition1 && condition2)
{
    ...

So does one perform better than the other? And even before that: does the compiler understand that the two snippets are equivalent, and generate the same code for them?

We threw these two pieces of code into quick-bench, which also generates the assembly code for each one. The configuration is clang++ 3.8 launched with -O1 as an optimization flag. We used random numbers for the conditions, in order to make sure they were actually evaluated at runtime. Here is our quick-bench if you’re curious to have a look.
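For readers who would rather experiment locally, here is a minimal sketch of the kind of comparison described above. It is not the code from the quick-bench link: the names CoinFlip, runNestedIfs, runLogicalAnd, elapsedNanoseconds and sink are hypothetical, the conditions are assumed to be seeded coin flips, and a plain std::chrono timer stands in for the benchmark library.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>

// Sink written by the if body, so that the branch has an observable effect.
volatile int sink = 0;

// Hypothetical condition source: a seeded generator returning true or false.
struct CoinFlip
{
    std::mt19937 generator;
    explicit CoinFlip(std::uint32_t seed) : generator(seed) {}
    bool operator()() { return (generator() & 1u) != 0; }
};

// Version 1: an if inside an if. Returns how many times the body ran.
int runNestedIfs(CoinFlip first, CoinFlip second, int iterations)
{
    assert(iterations >= 0);
    int bodyRuns = 0;
    for (int i = 0; i < iterations; ++i)
    {
        if (first())
        {
            if (second())
            {
                sink = 42;
                ++bodyRuns;
            }
        }
    }
    return bodyRuns;
}

// Version 2: both conditions in one logical AND expression.
int runLogicalAnd(CoinFlip first, CoinFlip second, int iterations)
{
    assert(iterations >= 0);
    int bodyRuns = 0;
    for (int i = 0; i < iterations; ++i)
    {
        if (first() && second())
        {
            sink = 42;
            ++bodyRuns;
        }
    }
    return bodyRuns;
}

// Times a single call of `run`, in nanoseconds, with a monotonic clock.
template <typename Run>
long long elapsedNanoseconds(Run run)
{
    auto const start = std::chrono::steady_clock::now();
    run();
    auto const stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling elapsedNanoseconds with a lambda that invokes runNestedIfs, and again with one that invokes runLogicalAnd, gives a rough comparison. Expect such a hand-rolled timer to be noisier than a dedicated benchmarking tool, so repeat the measurement several times before drawing any conclusion.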

Here are the two pieces of assembly code that clang generated:

push   %r14
push   %rbx
push   %rax
mov    %rdi,%r14
callq  404ce0 <benchmark::State::KeepRunning()>
test   %al,%al
je     404ab6 <if_if(benchmark::State&)+0x56>
mov    $0x270f,%ebx
data16 nopw %cs:0x0(%rax,%rax,1)
callq  404b80 <getPositive()>
test   %eax,%eax
jle    404a9c <if_if(benchmark::State&)+0x3c>
callq  404be0 <getNegative()>
test   %eax,%eax
jle    404a9c <if_if(benchmark::State&)+0x3c>
movl   $0x2a,0x23442c(%rip)        # 638ec8 <c>
test   %ebx,%ebx
lea    -0x1(%rbx),%eax
mov    %eax,%ebx
jne    404a80 <if_if(benchmark::State&)+0x20>
mov    %r14,%rdi
callq  404ce0 <benchmark::State::KeepRunning()>
test   %al,%al
mov    $0x270f,%ebx
jne    404a80 <if_if(benchmark::State&)+0x20>
add    $0x8,%rsp
pop    %rbx
pop    %r14
retq

push   %r14
push   %rbx
push   %rax
mov    %rdi,%r14
callq  404ce0 <benchmark::State::KeepRunning()>
test   %al,%al
je     404b16 <if_and(benchmark::State&)+0x56>
mov    $0x270f,%ebx
data16 nopw %cs:0x0(%rax,%rax,1)
callq  404b80 <getPositive()>
test   %eax,%eax
jle    404afc <if_and(benchmark::State&)+0x3c>
callq  404be0 <getNegative()>
test   %eax,%eax
jle    404afc <if_and(benchmark::State&)+0x3c>
movl   $0x2a,0x2343cc(%rip)        # 638ec8 <c>
test   %ebx,%ebx
lea    -0x1(%rbx),%eax
mov    %eax,%ebx
jne    404ae0 <if_and(benchmark::State&)+0x20>
mov    %r14,%rdi
callq  404ce0 <benchmark::State::KeepRunning()>
test   %al,%al
mov    $0x270f,%ebx
jne    404ae0 <if_and(benchmark::State&)+0x20>
add    $0x8,%rsp
pop    %rbx
pop    %r14
retq

As you can see, except for the memory addresses this is exactly the same generated code. So with -O1, clang figures out that the two pieces of code are equivalent, and therefore they have the same performance.
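The compiler is allowed to treat the two forms alike because the built-in && short-circuits: its right-hand side is evaluated only when the left-hand side is true, just like the inner if of the nested form. The sketch below illustrates this with a hypothetical EvaluationCounter helper that records how often each condition is evaluated. The names are made up for the illustration, and it only covers an if without an else branch.

```cpp
#include <cassert>

// Hypothetical helper: records how often each condition gets evaluated.
struct EvaluationCounter
{
    int firstCalls = 0;
    int secondCalls = 0;

    bool first(bool value)  { ++firstCalls;  return value; }
    bool second(bool value) { ++secondCalls; return value; }
};

// Nested form: the inner condition is evaluated only if the outer one holds.
bool nestedIfs(EvaluationCounter& counter, bool a, bool b)
{
    if (counter.first(a))
    {
        if (counter.second(b))
        {
            return true;
        }
    }
    return false;
}

// AND form: && evaluates its right operand only if the left one is true.
bool logicalAnd(EvaluationCounter& counter, bool a, bool b)
{
    if (counter.first(a) && counter.second(b))
    {
        return true;
    }
    return false;
}

// Checks all four input combinations: same result, same evaluation counts.
void checkFormsAgree()
{
    for (int a = 0; a < 2; ++a)
    {
        for (int b = 0; b < 2; ++b)
        {
            EvaluationCounter nested;
            EvaluationCounter anded;
            bool const nestedResult = nestedIfs(nested, a != 0, b != 0);
            bool const andResult = logicalAnd(anded, a != 0, b != 0);
            assert(nestedResult == andResult);
            assert(nested.firstCalls == anded.firstCalls);
            assert(nested.secondCalls == anded.secondCalls);
        }
    }
}
```

Since both forms evaluate the same conditions in the same order and run the body in the same cases, an optimizer can lower them to the same branches.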

Now let’s try with -O0 (no optimization):

push   %rbp
mov    %rsp,%rbp
sub    $0x10,%rsp
mov    %rdi,-0x8(%rbp)
mov    -0x8(%rbp),%rdi
callq  404d80 <benchmark::State::KeepRunning()>
test   $0x1,%al
jne    404962 <if_if(benchmark::State&)+0x22>
jmpq   4049b3 <if_if(benchmark::State&)+0x73>
movl   $0x2710,-0xc(%rbp)
mov    -0xc(%rbp),%eax
mov    %eax,%ecx
add    $0xffffffff,%ecx
mov    %ecx,-0xc(%rbp)
cmp    $0x0,%eax
je     4049ae <if_if(benchmark::State&)+0x6e>
callq  404ad0 <getPositive()>
cmp    $0x0,%eax
jle    4049a9 <if_if(benchmark::State&)+0x69>
callq  404b60 <getNegative()>
cmp    $0x0,%eax
jle    4049a4 <if_if(benchmark::State&)+0x64>
movl   $0x2a,0x638ecc
jmpq   4049a9 <if_if(benchmark::State&)+0x69>
jmpq   404969 <if_if(benchmark::State&)+0x29>
jmpq   40494c <if_if(benchmark::State&)+0xc>
add    $0x10,%rsp
pop    %rbp
retq

push   %rbp
mov    %rsp,%rbp
sub    $0x10,%rsp
mov    %rdi,-0x8(%rbp)
mov    -0x8(%rbp),%rdi
callq  404d80 <benchmark::State::KeepRunning()>
test   $0x1,%al
jne    4049e2 <if_and(benchmark::State&)+0x22>
jmpq   404a2e <if_and(benchmark::State&)+0x6e>
movl   $0x2710,-0xc(%rbp)
mov    -0xc(%rbp),%eax
mov    %eax,%ecx
add    $0xffffffff,%ecx
mov    %ecx,-0xc(%rbp)
cmp    $0x0,%eax
je     404a29 <if_and(benchmark::State&)+0x69>
callq  404ad0 <getPositive()>
cmp    $0x0,%eax
jle    404a24 <if_and(benchmark::State&)+0x64>
callq  404b60 <getNegative()>
cmp    $0x0,%eax
jle    404a24 <if_and(benchmark::State&)+0x64>
movl   $0x2a,0x638ecc
jmpq   4049e9 <if_and(benchmark::State&)+0x29>
jmpq   4049cc <if_and(benchmark::State&)+0xc>
add    $0x10,%rsp
pop    %rbp
retq

The version with the two ifs contains one more line than the other:

jmpq 4049a9 <if_if(benchmark::State&)+0x69>

which corresponds to a “jump”, the implementation of an if statement in assembly code.

Can the CPU live with understandable if statements?

Since the code is different, let’s see how this impacts the execution time. Let’s give only positive values to a so that the inner if is always executed:

[Figure: benchmark chart comparing the two versions, with only positive values for a]

(this image was generated with quick-bench.com)

The version that has the two conditionals on the same line is about 7% faster! So if we followed a specification that led us to roll out an if statement like the one in this example, we’ve made the application slower. Blimey!

And now let’s test it with random values for a that can be 0 or 1 with equal probability:

[Figure: benchmark chart comparing the two versions, with random values of 0 or 1 for a]

(this image was generated with quick-bench.com)

This time the second version is about 2% faster, most likely because the execution doesn’t always reach the inner if.
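To make the difference between the two input scenarios concrete, here is a small sketch that counts how often the inner if is reached in each of them. The variable a is the one from the measurements above. Everything else (countInnerReached, alwaysPositive, ZeroOrOne and the seed) is a hypothetical stand-in, not the code that was benchmarked.

```cpp
#include <cassert>
#include <random>

// Counts how many times the outer condition (a > 0) lets execution
// reach the inner if, over `iterations` values of a produced by `nextA`.
template <typename NextA>
int countInnerReached(NextA nextA, int iterations)
{
    assert(iterations >= 0);
    int innerReached = 0;
    for (int i = 0; i < iterations; ++i)
    {
        int const a = nextA();
        if (a > 0)
        {
            // The inner if would be evaluated here.
            ++innerReached;
        }
    }
    return innerReached;
}

// Scenario 1: a is always positive, so the inner if is always reached.
inline int alwaysPositive() { return 1; }

// Scenario 2: a is 0 or 1 with equal probability (the seed is arbitrary).
struct ZeroOrOne
{
    std::mt19937 generator{2017};
    int operator()() { return static_cast<int>(generator() & 1u); }
};
```

With alwaysPositive, countInnerReached returns the number of iterations. With ZeroOrOne, it returns roughly half of it, which means the extra jump of the unoptimized nested version is executed less often.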

Can I afford understandable if statements?

Let’s analyse the situation calmly.

First of all, if you’re compiling at a sufficient level of optimization, you’re fine. There is no performance penalty if you choose the if that matches your specifications better. Now the right level of optimization depends on your compiler, but in this experiment it was -O1 for clang. I’ve also generated the code for the latest version of gcc on godbolt (quick-bench doesn’t support gcc as of this writing), both for the two nested ifs and for the if with the AND expression. And while the code is also different at -O0, it becomes the same at -O1.

Now if you’re not compiling with optimization, maybe the faster one corresponds to your specifications, in which case you’re also fine. Neither version of the if is more understandable in itself: it depends on the flow of the spec.

If your specifications are expressed with the slower if, and this piece of code is not in a performance-critical section, you’re fine again. Indeed, as Scott Meyers explains in Item 16 of More Effective C++, most of the code isn’t relevant for performance optimizations, and you need to profile your code to figure out which parts are. So 7%, or 2%, or whatever value corresponds to your architecture on that particular line can go completely unnoticed, and it would be a shame to sacrifice its expressiveness for it.

If a certain alignment of the planets causes that particular if to be the bottleneck of your program, then you have to change it. But when you do so, try to do it in a way that makes sense for the specifications. Consult with your domain people if necessary. This way you preserve the readability of this piece of code for the future.

And if even that isn’t possible, only then can you forgo the readability of this particular line.

But before you get into that extreme situation, you will have saved hundreds of other if statements, which will live a peaceful life and will thank you for it.


About Jonathan Boccara
