Comment by gameman144

Comment by gameman144 a day ago

1 reply

I am not sure that the before case maps to the article's premise, and and I think your optimized SIMD version does line up with the recommendations of the article.

For your example loop, the `if` statements are contingent on the data; they can't be pushed up as-is. If your algorithm were something like:

    if (length % 2 == 1) {
      values[i] += 1;
    } else {
      values[i] += 2;
    }

then I think you'd agree that we should hoist that check out above the `for` statement.

In your optimized SIMD version, you've removed the `if` altogether and are doing branchless computations. This seems very much like the platonic ideal of the article, and I'd expect they'd be a big fan!

sparkie a day ago

The point was more that, you shouldn't always try to remove the branch from a loop yourself, because often the compiler will do a better job.

For a contrived example, we could attempt to be clever and remove the branching from the loop in the first example by subtracting two from every value, then add three only for the odds.

    for (int i = 0; i < length ; i++) {
        values[i] -= 2;
        values[i] += (values[i] % 2) * 3;
    }
It achieves the same result (because subtracting two preserves odd/evenness, and nothing gets added for evens), and requires no in-loop branching, but it's likely going to perform no better or worse than what the compiler could've generated from the first example, and it may be more difficult to auto-vectorize because the logic has changed. It may perform better than an unoptimized branch-in-loop version though (depending on the cost of branching on the target).

In regards to moving branches out of the loop that don't need to be there (like your check on the length) - the compiler will be able to do this almost all of the time for you - this kind of thing is standard optimization techniques that most compilers implement. If you are interpreting, the following OPs advice is certainly worth doing, but you should probably not worry if you're using a mature compiler, and instead aim to maximize clarity of code for people reading it, rather than trying to be clever like this.