branchless advantage is underrepresented #311

Open
IAmAThousandTrees opened this issue Dec 3, 2024 · 0 comments

@IAmAThousandTrees

Sections 3.2 and 3.3 present a small experiment in which the summing variable is declared volatile to keep the compiler from optimizing the code away. This has the side effect of forcing every update to be a memory-targeted add. It is true that in a case like this, where memory latency sits in the result chain, a 75% prediction rate is enough to make branching the better choice, provided the branch can bypass the memory-chained part. But if no false necessity is imposed for the memory chain, branchless suddenly becomes better in all cases. It is also the case that modern compilers (some of them, anyway), once the volatile restriction is lifted, will mostly convert the branching code to branchless, then realise it can be vectorised, and produce something 100 times faster than the branching option, whether the source implies branching or branchless.
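To make the comparison concrete, here is a minimal sketch of the two accumulator styles I mean. The function names, the `threshold` parameter and the `a[i] < threshold` condition are my own placeholders, not copied from the book's listing, so treat it as an illustration rather than the exact experiment.

```c
#include <stddef.h>

/* Volatile accumulator: every update has to go through memory
   (load, add, store), so a store-to-load dependency sits in the
   loop-carried chain. */
int sum_below_volatile(const int *a, size_t n, int threshold) {
    volatile int s = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] < threshold)
            s += a[i];
    return s;
}

/* Ordinary accumulator: same logic, but the compiler is free to keep
   s in a register, if-convert the branch, and vectorise the loop. */
int sum_below_plain(const int *a, size_t n, int threshold) {
    int s = 0;
    for (size_t i = 0; i < n; i++)
        if (a[i] < threshold)
            s += a[i];
    return s;
}
```

Compiling both with something like `gcc -O2 -S` or `clang -O2 -S` and comparing the loop bodies should show whether the add targets memory or a register; what you actually get depends on compiler, version and flags.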

I can also falsify the later statement:

> We can rewrite branchy code using the ternary operator or various arithmetic tricks, which acts as sort of an implicit contract between programmers and compilers: if the programmer wrote the code this way, then it was probably meant to be branchless.

In my tests, both gcc and clang ignore how the code was written. gcc only produced branching code (in my experiments; apparently this varies by version, but I did not observe that), and clang only produced branchless code, whether the source was written with a ternary ?:, if(cond) sum = res; or if(cond) sum+=arr[i];
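For reference, the three spellings I compared look roughly like the sketch below. The names `arr`, `threshold` and the function names are placeholders for illustration; the condition in the book's example may differ.

```c
#include <stddef.h>

/* Spelling 1: ternary operator. */
int sum_ternary(const int *arr, size_t n, int threshold) {
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (arr[i] < threshold) ? arr[i] : 0;
    return sum;
}

/* Spelling 2: compute the result unconditionally, assign conditionally.
   Note: res is computed even for rejected elements, so this assumes
   sum + arr[i] never overflows int. */
int sum_if_assign(const int *arr, size_t n, int threshold) {
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        int res = sum + arr[i];
        if (arr[i] < threshold)
            sum = res;
    }
    return sum;
}

/* Spelling 3: plain conditional add. */
int sum_if_add(const int *arr, size_t n, int threshold) {
    int sum = 0;
    for (size_t i = 0; i < n; i++)
        if (arr[i] < threshold)
            sum += arr[i];
    return sum;
}
```

All three compute the same value, so any difference in the generated assembly comes down to the compiler's own choice rather than the semantics of the source.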

Otherwise, it was a nice little jaunt through branch prediction. I found it while looking for any available figures on the actual penalty for a branch mispredict: this was the only thing that came up...
