Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should up/down casting a uint(16) to an int work? #26301

Open
jabraham17 opened this issue Nov 22, 2024 · 4 comments
Open

Should up/down casting a uint(16) to an int work? #26301

jabraham17 opened this issue Nov 22, 2024 · 4 comments

Comments

@jabraham17
Copy link
Member

With the upgrade to LLVM 19, I came across the following frustrating bug. test/studies/parboil/SAD/sadSerial.chpl started to fail when run with --fast. I tracked this down to a single line of code

sad += abs(a:int - b:int):uint(16); // danger, need these casts

Essentially what was occurring is that with LLVM 19, the vectorizer was much more aggressive and would optimize away the cast to int (specifically, this was the slp-vectorizer pass). This in turn meant that the math was incorrect and could have underflow, if a was smaller than b.

Replacing this code with sad += if a > b then a - b else b - a; resolved the issue. But should that even have been required?

As far as I can tell, we generate nearly identical code as LLVM for this cast+abs logic.

In Chapel

proc doit(a: uint(16), b: uint(16)) {
  return abs(a:int - b:int):uint(16);
}
define internal i16 @doit_chpl(i16 %0, i16 %1) #0 {
  %3 = zext i16 %0 to i64
  %4 = zext i16 %1 to i64
  %5 = sub nsw i64 %3, %4
  %6 = tail call i64 @llvm.abs.i64(i64 %5, i1 true)
  %7 = trunc nuw i64 %6 to i16
  ret i16 %7
}

Here is a short gadget in C that has the same sort of casts

#include <stdint.h>
#include <stdlib.h>
#include <math.h>
uint16_t doit(uint16_t x, uint16_t y) {
    return (uint16_t)labs((int64_t)x - (int64_t)y);
}
define dso_local i16 @doit(i16 noundef %x, i16 noundef %y) local_unnamed_addr {
entry:
  %conv = zext i16 %x to i64
  %conv1 = zext i16 %y to i64
  %sub = sub nsw i64 %conv, %conv1
  %0 = tail call i64 @llvm.abs.i64(i64 %sub, i1 true)
  %conv2 = trunc nuw i64 %0 to i16
  ret i16 %conv2
}

declare i64 @llvm.abs.i64(i64, i1 immarg) #1

So I don't think this is due to us generating incorrect LLVM IR. It could be a bug in slp-vectorize pass, but I have not dug deep enough into it to determine that.

jabraham17 added a commit that referenced this issue Nov 22, 2024
Fixes `test/studies/parboil/SAD/sadSerial.chpl`, which is currently
failing due to the upgrade to LLVM 19.

This PR changes the test to use an if-statement to compute the
difference between 2 `uint(16)`, instead of an `abs` with up/down
casting. With LLVM 19, the vectorizer is more aggressive and breaks the
casting version. See #26301

[Reviewed by @dlongnecke-cray]
@bradcray
Copy link
Member

In case it's unclear from the issue title + context, a and b in the original line of code are of type uint(16).

IIRC, signed integer underflow/wraparound is not well-defined in C (and we inherit that behavior), so if we're relying on it here, that could definitely be problematic.

unsigned integer wraparound is well-defined, though. Does casting to uint work? (checking, I think the answer is "no"?)

@jabraham17
Copy link
Member Author

Casting to uint does not work.

sad += abs(a:uint - b:uint):uint(16); // danger, need these casts

The vectorizer has the same behavior on this and results in the casting being optimized away.

@bradcray
Copy link
Member

Sorry, thinking about this more and re-reading more carefully, I think you're saying the underflow only occurs when the casts are removed. Which makes sense since there shouldn't be any values if uint(16) that, when cast to int and subtracted, result in an underflow. (I think I mistakenly thought we were relying on underflow on the ints).

This does make it seem like a bug to remove the casts to me. I can't see a justification that an optimizer could legally do that.

@damianmoz
Copy link

damianmoz commented Nov 22, 2024

What happens when you do either/both of

/* C-code */
sad += (unsigned short) abs(((int) a) - (int) b));
sad += (unsigned short) abs(a - (int) b));

using clang19.

They should behave as I read the C standard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants