Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Fold X > C2 ? X + C1 : C2 + C1 to max(X, C2) + C1 #116888

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

veera-sivarajan
Copy link

@veera-sivarajan veera-sivarajan commented Nov 19, 2024

Fixes #82414

General Proof: https://alive2.llvm.org/ce/z/ERjNs4
Proof for Tests: https://alive2.llvm.org/ce/z/K-934G

This PR transforms select instructions of the form select (Cmp X C1) (BOp X C2) C3 to BOp (min/max X C1) C2 iff C3 == BOp C1 C2.

This helps in eliminating a noop loop in rust-lang/rust#123845 but does not improve optimizations.

Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@llvmbot
Copy link
Member

llvmbot commented Nov 19, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Veera (veera-sivarajan)

Changes

General Proof: https://alive2.llvm.org/ce/z/ERjNs4
Proof for Tests: https://alive2.llvm.org/ce/z/0Y6LLH

This PR transforms select instructions of the form (select (Cmp X C1) (BOp X C2) C3) to (BOp (select (Cmp X C1) X C1) C2) iff C3 == BOp C1 C2. Which in turn gets canonicalized to (BOp (min/max X C1) C2).

This helps in eliminating a noop loop in rust-lang/rust#123845 but does not improve optimizations.

This PR does not follow conventional patterns because it essentially unfolds a constant when every other part of the compiler is conditioned to fold constant operations. More details below.


Full diff: https://github.com/llvm/llvm-project/pull/116888.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+41)
  • (added) llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll (+373)
  • (modified) llvm/test/Transforms/InstCombine/saturating-add-sub.ll (+3-4)
  • (modified) llvm/test/Transforms/InstCombine/select.ll (+3-4)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 010b77548c152a..fb727f9e9b81d4 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -1898,6 +1898,44 @@ static Instruction *foldSelectICmpEq(SelectInst &SI, ICmpInst *ICI,
   return nullptr;
 }
 
+// Turn (select (Cmp X C1) (BOp X C2) C3)
+//   -> (select (Cmp X C1) (BOp X C2) (BOp C1 C2))
+//   -> (BOp (select (Cmp X C1) X C1) C2)
+// iff C3 == BOp C1 C2
+// This allows for better canonicalization.
+static Instruction *canonicalizeConstToBOp(SelectInst &SI, Value *TrueVal,
+                                           Value *FalseVal,
+                                           InstCombinerImpl &IC) {
+  Value *CondVal = SI.getCondition();
+
+  BinaryOperator *BOp;
+  Constant *C1, *C2, *C3;
+  Value *X;
+  ICmpInst::Predicate Predicate;
+
+  if (!match(CondVal, m_c_ICmp(Predicate, m_Value(X), m_Constant(C1))))
+    return nullptr;
+
+  if (!((match(TrueVal, m_BinOp(BOp)) && match(FalseVal, m_Constant(C3))) ||
+        (match(FalseVal, m_BinOp(BOp)) && match(TrueVal, m_Constant(C3)))))
+    return nullptr;
+
+  if (!match(BOp, m_OneUse(m_c_BinOp(m_Specific(X), m_Constant(C2)))))
+    return nullptr;
+
+  if (!(ICmpInst::isRelational(Predicate) &&
+        C3 == ConstantFoldBinaryOpOperands(BOp->getOpcode(), C1, C2,
+                                           SI.getDataLayout())))
+    return nullptr;
+
+  Instruction *NewBO =
+      BinaryOperator::CreateWithCopiedFlags(BOp->getOpcode(), C1, C2, BOp);
+  IC.InsertNewInstBefore(NewBO, SI.getIterator());
+  IC.replaceOperand(SI, isa<Constant>(TrueVal) ? 1 : 2, NewBO);
+  return IC.foldSelectOpOp(SI, cast<Instruction>(SI.getTrueValue()),
+                           cast<Instruction>(SI.getFalseValue()));
+}
+
 /// Visit a SelectInst that has an ICmpInst as its first operand.
 Instruction *InstCombinerImpl::foldSelectInstWithICmp(SelectInst &SI,
                                                       ICmpInst *ICI) {
@@ -1990,6 +2028,9 @@ Instruction *InstCombinerImpl::foldSelectInstWithICmp(SelectInst &SI,
   if (Value *V = foldAbsDiff(ICI, TrueVal, FalseVal, Builder))
     return replaceInstUsesWith(SI, V);
 
+  if (Instruction *I = canonicalizeConstToBOp(SI, TrueVal, FalseVal, *this))
+    return I;
+
   return Changed ? &SI : nullptr;
 }
 
diff --git a/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll b/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll
new file mode 100644
index 00000000000000..12e54d5a654e0c
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/canonicalize-const-to-bop.ll
@@ -0,0 +1,373 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+define i8 @add_and_sgt(i8 %x) {
+; CHECK-LABEL: define i8 @add_and_sgt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.smax.i8(i8 [[X]], i8 8)
+; CHECK-NEXT:    [[S:%.*]] = add nuw nsw i8 [[S_V]], 16
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add nsw i8 %x, 16
+  %cmp = icmp sgt i8 %x, 8
+  %s = select i1 %cmp, i8 %add, i8 24
+  ret i8 %s
+}
+
+define i8 @add_sgt_nuw(i8 %x) {
+; CHECK-LABEL: define i8 @add_sgt_nuw(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.smax.i8(i8 [[X]], i8 8)
+; CHECK-NEXT:    [[S:%.*]] = add nuw i8 [[S_V]], 16
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add nuw i8 %x, 16
+  %cmp = icmp sgt i8 %x, 8
+  %s = select i1 %cmp, i8 %add, i8 24
+  ret i8 %s
+}
+
+define i8 @sub_and_ugt(i8 %x) {
+; CHECK-LABEL: define i8 @sub_and_ugt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.umin.i8(i8 [[X]], i8 100)
+; CHECK-NEXT:    [[S:%.*]] = add nsw i8 [[S_V]], -50
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %sub = sub nsw i8 %x, 50
+  %cmp = icmp ugt i8 %x, 100
+  %s = select i1 %cmp, i8 50, i8 %sub
+  ret i8 %s
+}
+
+define i8 @sub_ugt_nuw_nsw(i8 %x) {
+; CHECK-LABEL: define i8 @sub_ugt_nuw_nsw(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.umin.i8(i8 [[X]], i8 100)
+; CHECK-NEXT:    [[S:%.*]] = add nsw i8 [[S_V]], -50
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %sub = sub nuw nsw i8 %x, 50
+  %cmp = icmp ugt i8 %x, 100
+  %s = select i1 %cmp, i8 50, i8 %sub
+  ret i8 %s
+}
+
+define i8 @mul_and_ult(i8 %x) {
+; CHECK-LABEL: define i8 @mul_and_ult(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.umin.i8(i8 [[X]], i8 10)
+; CHECK-NEXT:    [[S:%.*]] = mul nuw nsw i8 [[S_V]], 10
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = mul nsw i8 %x, 10
+  %cmp = icmp ult i8 10, %x
+  %s = select i1 %cmp, i8 100, i8 %add
+  ret i8 %s
+}
+
+define i8 @mul_ult_noflags(i8 %x) {
+; CHECK-LABEL: define i8 @mul_ult_noflags(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.umin.i8(i8 [[X]], i8 10)
+; CHECK-NEXT:    [[S:%.*]] = mul nuw i8 [[S_V]], 10
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = mul i8 %x, 10
+  %cmp = icmp ult i8 10, %x
+  %s = select i1 %cmp, i8 100, i8 %add
+  ret i8 %s
+}
+
+define i8 @udiv_and_slt(i8 %x) {
+; CHECK-LABEL: define i8 @udiv_and_slt(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.smax.i8(i8 [[X]], i8 100)
+; CHECK-NEXT:    [[S:%.*]] = udiv i8 [[S_V]], 10
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %sub = udiv i8 %x, 10
+  %cmp = icmp slt i8 %x, 100
+  %s = select i1 %cmp, i8 10, i8 %sub
+  ret i8 %s
+}
+
+define i8 @udiv_slt_exact(i8 %x) {
+; CHECK-LABEL: define i8 @udiv_slt_exact(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.smax.i8(i8 [[X]], i8 100)
+; CHECK-NEXT:    [[S:%.*]] = udiv exact i8 [[S_V]], 10
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %sub = udiv exact i8 %x, 10
+  %cmp = icmp slt i8 %x, 100
+  %s = select i1 %cmp, i8 10, i8 %sub
+  ret i8 %s
+}
+
+define i8 @canonicalize_icmp_operands(i8 %x) {
+; CHECK-LABEL: define i8 @canonicalize_icmp_operands(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.smin.i8(i8 [[X]], i8 119)
+; CHECK-NEXT:    [[S:%.*]] = add nsw i8 [[S_V]], 8
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add nsw i8 %x, 8
+  %cmp = icmp sle i8 120, %x
+  %s = select i1 %cmp, i8 127, i8 %add
+  ret i8 %s
+}
+
+declare void @use(i1)
+declare void @use_byte(i8)
+
+define i8 @multi_use_cond_and_sel(i8 %x) {
+; CHECK-LABEL: define i8 @multi_use_cond_and_sel(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[X]], 8
+; CHECK-NEXT:    call void @use(i1 [[CMP]])
+; CHECK-NEXT:    [[S_V:%.*]] = call i8 @llvm.smax.i8(i8 [[X]], i8 8)
+; CHECK-NEXT:    [[S:%.*]] = add nuw nsw i8 [[S_V]], 16
+; CHECK-NEXT:    call void @use_byte(i8 [[S]])
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add nsw i8 %x, 16
+  %cmp = icmp sgt i8 %x, 8
+  call void @use(i1 %cmp)
+  %s = select i1 %cmp, i8 %add, i8 24
+  call void @use_byte(i8 %s)
+  ret i8 %s
+}
+
+define void @rust_noop_loop() {
+; CHECK-LABEL: define void @rust_noop_loop() {
+; CHECK-NEXT:  [[START:.*]]:
+; CHECK-NEXT:    br label %[[BB2_I:.*]]
+; CHECK:       [[BB2_I]]:
+; CHECK-NEXT:    [[ITER_SROA_0_07:%.*]] = phi i32 [ 0, %[[START]] ], [ [[SPEC_SELECT5:%.*]], %[[BB2_I]] ]
+; CHECK-NEXT:    [[_0_I3_I:%.*]] = icmp sgt i32 [[ITER_SROA_0_07]], 99
+; CHECK-NEXT:    [[SPEC_SELECT5_V:%.*]] = call i32 @llvm.smin.i32(i32 [[ITER_SROA_0_07]], i32 99)
+; CHECK-NEXT:    [[SPEC_SELECT5]] = add nsw i32 [[SPEC_SELECT5_V]], 1
+; CHECK-NEXT:    br i1 [[_0_I3_I]], label %[[BASICBLOCK4:.*]], label %[[BB2_I]]
+; CHECK:       [[BASICBLOCK4]]:
+; CHECK-NEXT:    ret void
+;
+start:
+  br label %bb2.i
+
+bb2.i:
+  %iter.sroa.0.07 = phi i32 [ 0, %start ], [ %spec.select5, %bb2.i ]
+  %_0.i3.i = icmp sgt i32 %iter.sroa.0.07, 99
+  %0 = add nsw i32 %iter.sroa.0.07, 1
+  %spec.select5 = select i1 %_0.i3.i, i32 100, i32 %0
+  %_0.i.not.i = icmp sgt i32 %spec.select5, 100
+  %or.cond = select i1 %_0.i3.i, i1 true, i1 %_0.i.not.i
+  br i1 %or.cond, label %basicblock4, label %bb2.i
+
+basicblock4:
+  ret void
+}
+
+define <2 x i8> @add_non_splat_vector(<2 x i8> %x) {
+; CHECK-LABEL: define <2 x i8> @add_non_splat_vector(
+; CHECK-SAME: <2 x i8> [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call <2 x i8> @llvm.smax.v2i8(<2 x i8> [[X]], <2 x i8> <i8 0, i8 1>)
+; CHECK-NEXT:    [[S:%.*]] = add nuw <2 x i8> [[S_V]], <i8 1, i8 0>
+; CHECK-NEXT:    ret <2 x i8> [[S]]
+;
+  %add = add <2 x i8> %x, <i8 1, i8 0>
+  %cmp = icmp sgt <2 x i8> %x, <i8 0, i8 1>
+  %s = select <2 x i1> %cmp, <2 x i8> %add, <2 x i8> <i8 1, i8 1>
+  ret <2 x i8> %s
+}
+
+define <2 x i8> @or_splat_vector(<2 x i8> %x) {
+; CHECK-LABEL: define <2 x i8> @or_splat_vector(
+; CHECK-SAME: <2 x i8> [[X:%.*]]) {
+; CHECK-NEXT:    [[S_V:%.*]] = call <2 x i8> @llvm.smax.v2i8(<2 x i8> [[X]], <2 x i8> <i8 1, i8 1>)
+; CHECK-NEXT:    [[S:%.*]] = or <2 x i8> [[S_V]], <i8 1, i8 1>
+; CHECK-NEXT:    ret <2 x i8> [[S]]
+;
+  %add = or <2 x i8> %x, <i8 1, i8 1>
+  %cmp = icmp sgt <2 x i8> %x, <i8 0, i8 0>
+  %s = select <2 x i1> %cmp, <2 x i8> %add, <2 x i8> <i8 1, i8 1>
+  ret <2 x i8> %s
+}
+
+define i8 @const_operands_dont_fold_negative(i8 %x) {
+; CHECK-LABEL: define i8 @const_operands_dont_fold_negative(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i8 [[X]], 16
+; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[X]], 8
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[CMP]], i8 [[ADD]], i8 25
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add nsw i8 %x, 16
+  %cmp = icmp sgt i8 %x, 8
+  %s = select i1 %cmp, i8 %add, i8 25
+  ret i8 %s
+}
+
+define i8 @add_with_poison_negative(i8 %x) {
+; CHECK-LABEL: define i8 @add_with_poison_negative(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    ret i8 25
+;
+  %add = add nsw i8 %x, poison
+  %cmp = icmp sgt i8 %x, 8
+  %s = select i1 %cmp, i8 %add, i8 25
+  ret i8 %s
+}
+
+define i8 @add_with_overflow_negative(i8 %x) {
+; CHECK-LABEL: define i8 @add_with_overflow_negative(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = add i8 [[X]], 9
+; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[X]], 119
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[CMP]], i8 -127, i8 [[ADD]]
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add i8 %x, 9
+  %cmp = icmp sle i8 120, %x
+  %s = select i1 %cmp, i8 -127, i8 %add
+  ret i8 %s
+}
+
+define <2 x i8> @vector_with_poison_negative(<2 x i8> %x) {
+; CHECK-LABEL: define <2 x i8> @vector_with_poison_negative(
+; CHECK-SAME: <2 x i8> [[X:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = xor <2 x i8> [[X]], <i8 1, i8 poison>
+; CHECK-NEXT:    [[CMP_INV:%.*]] = icmp slt <2 x i8> [[X]], <i8 1, i8 1>
+; CHECK-NEXT:    [[S:%.*]] = select <2 x i1> [[CMP_INV]], <2 x i8> <i8 1, i8 1>, <2 x i8> [[ADD]]
+; CHECK-NEXT:    ret <2 x i8> [[S]]
+;
+  %add = xor <2 x i8> %x, <i8 1, i8 poison>
+  %cmp = icmp sgt <2 x i8> %x, <i8 0, i8 0>
+  %s = select <2 x i1> %cmp, <2 x i8> %add, <2 x i8> <i8 1, i8 1>
+  ret <2 x i8> %s
+}
+
+define i8 @multi_use_bop_negative(i8 %x) {
+; CHECK-LABEL: define i8 @multi_use_bop_negative(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i8 [[X]], 16
+; CHECK-NEXT:    call void @use_byte(i8 [[ADD]])
+; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[X]], 7
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[CMP]], i8 24, i8 [[ADD]]
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add nsw i8 %x, 16
+  call void @use_byte(i8 %add)
+  %cmp = icmp sle i8 8, %x
+  %s = select i1 %cmp, i8 24, i8 %add
+  ret i8 %s
+}
+
+define half @float_negative(half %x) {
+; CHECK-LABEL: define half @float_negative(
+; CHECK-SAME: half [[X:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = fmul fast half [[X]], 0xH2E66
+; CHECK-NEXT:    [[CMP:%.*]] = fcmp ugt half [[X]], 0xH5640
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[CMP]], half 0xH4900, half [[ADD]]
+; CHECK-NEXT:    ret half [[S]]
+;
+  %add = fdiv fast half %x, 10.0
+  %cmp = fcmp ult half 100.0, %x
+  %s = select i1 %cmp, half 10.0, half %add
+  ret half %s
+}
+
+define i8 @poison_false_val_negative(i8 %x) {
+; CHECK-LABEL: define i8 @poison_false_val_negative(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i8 [[X]], 16
+; CHECK-NEXT:    ret i8 [[ADD]]
+;
+  %add = add nsw i8 %x, 16
+  %cmp = icmp sgt i8 %x, 8
+  %s = select i1 %cmp, i8 %add, i8 poison
+  ret i8 %s
+}
+
+define i8 @eq_negative(i8 %x) {
+; CHECK-LABEL: define i8 @eq_negative(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    ret i8 24
+;
+  %add = add nsw i8 %x, 16
+  %cmp = icmp eq i8 %x, 8
+  %s = select i1 %cmp, i8 %add, i8 24
+  ret i8 %s
+}
+
+define i8 @different_operands_negative(i8 %x, i8 %y) {
+; CHECK-LABEL: define i8 @different_operands_negative(
+; CHECK-SAME: i8 [[X:%.*]], i8 [[Y:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i8 [[X]], 16
+; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i8 [[Y]], 8
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[CMP]], i8 [[ADD]], i8 24
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add nsw i8 %x, 16
+  %cmp = icmp eq i8 %y, 8
+  %s = select i1 %cmp, i8 %add, i8 24
+  ret i8 %s
+}
+
+define i8 @mul_and_non_strict_predicate_negative(i8 %x) {
+; CHECK-LABEL: define i8 @mul_and_non_strict_predicate_negative(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = mul nsw i8 [[X]], 10
+; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[X]], 9
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[CMP]], i8 100, i8 [[ADD]]
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = mul nsw i8 %x, 10
+  %cmp = icmp sle i8 10, %x
+  %s = select i1 %cmp, i8 100, i8 %add
+  ret i8 %s
+}
+
+define i8 @non_specific_bop_operand_negative(i8 %x, i8 %y) {
+; CHECK-LABEL: define i8 @non_specific_bop_operand_negative(
+; CHECK-SAME: i8 [[X:%.*]], i8 [[Y:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i8 [[X]], 16
+; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[Y]], 8
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[CMP]], i8 [[ADD]], i8 24
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add nsw i8 %x, 16
+  %cmp = icmp sgt i8 %y, 8
+  %s = select i1 %cmp, i8 %add, i8 24
+  ret i8 %s
+}
+
+define i8 @non_const_cmp_operand_negative(i8 %x, i8 %y) {
+; CHECK-LABEL: define i8 @non_const_cmp_operand_negative(
+; CHECK-SAME: i8 [[X:%.*]], i8 [[Y:%.*]]) {
+; CHECK-NEXT:    [[ADD:%.*]] = add nsw i8 [[X]], 16
+; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[X]], [[Y]]
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[CMP]], i8 [[ADD]], i8 24
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %add = add nsw i8 %x, 16
+  %cmp = icmp sgt i8 %x, %y
+  %s = select i1 %cmp, i8 %add, i8 24
+  ret i8 %s
+}
+
+declare i8 @result()
+
+define i8 @non_binop_negative(i8 %x) {
+; CHECK-LABEL: define i8 @non_binop_negative(
+; CHECK-SAME: i8 [[X:%.*]]) {
+; CHECK-NEXT:    [[RESULT:%.*]] = call i8 @result()
+; CHECK-NEXT:    [[CMP:%.*]] = icmp sgt i8 [[X]], 16
+; CHECK-NEXT:    [[S:%.*]] = select i1 [[CMP]], i8 [[RESULT]], i8 24
+; CHECK-NEXT:    ret i8 [[S]]
+;
+  %result = call i8 @result()
+  %cmp = icmp sgt i8 %x, 16
+  %s = select i1 %cmp, i8 %result, i8 24
+  ret i8 %s
+}
diff --git a/llvm/test/Transforms/InstCombine/saturating-add-sub.ll b/llvm/test/Transforms/InstCombine/saturating-add-sub.ll
index 9236d96f59a55b..e050ca762dad46 100644
--- a/llvm/test/Transforms/InstCombine/saturating-add-sub.ll
+++ b/llvm/test/Transforms/InstCombine/saturating-add-sub.ll
@@ -1793,10 +1793,9 @@ define i32 @not_uadd_sat(i32 %x, i32 %y) {
 
 define i32 @not_uadd_sat2(i32 %x, i32 %y) {
 ; CHECK-LABEL: @not_uadd_sat2(
-; CHECK-NEXT:    [[A:%.*]] = add i32 [[X:%.*]], -2
-; CHECK-NEXT:    [[C:%.*]] = icmp ugt i32 [[X]], 1
-; CHECK-NEXT:    [[R:%.*]] = select i1 [[C]], i32 [[A]], i32 -1
-; CHECK-NEXT:    ret i32 [[R]]
+; CHECK-NEXT:    [[X:%.*]] = call i32 @llvm.umax.i32(i32 [[X1:%.*]], i32 1)
+; CHECK-NEXT:    [[A:%.*]] = add i32 [[X]], -2
+; CHECK-NEXT:    ret i32 [[A]]
 ;
   %a = add i32 %x, -2
   %c = icmp ugt i32 %x, 1
diff --git a/llvm/test/Transforms/InstCombine/select.ll b/llvm/test/Transforms/InstCombine/select.ll
index 9e92d227ca447f..51a193bf0578dc 100644
--- a/llvm/test/Transforms/InstCombine/select.ll
+++ b/llvm/test/Transforms/InstCombine/select.ll
@@ -2989,10 +2989,9 @@ define i8 @select_replacement_loop3(i32 noundef %x) {
 
 define i16 @select_replacement_loop4(i16 noundef %p_12) {
 ; CHECK-LABEL: @select_replacement_loop4(
-; CHECK-NEXT:    [[AND1:%.*]] = and i16 [[P_12:%.*]], 1
-; CHECK-NEXT:    [[CMP21:%.*]] = icmp ult i16 [[P_12]], 2
-; CHECK-NEXT:    [[AND3:%.*]] = select i1 [[CMP21]], i16 [[AND1]], i16 0
-; CHECK-NEXT:    ret i16 [[AND3]]
+; CHECK-NEXT:    [[P_12:%.*]] = call i16 @llvm.umin.i16(i16 [[P_13:%.*]], i16 2)
+; CHECK-NEXT:    [[AND1:%.*]] = and i16 [[P_12]], 1
+; CHECK-NEXT:    ret i16 [[AND1]]
 ;
   %cmp1 = icmp ult i16 %p_12, 2
   %and1 = and i16 %p_12, 1

Comment on lines 1919 to 1917
if (!((match(TrueVal, m_BinOp(BOp)) && match(FalseVal, m_Constant(C3))) ||
(match(FalseVal, m_BinOp(BOp)) && match(TrueVal, m_Constant(C3)))))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite convoluted because there's no m_c_Select() yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more idiomatic way to handle this is to have something like if (match(FalseVal, m_Constant())) { std::swap(TrueVal, FalseVal); Pred = ICmpInst::getInversePredicate(Pred); } and then you no longer have to deal with which operand the constant is in.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp Outdated Show resolved Hide resolved
; CHECK-NEXT: [[C:%.*]] = icmp ugt i32 [[X]], 1
; CHECK-NEXT: [[R:%.*]] = select i1 [[C]], i32 [[A]], i32 -1
; CHECK-NEXT: ret i32 [[R]]
; CHECK-NEXT: [[X:%.*]] = call i32 @llvm.umax.i32(i32 [[X1:%.*]], i32 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing fold: https://alive2.llvm.org/ce/z/LnDhkk

define i32 @src(i32 %x, i32 %y) {
  %a = add i32 %x, -2
  %c = icmp ult i32 %x, 2
  %r = select i1 %c, i32 -1, i32 %a
  ret i32 %r
}

define i32 @tgt(i32 %x) {
  %a = call i32 @llvm.umax(i32 %x, i32 1)
  %r = add i32 %a, -2
  ret i32 %r
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the feedback.

While this fold is similar to the one in this PR, I don't think this PR can fold the provided example because:

In the example, C1 = 2, C2 = -2, C3 = 1 and BOp = add.

This PR can fold only when C3 == BOp C1 C2 and this does not hold true for the example.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please directly create the final pattern with the min/max intrinsic. Creating temporary, non-canonical instructions and relying on another fold to deal with it is definitely going to bite us later.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp Outdated Show resolved Hide resolved
Comment on lines 1919 to 1917
if (!((match(TrueVal, m_BinOp(BOp)) && match(FalseVal, m_Constant(C3))) ||
(match(FalseVal, m_BinOp(BOp)) && match(TrueVal, m_Constant(C3)))))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more idiomatic way to handle this is to have something like if (match(FalseVal, m_Constant())) { std::swap(TrueVal, FalseVal); Pred = ICmpInst::getInversePredicate(Pred); } and then you no longer have to deal with which operand the constant is in.

llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp Outdated Show resolved Hide resolved
llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp Outdated Show resolved Hide resolved
@llvmbot llvmbot added clang Clang issues not falling into any other category llvm:analysis labels Nov 23, 2024
@veera-sivarajan
Copy link
Author

The more idiomatic way to handle this is to have something like if (match(FalseVal, m_Constant())) { std::swap(TrueVal, FalseVal); Pred = ICmpInst::getInversePredicate(Pred); } and then you no longer have to deal with which operand the constant is in.

Thank you for the detailed feedback.

This simplifies the pattern match but I still have to conditionally assign NewTrue and NewFalse because matchDecomposedSelectPattern() takes an ICmpInst and not a ICmpInst:Predicate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang Clang issues not falling into any other category llvm:analysis llvm:instcombine llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[InstCombine] Missed optimization : fold X > C2 ? X + C1 : C2 + C1 to max(X, C2) + C1
4 participants