[SCEV] Use or disjoint flag #74467

nikic · 2023-12-05T13:28:35Z

Use the disjoint flag to convert or to add instead of calling the haveNoCommonBitsSet() ValueTracking query. This ensures that we can reliably undo add -> or canonicalization, even in cases where the necessary information has been lost or is too complex to reinfer in SCEV.

I have updated the bulk of the test coverage to add the necessary disjoint flags in advance.

llvmbot · 2023-12-05T13:29:06Z

@llvm/pr-subscribers-llvm-analysis

Author: Nikita Popov (nikic)

Changes

Use the disjoint flag to convert or to add instead of calling the haveNoCommonBitsSet() ValueTracking query. This ensures that we can reliably undo add -> or canonicalization, even in cases where the necessary information has been lost or is too complex to reinfer in SCEV.

I have updated the bulk of the test coverage to add the necessary disjoint flags in advance.

Full diff: https://github.com/llvm/llvm-project/pull/74467.diff

2 Files Affected:

(modified) llvm/lib/Analysis/ScalarEvolution.cpp (+2-5)
(modified) llvm/test/Analysis/ScalarEvolution/add-like-or.ll (+38-4)

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 9670e5309b32f..451ae73dbd601 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -5223,11 +5223,8 @@ static std::optional<BinaryOp> MatchBinaryOp(Value *V, const DataLayout &DL,
     return BinaryOp(Op);
 
   case Instruction::Or: {
-    // LLVM loves to convert `add` of operands with no common bits
-    // into an `or`. But SCEV really doesn't deal with `or` that well,
-    // so try extra hard to recognize this `or` as an `add`.
-    if (haveNoCommonBitsSet(Op->getOperand(0), Op->getOperand(1),
-                            SimplifyQuery(DL, &DT, &AC, CxtI)))
+    // Convert or disjoint into add nuw nsw.
+    if (cast<PossiblyDisjointInst>(Op)->isDisjoint())
       return BinaryOp(Instruction::Add, Op->getOperand(0), Op->getOperand(1),
                       /*IsNSW=*/true, /*IsNUW=*/true);
     return BinaryOp(Op);
diff --git a/llvm/test/Analysis/ScalarEvolution/add-like-or.ll b/llvm/test/Analysis/ScalarEvolution/add-like-or.ll
index 38b6c44fdf356..86bb9058832b4 100644
--- a/llvm/test/Analysis/ScalarEvolution/add-like-or.ll
+++ b/llvm/test/Analysis/ScalarEvolution/add-like-or.ll
@@ -6,15 +6,49 @@ define i8 @or-of-constant-with-no-common-bits-set(i8 %x, i8 %y) {
 ; CHECK-NEXT:  Classifying expressions for: @or-of-constant-with-no-common-bits-set
 ; CHECK-NEXT:    %t0 = shl i8 %x, 2
 ; CHECK-NEXT:    --> (4 * %x) U: [0,-3) S: [-128,125)
-; CHECK-NEXT:    %r = or i8 %t0, 3
+; CHECK-NEXT:    %r = or disjoint i8 %t0, 3
 ; CHECK-NEXT:    --> (3 + (4 * %x))<nuw><nsw> U: [3,0) S: [-125,-128)
 ; CHECK-NEXT:  Determining loop execution counts for: @or-of-constant-with-no-common-bits-set
 ;
   %t0 = shl i8 %x, 2
-  %r = or i8 %t0, 3
+  %r = or disjoint i8 %t0, 3
   ret i8 %r
 }
 
+define i8 @or-disjoint(i8 %x, i8 %y) {
+; CHECK-LABEL: 'or-disjoint'
+; CHECK-NEXT:  Classifying expressions for: @or-disjoint
+; CHECK-NEXT:    %or = or disjoint i8 %x, %y
+; CHECK-NEXT:    --> (%x + %y) U: full-set S: full-set
+; CHECK-NEXT:  Determining loop execution counts for: @or-disjoint
+;
+  %or = or disjoint i8 %x, %y
+  ret i8 %or
+}
+
+define i8 @or-no-disjoint(i8 %x, i8 %y) {
+; CHECK-LABEL: 'or-no-disjoint'
+; CHECK-NEXT:  Classifying expressions for: @or-no-disjoint
+; CHECK-NEXT:    %or = or i8 %x, %y
+; CHECK-NEXT:    --> %or U: full-set S: full-set
+; CHECK-NEXT:  Determining loop execution counts for: @or-no-disjoint
+;
+  %or = or i8 %x, %y
+  ret i8 %or
+}
+
+; FIXME: We could add nuw nsw flags here.
+define noundef i8 @or-disjoint-transfer-flags(i8 %x, i8 %y) {
+; CHECK-LABEL: 'or-disjoint-transfer-flags'
+; CHECK-NEXT:  Classifying expressions for: @or-disjoint-transfer-flags
+; CHECK-NEXT:    %or = or disjoint i8 %x, %y
+; CHECK-NEXT:    --> (%x + %y) U: full-set S: full-set
+; CHECK-NEXT:  Determining loop execution counts for: @or-disjoint-transfer-flags
+;
+  %or = or disjoint i8 %x, %y
+  ret i8 %or
+}
+
 define void @mask-high(i64 %arg, ptr dereferenceable(4) %arg1) {
 ; CHECK-LABEL: 'mask-high'
 ; CHECK-NEXT:  Classifying expressions for: @mask-high
@@ -24,7 +58,7 @@ define void @mask-high(i64 %arg, ptr dereferenceable(4) %arg1) {
 ; CHECK-NEXT:    --> (sext i32 %i to i64) U: [-2147483648,2147483648) S: [-2147483648,2147483648)
 ; CHECK-NEXT:    %i3 = and i64 %arg, -16
 ; CHECK-NEXT:    --> (16 * (%arg /u 16))<nuw> U: [0,-15) S: [-9223372036854775808,9223372036854775793)
-; CHECK-NEXT:    %i4 = or i64 1, %i3
+; CHECK-NEXT:    %i4 = or disjoint i64 1, %i3
 ; CHECK-NEXT:    --> (1 + (16 * (%arg /u 16))<nuw>)<nuw><nsw> U: [1,-14) S: [-9223372036854775807,9223372036854775794)
 ; CHECK-NEXT:    %i7 = phi i64 [ %i4, %bb ], [ %i8, %bb6 ]
 ; CHECK-NEXT:    --> {(1 + (16 * (%arg /u 16))<nuw>)<nuw><nsw>,+,1}<%bb6> U: full-set S: full-set Exits: ((sext i32 %i to i64) smax (1 + (16 * (%arg /u 16))<nuw>)<nuw><nsw>) LoopDispositions: { %bb6: Computable }
@@ -42,7 +76,7 @@ bb:
   %i = load i32, ptr %arg1, align 4
   %i2 = sext i32 %i to i64
   %i3 = and i64 %arg, -16
-  %i4 = or i64 1, %i3
+  %i4 = or disjoint i64 1, %i3
   %i5 = icmp sgt i64 %i4, %i2
   br i1 %i5, label %bb10, label %bb6

nikic · 2023-12-05T13:30:39Z

llvm/test/Analysis/ScalarEvolution/add-like-or.ll

I added some tests to make sure we don't incorrectly add the nuw/nsw flags if poison doesn't imply UB, but found that we actually never transfer these. This is because the NSW/NUW flags set in MatchBinaryOp end up being ignored in favor of the ones on the original instruction. I'll fix that in a followup.

Use the disjoint flag to convert or to add instead of calling the haveNoCommonBitsSet() ValueTracking query. This ensures that we can reliably undo add -> or canonicalization, even in cases where the necessary information has been lost or is too complex to reinfer in SCEV. I have updated the bulk of the test coverage to add the necessary disjoint flags in advance.

preames

LGTM

…n OR instructions Prior to patch (llvm#74467), LLVM used the haveNoCommonBitsSet() function to determine whether OR instructions had no common bits. If they didn't, the OR instructions were transformed into ADD instructions, allowing the LoadStoreVectorizer (LSV) to vectorize them. After the patch was applied, the implementation switched to using a disjoint flag to indicate this condition. However, these OR instructions were no longer recognized as disjoint and were not converted into ADD instructions. This oversight led to a regression where LSV failed to vectorize these instructions. This commit addresses the regression by updating LSV to identify if the OR instructions are disjoint, and if so, to transform the IR as follows to continue vectorization: Original IR: %linear_index3 = or i32 %linear_index_plus_base.fr, 3 %linear_index2 = or i32 %linear_index_plus_base.fr, 2 %linear_index1 = or i32 %linear_index_plus_base.fr, 1 Transformed IR: %linear_index3 = or disjoint i32 %linear_index_plus_base.fr, 3 %linear_index2 = or disjoint i32 %linear_index_plus_base.fr, 2 %linear_index1 = or disjoint i32 %linear_index_plus_base.fr, 1 This modification will enable LSV to recognize these instructions and restore vectorization opportunities.

Summary: This commit resolves a regression in the LoadStoreVectorizer (LSV) due to a previous change in how OR instructions were handled. The regression prevented LSV from recognizing and vectorizing OR instructions marked as disjoint. Background: A historical patch ([https://github.com/llvm/llvm-project/pull/74467](https://github.com/llvm/llvm-project/pull/74467)) shifted the identification of disjoint OR instructions from using the haveNoCommonBitsSet() function to a disjoint flag within the instruction. However, this change inadvertently led to missed vectorization opportunities because LSV was no longer converting disjoint OR instructions to ADD instructions, as it did and not recognize the new flag. This patch updates the Scalar Evolution (SCEV) analysis to identify disjoint OR instructions based on the disjoint flag. By doing this, the instructions will be correctly interpreted and the LSV can restore lost vectorization opportunities. Test plan: Added a new phase-ordering test that confirms the regression at -O1, -O2, and -O3 optimization levels, where the original OR instructions were not identified as disjoint and thus not vectorized. The test illustrates that with the new SCEV analysis, the transformed IR will properly have disjoint OR instructions vectorized by LSV. Details: The modified test disjoint_or_vectorizer.ll showcases the restored vectorization with the following transformed IR: Original IR: %linear_index3 = or i32 %linear_index_plus_base.fr, 3 %linear_index2 = or i32 %linear_index_plus_base, 2 %linear_index1 = or i32 %linear_index_plus_base, 1 Transformed IR with disjoint flag: %linear_index3 = or disjoint i32 %linear_index_plus_base, 3 %linear_index2 = or disjoint i32 %linear_index_plus_base, 2 %linear_index1 = or disjoint i32 %linear_index_plus_base, 1 With this change, we ensure the SCEV analysis can correctly handle and optimize disjoint OR instructions. Signed-off-by: lihao217 <[email protected]>

nikic requested review from preames and topperc December 5, 2023 13:28

llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Dec 5, 2023

nikic commented Dec 5, 2023

View reviewed changes

nikic force-pushed the scev-or-disjoint branch from cd90987 to 415236a Compare December 5, 2023 13:37

nikic mentioned this pull request Dec 5, 2023

[ValueTracking] Add dominating condition support in computeKnownBits() #73662

Merged

preames approved these changes Dec 5, 2023

View reviewed changes

nikic merged commit ff0e4fb into llvm:main Dec 5, 2023

krzysz00 mentioned this pull request Apr 4, 2024

"Interference" assection in SplitKit - bisected to a SCEV change and isolated to AMDGPU division expansion #87721

Closed

This was referenced Jun 24, 2024

[LSV] Enhance LoadStoreVectorizer to Handle Disjoint Flag in OR Instructions and Restore Vectorization Opportunities #96495

Open

[LSV] Enhance LoadStoreVectorizer to Handle Disjoint Flag in OR Instructions and Restore Vectorization Opportunities #96502

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[SCEV] Use or disjoint flag #74467

[SCEV] Use or disjoint flag #74467

Uh oh!

nikic commented Dec 5, 2023

Uh oh!

llvmbot commented Dec 5, 2023

Uh oh!

nikic Dec 5, 2023

Uh oh!

preames left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

[SCEV] Use or disjoint flag #74467

[SCEV] Use or disjoint flag #74467

Uh oh!

Conversation

nikic commented Dec 5, 2023

Uh oh!

llvmbot commented Dec 5, 2023

Uh oh!

nikic Dec 5, 2023

Choose a reason for hiding this comment

Uh oh!

preames left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants