Skip to content

[HashRecognize] Rewrite arePHIsIntertwined #144878

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jul 2, 2025

Conversation

artagnon
Copy link
Contributor

@artagnon artagnon commented Jun 19, 2025

The test crc8.le.tc16 is a valid CRC algorithm, but isn't recognized as such due to a buggy arePHIsIntertwined, which is asymmetric in its PHINode arguments. There is also a fundamentally correctness issue: the core functionality is to match a XOR that's a recurrence in both PHI nodes, ignoring casts, but the user of the XOR is never checked. Rewrite and rename the function.

crc8.le.tc16 is still not recognized as a valid CRC algorithm, due to an incorrect check for loop iterations exceeding the bitwidth of the result: in reality, it should not exceed the bitwidth of LHSAux, but we leave this fix to a follow-up.

@artagnon artagnon requested review from nikic and pfusik June 19, 2025 11:29
@llvmbot llvmbot added llvm:ir llvm:analysis Includes value tracking, cost tables and constant folding labels Jun 19, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 19, 2025

@llvm/pr-subscribers-llvm-ir

Author: Ramkumar Ramachandra (artagnon)

Changes

The test crc8.le.tc16 is a valid CRC algorithm, but isn't recognized as such due to a buggy arePHIsIntertwined, which is asymmetric in its PHINode arguments. Another issue is that arePHIsIntertwined is unnecessarily complicated: the core functionality is to match a BinOp that's a recurrence in both PHI nodes, ignoring zext and trunc casts. Hence, rewrite it for clarity, and make it symmetric in both PHINode arguments.

crc8.le.tc16 is still not recognized as a valid CRC algorithm, due to an incorrect check for loop iterations exceeding the bitwidth of the result: in reality, it should not exceed the bitwidth of LHSAux, but we leave this fix to a follow-up.


Full diff: https://github.com/llvm/llvm-project/pull/144878.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/PatternMatch.h (+7)
  • (modified) llvm/lib/Analysis/HashRecognize.cpp (+29-37)
  • (modified) llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll (+60-6)
diff --git a/llvm/include/llvm/IR/PatternMatch.h b/llvm/include/llvm/IR/PatternMatch.h
index 2eaa7d0faabc1..2ff24a8956326 100644
--- a/llvm/include/llvm/IR/PatternMatch.h
+++ b/llvm/include/llvm/IR/PatternMatch.h
@@ -2160,6 +2160,13 @@ m_ZExtOrSelf(const OpTy &Op) {
   return m_CombineOr(m_ZExt(Op), Op);
 }
 
+template <typename OpTy>
+inline match_combine_or<match_combine_or<CastInst_match<OpTy, ZExtInst>, OpTy>,
+                        CastInst_match<OpTy, TruncInst>>
+m_ZExtOrTruncOrSelf(const OpTy &Op) {
+  return m_CombineOr(m_ZExtOrSelf(Op), m_Trunc(Op));
+}
+
 template <typename OpTy>
 inline match_combine_or<CastInst_match<OpTy, SExtInst>, OpTy>
 m_SExtOrSelf(const OpTy &Op) {
diff --git a/llvm/lib/Analysis/HashRecognize.cpp b/llvm/lib/Analysis/HashRecognize.cpp
index d11602f921872..ece53e03239d2 100644
--- a/llvm/lib/Analysis/HashRecognize.cpp
+++ b/llvm/lib/Analysis/HashRecognize.cpp
@@ -497,45 +497,37 @@ CRCTable HashRecognize::genSarwateTable(const APInt &GenPoly,
   return Table;
 }
 
-/// Checks if \p Reference is reachable from \p Needle on the use-def chain, and
-/// that there are no stray PHI nodes while digging the use-def chain. \p
-/// BOToMatch is a CRC peculiarity: at least one of the Users of Needle needs to
-/// match this OpCode, which is XOR for CRC.
+/// Checks that \p Needle and \p Reference are used together in a
+/// BinaryOperator, which is a recurrence over both, ignoring zext and trunc.
+/// Additionally checks that the BinaryOperator has opcode \p BOToMatch, which
+/// is XOR in the case of CRC. In other words, it checks for the following
+/// pattern:
+///
+/// loop:
+///   %needle = phi [_, %entry], [%needle.next, %loop]
+///   %reference = phi [_, %entry], [%reference.next, %loop]
+///   ...
+///   _ = BOTOMatch ((trunc|zext|self) %needle) ((trunc|zext|self) %reference)
+///
 static bool arePHIsIntertwined(
     const PHINode *Needle, const PHINode *Reference, const Loop &L,
     Instruction::BinaryOps BOToMatch = Instruction::BinaryOpsEnd) {
-  // Initialize the worklist with Users of the Needle.
-  SmallVector<const Instruction *> Worklist;
-  for (const User *U : Needle->users()) {
-    if (auto *UI = dyn_cast<Instruction>(U))
-      if (L.contains(UI))
-        Worklist.push_back(UI);
-  }
-
-  // BOToMatch is usually XOR for CRC.
-  if (BOToMatch != Instruction::BinaryOpsEnd) {
-    if (count_if(Worklist, [BOToMatch](const Instruction *I) {
-          return I->getOpcode() == BOToMatch;
-        }) != 1)
-      return false;
-  }
-
-  while (!Worklist.empty()) {
-    const Instruction *I = Worklist.pop_back_val();
-
-    // Since Needle is never pushed onto the Worklist, I must either be the
-    // Reference PHI node (in which case we're done), or a stray PHI node (in
-    // which case we abort).
-    if (isa<PHINode>(I))
-      return I == Reference;
-
-    for (const Use &U : I->operands())
-      if (auto *UI = dyn_cast<Instruction>(U))
-        // Don't push Needle back onto the Worklist.
-        if (UI != Needle && L.contains(UI))
-          Worklist.push_back(UI);
-  }
-  return false;
+  return any_of(ArrayRef({Needle, Reference}), [&](const PHINode *S) {
+    return count_if(S->users(), [&](const User *U) {
+             auto *BO = dyn_cast<BinaryOperator>(U);
+             if (!BO)
+               return false;
+             if (BOToMatch != Instruction::BinaryOpsEnd &&
+                 BO->getOpcode() != BOToMatch)
+               return false;
+             return all_of(BO->operands(), [Needle, Reference](const Use &U) {
+               return match(
+                   U.get(),
+                   m_CombineOr(m_ZExtOrTruncOrSelf(m_Specific(Reference)),
+                               m_ZExtOrTruncOrSelf(m_Specific(Needle))));
+             });
+           }) == 1;
+  });
 }
 
 // Recognizes a multiplication or division by the constant two, using SCEV. By
@@ -588,7 +580,7 @@ HashRecognize::recognizeCRC() const {
       return "Loop with non-unit bitshifts";
     if (!arePHIsIntertwined(SimpleRecurrence.Phi, ConditionalRecurrence.Phi, L,
                             Instruction::BinaryOps::Xor))
-      return "Simple recurrence doesn't use conditional recurrence with XOR";
+      return "Recurrences not intertwined with XOR";
   }
 
   // Make sure that the computed value is used in the exit block: this should be
diff --git a/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll b/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
index 0366684a13b54..c42985037d83a 100644
--- a/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
+++ b/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
@@ -144,6 +144,34 @@ exit:                                              ; preds = %loop
   ret i16 %crc.next
 }
 
+define i8 @crc8.le.tc16(i16 %msg, i8 %checksum) {
+; CHECK-LABEL: 'crc8.le.tc16'
+; CHECK-NEXT:  Did not find a hash algorithm
+; CHECK-NEXT:  Reason: Loop iterations exceed bitwidth of result
+;
+entry:
+  br label %loop
+
+loop:                                              ; preds = %loop, %entry
+  %iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
+  %crc = phi i8 [ %checksum, %entry ], [ %crc.next, %loop ]
+  %data = phi i16 [ %msg, %entry ], [ %data.next, %loop ]
+  %data.trunc = trunc i16 %data to i8
+  %xor.crc.data = xor i8 %crc, %data.trunc
+  %and.crc.data = and i8 %xor.crc.data, 1
+  %data.next = lshr i16 %data, 1
+  %check.sb = icmp eq i8 %and.crc.data, 0
+  %crc.lshr = lshr i8 %crc, 1
+  %crc.xor = xor i8 %crc.lshr, 29
+  %crc.next = select i1 %check.sb, i8 %crc.lshr, i8 %crc.xor
+  %iv.next = add nuw nsw i8 %iv, 1
+  %exit.cond = icmp samesign ult i8 %iv, 15
+  br i1 %exit.cond, label %loop, label %exit
+
+exit:                                              ; preds = %loop
+  ret i8 %crc.next
+}
+
 define i16 @crc16.be.tc8.crc.init.li(i16 %checksum, i8 %msg) {
 ; CHECK-LABEL: 'crc16.be.tc8.crc.init.li'
 ; CHECK-NEXT:  Found big-endian CRC-16 loop with trip count 8
@@ -601,7 +629,7 @@ exit:                                              ; preds = %loop
 define i16 @not.crc.wrong.sb.check.const(i8 %msg, i16 %checksum) {
 ; CHECK-LABEL: 'not.crc.wrong.sb.check.const'
 ; CHECK-NEXT:  Did not find a hash algorithm
-; CHECK-NEXT:  Reason: Simple recurrence doesn't use conditional recurrence with XOR
+; CHECK-NEXT:  Reason: Bad RHS of significant-bit-check
 ;
 entry:
   br label %loop
@@ -610,9 +638,8 @@ loop:                                              ; preds = %loop, %entry
   %iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
   %data = phi i8 [ %msg, %entry ], [ %data.next, %loop ]
   %crc = phi i16 [ %checksum, %entry ], [ %crc.next, %loop ]
-  %crc.lshr = lshr i16 %crc, 8
   %data.ext = zext i8 %data to i16
-  %xor.crc.data = xor i16 %crc.lshr, %data.ext
+  %xor.crc.data = xor i16 %crc, %data.ext
   %check.sb = icmp samesign ult i16 %xor.crc.data, 128
   %crc.shl = shl i16 %crc, 1
   %crc.xor = xor i16 %crc.shl, 258
@@ -838,10 +865,37 @@ exit:                                              ; preds = %loop
   ret i16 %crc.next
 }
 
+define i16 @not.crc.bad.cast(i8 %msg, i16 %checksum) {
+; CHECK-LABEL: 'not.crc.bad.cast'
+; CHECK-NEXT:  Did not find a hash algorithm
+; CHECK-NEXT:  Reason: Expected bottom 8 bits zero (????????00001011)
+;
+entry:
+  br label %loop
+
+loop:                                              ; preds = %loop, %entry
+  %iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
+  %data = phi i8 [ %msg, %entry ], [ %data.next, %loop ]
+  %crc = phi i16 [ %checksum, %entry ], [ %crc.next, %loop ]
+  %data.ext = zext i8 %data to i16
+  %xor.crc.data = xor i16 %crc, %data.ext
+  %check.sb = icmp slt i16 %xor.crc.data, 0
+  %crc.shl = shl i16 %crc, 1
+  %crc.xor = xor i16 %crc.shl, 29
+  %crc.next = select i1 %check.sb, i16 %crc.shl, i16 %crc.xor
+  %data.next = shl i8 %data, 1
+  %iv.next = add nuw nsw i8 %iv, 1
+  %exit.cond = icmp samesign ult i8 %iv, 7
+  br i1 %exit.cond, label %loop, label %exit
+
+exit:                                              ; preds = %loop
+  ret i16 %crc.next
+}
+
 define i32 @not.crc.dead.msg.bad.use(i32 %checksum, i32 %msg) {
 ; CHECK-LABEL: 'not.crc.dead.msg.bad.use'
 ; CHECK-NEXT:  Did not find a hash algorithm
-; CHECK-NEXT:  Reason: Simple recurrence doesn't use conditional recurrence with XOR
+; CHECK-NEXT:  Reason: Recurrences not intertwined with XOR
 ;
 entry:
   br label %loop
@@ -869,7 +923,7 @@ exit:                                              ; preds = %loop
 define i16 @not.crc.dead.msg.no.use(i8 %msg, i16 %checksum) {
 ; CHECK-LABEL: 'not.crc.dead.msg.no.use'
 ; CHECK-NEXT:  Did not find a hash algorithm
-; CHECK-NEXT:  Reason: Simple recurrence doesn't use conditional recurrence with XOR
+; CHECK-NEXT:  Reason: Recurrences not intertwined with XOR
 ;
 entry:
   br label %loop
@@ -898,7 +952,7 @@ exit:                                              ; preds = %loop
 define i32 @not.crc.dead.msg.wrong.op(i32 %checksum, i32 %msg) {
 ; CHECK-LABEL: 'not.crc.dead.msg.wrong.op'
 ; CHECK-NEXT:  Did not find a hash algorithm
-; CHECK-NEXT:  Reason: Simple recurrence doesn't use conditional recurrence with XOR
+; CHECK-NEXT:  Reason: Recurrences not intertwined with XOR
 ;
 entry:
   br label %loop

@llvmbot
Copy link
Member

llvmbot commented Jun 19, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Ramkumar Ramachandra (artagnon)

Changes

The test crc8.le.tc16 is a valid CRC algorithm, but isn't recognized as such due to a buggy arePHIsIntertwined, which is asymmetric in its PHINode arguments. Another issue is that arePHIsIntertwined is unnecessarily complicated: the core functionality is to match a BinOp that's a recurrence in both PHI nodes, ignoring zext and trunc casts. Hence, rewrite it for clarity, and make it symmetric in both PHINode arguments.

crc8.le.tc16 is still not recognized as a valid CRC algorithm, due to an incorrect check for loop iterations exceeding the bitwidth of the result: in reality, it should not exceed the bitwidth of LHSAux, but we leave this fix to a follow-up.


Full diff: https://github.com/llvm/llvm-project/pull/144878.diff

3 Files Affected:

  • (modified) llvm/include/llvm/IR/PatternMatch.h (+7)
  • (modified) llvm/lib/Analysis/HashRecognize.cpp (+29-37)
  • (modified) llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll (+60-6)
diff --git a/llvm/include/llvm/IR/PatternMatch.h b/llvm/include/llvm/IR/PatternMatch.h
index 2eaa7d0faabc1..2ff24a8956326 100644
--- a/llvm/include/llvm/IR/PatternMatch.h
+++ b/llvm/include/llvm/IR/PatternMatch.h
@@ -2160,6 +2160,13 @@ m_ZExtOrSelf(const OpTy &Op) {
   return m_CombineOr(m_ZExt(Op), Op);
 }
 
+template <typename OpTy>
+inline match_combine_or<match_combine_or<CastInst_match<OpTy, ZExtInst>, OpTy>,
+                        CastInst_match<OpTy, TruncInst>>
+m_ZExtOrTruncOrSelf(const OpTy &Op) {
+  return m_CombineOr(m_ZExtOrSelf(Op), m_Trunc(Op));
+}
+
 template <typename OpTy>
 inline match_combine_or<CastInst_match<OpTy, SExtInst>, OpTy>
 m_SExtOrSelf(const OpTy &Op) {
diff --git a/llvm/lib/Analysis/HashRecognize.cpp b/llvm/lib/Analysis/HashRecognize.cpp
index d11602f921872..ece53e03239d2 100644
--- a/llvm/lib/Analysis/HashRecognize.cpp
+++ b/llvm/lib/Analysis/HashRecognize.cpp
@@ -497,45 +497,37 @@ CRCTable HashRecognize::genSarwateTable(const APInt &GenPoly,
   return Table;
 }
 
-/// Checks if \p Reference is reachable from \p Needle on the use-def chain, and
-/// that there are no stray PHI nodes while digging the use-def chain. \p
-/// BOToMatch is a CRC peculiarity: at least one of the Users of Needle needs to
-/// match this OpCode, which is XOR for CRC.
+/// Checks that \p Needle and \p Reference are used together in a
+/// BinaryOperator, which is a recurrence over both, ignoring zext and trunc.
+/// Additionally checks that the BinaryOperator has opcode \p BOToMatch, which
+/// is XOR in the case of CRC. In other words, it checks for the following
+/// pattern:
+///
+/// loop:
+///   %needle = phi [_, %entry], [%needle.next, %loop]
+///   %reference = phi [_, %entry], [%reference.next, %loop]
+///   ...
+///   _ = BOTOMatch ((trunc|zext|self) %needle) ((trunc|zext|self) %reference)
+///
 static bool arePHIsIntertwined(
     const PHINode *Needle, const PHINode *Reference, const Loop &L,
     Instruction::BinaryOps BOToMatch = Instruction::BinaryOpsEnd) {
-  // Initialize the worklist with Users of the Needle.
-  SmallVector<const Instruction *> Worklist;
-  for (const User *U : Needle->users()) {
-    if (auto *UI = dyn_cast<Instruction>(U))
-      if (L.contains(UI))
-        Worklist.push_back(UI);
-  }
-
-  // BOToMatch is usually XOR for CRC.
-  if (BOToMatch != Instruction::BinaryOpsEnd) {
-    if (count_if(Worklist, [BOToMatch](const Instruction *I) {
-          return I->getOpcode() == BOToMatch;
-        }) != 1)
-      return false;
-  }
-
-  while (!Worklist.empty()) {
-    const Instruction *I = Worklist.pop_back_val();
-
-    // Since Needle is never pushed onto the Worklist, I must either be the
-    // Reference PHI node (in which case we're done), or a stray PHI node (in
-    // which case we abort).
-    if (isa<PHINode>(I))
-      return I == Reference;
-
-    for (const Use &U : I->operands())
-      if (auto *UI = dyn_cast<Instruction>(U))
-        // Don't push Needle back onto the Worklist.
-        if (UI != Needle && L.contains(UI))
-          Worklist.push_back(UI);
-  }
-  return false;
+  return any_of(ArrayRef({Needle, Reference}), [&](const PHINode *S) {
+    return count_if(S->users(), [&](const User *U) {
+             auto *BO = dyn_cast<BinaryOperator>(U);
+             if (!BO)
+               return false;
+             if (BOToMatch != Instruction::BinaryOpsEnd &&
+                 BO->getOpcode() != BOToMatch)
+               return false;
+             return all_of(BO->operands(), [Needle, Reference](const Use &U) {
+               return match(
+                   U.get(),
+                   m_CombineOr(m_ZExtOrTruncOrSelf(m_Specific(Reference)),
+                               m_ZExtOrTruncOrSelf(m_Specific(Needle))));
+             });
+           }) == 1;
+  });
 }
 
 // Recognizes a multiplication or division by the constant two, using SCEV. By
@@ -588,7 +580,7 @@ HashRecognize::recognizeCRC() const {
       return "Loop with non-unit bitshifts";
     if (!arePHIsIntertwined(SimpleRecurrence.Phi, ConditionalRecurrence.Phi, L,
                             Instruction::BinaryOps::Xor))
-      return "Simple recurrence doesn't use conditional recurrence with XOR";
+      return "Recurrences not intertwined with XOR";
   }
 
   // Make sure that the computed value is used in the exit block: this should be
diff --git a/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll b/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
index 0366684a13b54..c42985037d83a 100644
--- a/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
+++ b/llvm/test/Analysis/HashRecognize/cyclic-redundancy-check.ll
@@ -144,6 +144,34 @@ exit:                                              ; preds = %loop
   ret i16 %crc.next
 }
 
+define i8 @crc8.le.tc16(i16 %msg, i8 %checksum) {
+; CHECK-LABEL: 'crc8.le.tc16'
+; CHECK-NEXT:  Did not find a hash algorithm
+; CHECK-NEXT:  Reason: Loop iterations exceed bitwidth of result
+;
+entry:
+  br label %loop
+
+loop:                                              ; preds = %loop, %entry
+  %iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
+  %crc = phi i8 [ %checksum, %entry ], [ %crc.next, %loop ]
+  %data = phi i16 [ %msg, %entry ], [ %data.next, %loop ]
+  %data.trunc = trunc i16 %data to i8
+  %xor.crc.data = xor i8 %crc, %data.trunc
+  %and.crc.data = and i8 %xor.crc.data, 1
+  %data.next = lshr i16 %data, 1
+  %check.sb = icmp eq i8 %and.crc.data, 0
+  %crc.lshr = lshr i8 %crc, 1
+  %crc.xor = xor i8 %crc.lshr, 29
+  %crc.next = select i1 %check.sb, i8 %crc.lshr, i8 %crc.xor
+  %iv.next = add nuw nsw i8 %iv, 1
+  %exit.cond = icmp samesign ult i8 %iv, 15
+  br i1 %exit.cond, label %loop, label %exit
+
+exit:                                              ; preds = %loop
+  ret i8 %crc.next
+}
+
 define i16 @crc16.be.tc8.crc.init.li(i16 %checksum, i8 %msg) {
 ; CHECK-LABEL: 'crc16.be.tc8.crc.init.li'
 ; CHECK-NEXT:  Found big-endian CRC-16 loop with trip count 8
@@ -601,7 +629,7 @@ exit:                                              ; preds = %loop
 define i16 @not.crc.wrong.sb.check.const(i8 %msg, i16 %checksum) {
 ; CHECK-LABEL: 'not.crc.wrong.sb.check.const'
 ; CHECK-NEXT:  Did not find a hash algorithm
-; CHECK-NEXT:  Reason: Simple recurrence doesn't use conditional recurrence with XOR
+; CHECK-NEXT:  Reason: Bad RHS of significant-bit-check
 ;
 entry:
   br label %loop
@@ -610,9 +638,8 @@ loop:                                              ; preds = %loop, %entry
   %iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
   %data = phi i8 [ %msg, %entry ], [ %data.next, %loop ]
   %crc = phi i16 [ %checksum, %entry ], [ %crc.next, %loop ]
-  %crc.lshr = lshr i16 %crc, 8
   %data.ext = zext i8 %data to i16
-  %xor.crc.data = xor i16 %crc.lshr, %data.ext
+  %xor.crc.data = xor i16 %crc, %data.ext
   %check.sb = icmp samesign ult i16 %xor.crc.data, 128
   %crc.shl = shl i16 %crc, 1
   %crc.xor = xor i16 %crc.shl, 258
@@ -838,10 +865,37 @@ exit:                                              ; preds = %loop
   ret i16 %crc.next
 }
 
+define i16 @not.crc.bad.cast(i8 %msg, i16 %checksum) {
+; CHECK-LABEL: 'not.crc.bad.cast'
+; CHECK-NEXT:  Did not find a hash algorithm
+; CHECK-NEXT:  Reason: Expected bottom 8 bits zero (????????00001011)
+;
+entry:
+  br label %loop
+
+loop:                                              ; preds = %loop, %entry
+  %iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
+  %data = phi i8 [ %msg, %entry ], [ %data.next, %loop ]
+  %crc = phi i16 [ %checksum, %entry ], [ %crc.next, %loop ]
+  %data.ext = zext i8 %data to i16
+  %xor.crc.data = xor i16 %crc, %data.ext
+  %check.sb = icmp slt i16 %xor.crc.data, 0
+  %crc.shl = shl i16 %crc, 1
+  %crc.xor = xor i16 %crc.shl, 29
+  %crc.next = select i1 %check.sb, i16 %crc.shl, i16 %crc.xor
+  %data.next = shl i8 %data, 1
+  %iv.next = add nuw nsw i8 %iv, 1
+  %exit.cond = icmp samesign ult i8 %iv, 7
+  br i1 %exit.cond, label %loop, label %exit
+
+exit:                                              ; preds = %loop
+  ret i16 %crc.next
+}
+
 define i32 @not.crc.dead.msg.bad.use(i32 %checksum, i32 %msg) {
 ; CHECK-LABEL: 'not.crc.dead.msg.bad.use'
 ; CHECK-NEXT:  Did not find a hash algorithm
-; CHECK-NEXT:  Reason: Simple recurrence doesn't use conditional recurrence with XOR
+; CHECK-NEXT:  Reason: Recurrences not intertwined with XOR
 ;
 entry:
   br label %loop
@@ -869,7 +923,7 @@ exit:                                              ; preds = %loop
 define i16 @not.crc.dead.msg.no.use(i8 %msg, i16 %checksum) {
 ; CHECK-LABEL: 'not.crc.dead.msg.no.use'
 ; CHECK-NEXT:  Did not find a hash algorithm
-; CHECK-NEXT:  Reason: Simple recurrence doesn't use conditional recurrence with XOR
+; CHECK-NEXT:  Reason: Recurrences not intertwined with XOR
 ;
 entry:
   br label %loop
@@ -898,7 +952,7 @@ exit:                                              ; preds = %loop
 define i32 @not.crc.dead.msg.wrong.op(i32 %checksum, i32 %msg) {
 ; CHECK-LABEL: 'not.crc.dead.msg.wrong.op'
 ; CHECK-NEXT:  Did not find a hash algorithm
-; CHECK-NEXT:  Reason: Simple recurrence doesn't use conditional recurrence with XOR
+; CHECK-NEXT:  Reason: Recurrences not intertwined with XOR
 ;
 entry:
   br label %loop

Copy link
Contributor

@pfusik pfusik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This checks if Needle and Reference are used in a BinaryOperator, one directly, one indirectly: through zext or trunc.
What's the point if we don't check what the BinaryOperator is later used for?
There's also no check for trunc possibly stripping significant bits.

Also, what's the use case for BoToMatch == Instruction::BinaryOpsEnd ?

@artagnon
Copy link
Contributor Author

This checks if Needle and Reference are used in a BinaryOperator, one directly, one indirectly: through zext or trunc. What's the point if we don't check what the BinaryOperator is later used for?

Our real check for whether or not the algorithm is a CRC-loop idiom is the KnownBits propagation. This check is mainly in place so that we don't report a faulty LHSAux: see tests not.crc.dead.msg.bad.use and not.crc.dead.msg.no.use.

There's also no check for trunc possibly stripping significant bits.

The check isn't a correctness check: there is really no way to ensure that the zext/trunc are correct at this point, and we instead rely on the KnownBits propagation for correctness, keeping a weak check at this point.

Also, what's the use case for BoToMatch == Instruction::BinaryOpsEnd ?

In retrospect, this generality is unnecessary: will strip, thanks.

@artagnon artagnon force-pushed the hr-phisintertwined branch from 201d906 to 55b69f9 Compare June 23, 2025 11:16
artagnon added 2 commits June 30, 2025 16:11
The test crc8.le.tc16 is a valid CRC algorithm, but isn't recognized as
such due to a buggy arePHIsIntertwined, which is asymmetric in its PHI
node arguments. Another issue is that arePHIsIntertwined is
unnecessarily complicated: the core functionality is to match a BinOp
that's a recurrence in both PHI nodes, ignoring zext and trunc casts.
Hence, rewrite it for clarity, and make it symmetric in both PHI
operands.

crc8.le.tc16 is still not recognized as a valid CRC algorithm, due to an
incorrect check for loop iterations exceeding the bitwidth of the
result: in reality, it should not exceed the bitwidth of LHSAux, but we
leave this fix to a follow-up.
@artagnon artagnon force-pushed the hr-phisintertwined branch from 11bebd7 to 22cf537 Compare June 30, 2025 15:11
@artagnon
Copy link
Contributor Author

Gentle ping.

Copy link
Contributor

@pfusik pfusik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for my late response. Only today I had more time to analyze the following doubt.

This checks if Needle and Reference are used in a BinaryOperator, one directly, one indirectly: through zext or trunc. What's the point if we don't check what the BinaryOperator is later used for?

Our real check for whether or not the algorithm is a CRC-loop idiom is the KnownBits propagation. This check is mainly in place so that we don't report a faulty LHSAux: see tests not.crc.dead.msg.bad.use and not.crc.dead.msg.no.use.

Checking that the two phis are somehow xored, but not checking what the xor is used for might catch some negative cases, but is not useful in general. What if the xor result is only written to stdout or an array? I believe we should check that the xor is in the use-def chain of ConditionalRecurrence.Step's condition.

@artagnon
Copy link
Contributor Author

Checking that the two phis are somehow xored, but not checking what the xor is used for might catch some negative cases, but is not useful in general. What if the xor result is only written to stdout or an array? I believe we should check that the xor is in the use-def chain of ConditionalRecurrence.Step's condition.

Thanks a lot for thinking deeply about this! There was indeed a huge hole in the previous implementation: I've added a pre-commit test, and completely redone the patch. This is a huge improvement!

Copy link
Contributor

@pfusik pfusik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, just some small nits.

@pfusik
Copy link
Contributor

pfusik commented Jul 1, 2025

Now I think the XOR might be e.g. multiplied by a (runtime) zero, or be one of the branches of a SelectInst that is never taken.

@artagnon
Copy link
Contributor Author

artagnon commented Jul 1, 2025

Now I think the XOR might be e.g. multiplied by a (runtime) zero

I've added a test showing that the KnownBits propagation works.

or be one of the branches of a SelectInst that is never taken.

The XOR is checked in the condition of the select, not the branches.

@pfusik
Copy link
Contributor

pfusik commented Jul 1, 2025

Please also change L->P1, R->P2 in the IR comment for clarity.

Now I think the XOR might be e.g. multiplied by a (runtime) zero

I've added a test showing that the KnownBits propagation works.

If KnownBits propagation is sufficient, do we even need to look for a XOR ?

or be one of the branches of a SelectInst that is never taken.

The XOR is checked in the condition of the select, not the branches.

I meant another SelectInst in the use-def chain of the condition.

Let's put it another way: which test fails if we remove the isConditionalOnXorOfPHIs call?

@artagnon
Copy link
Contributor Author

artagnon commented Jul 1, 2025

Please also change L->P1, R->P2 in the IR comment for clarity.

Oops, sorry about that. Fixed now.

The XOR is checked in the condition of the select, not the branches.
I meant another SelectInst in the use-def chain of the condition.

I think the KnownBits propagation would be sufficient, similar to the test I just added.

Let's put it another way: which test fails if we remove the isConditionalOnXorOfPHIs call?

So without it, not.crc.dead.msg.no.use and not.crc.dead.msg.notin.select.chain would pass and report a faulty LHSAux. We could get it not to report the faulty LHSAux instead of failing completely, but it would complicate life for the optimization, which can't just replace the entire loop.

So, in a crc computation, the initial value of the crc is usually passed along with the message to which to compute the crc for. Sometimes, the message is XOR'ed with the initial value of the crc outside the loop, or before calling the function (see init.li and init.arg tests): in this case, the crc init value evolving in the loop includes the message. The purpose of isConditionalOnXorOfPHIs is to make sure that we don't incorrectly assume a loop-variant message when the message is not XOR'ed with crc in the loop. If we report the LHSAux or message, the optimization will be incorrect, and if we report that LHSAux isn't used in the crc computation, it will complicate life for the optimization, because it would still need to preserve the loop-variant computation of the unrelated value along with optimizing the crc computation.

@pfusik
Copy link
Contributor

pfusik commented Jul 1, 2025

Thank you for your last paragraph.

My concern is that we could miscompile this specially crafted code:

  %evil.xor = xor i8 %data, %crc.trunc
  %zero = some value that is known to be zero
  %evil.mul = mul i8, %evil.xor, %zero
  %xor.data.crc = xor i8 %evil.mul, %crc.trunc
  %and.data.crc = and i8 %xor.data.crc, 1
  %data.next = lshr i8 %data, 1
  %check.sb = icmp eq i8 %and.data.crc, 0
  %crc.lshr = lshr i16 %crc, 1
  %xor = xor i16 %crc.lshr, -24575
  %crc.next = select i1 %check.sb, i16 %crc.lshr, i16 %xor

isConditionalOnXorOfPHIs will match %evil.xor, CRC computation appears correct because of %xor.data.crc, the loop doesn't use %data for CRC computation yet we incorrectly report LHSAux.

@artagnon
Copy link
Contributor Author

artagnon commented Jul 1, 2025

My concern is that we could miscompile this specially crafted code:

Thank you for finding yet another bug! Fixed now by matching the first XOR (I think I had equivalent logic before the rewrite using count_if == 1, but missed it in the rewrite).

Note that if the corrupt xor is anything else, the KnownBits propagation will fail.

@artagnon
Copy link
Contributor Author

artagnon commented Jul 1, 2025

My concern is that we could miscompile this specially crafted code:

Thank you for finding yet another bug! Fixed now by matching the first XOR (I think I had equivalent logic before the rewrite using count_if == 1, but missed it in the rewrite).

Note that if the corrupt xor is anything else, the KnownBits propagation will fail.

Wait a second, it could be 'or'. Need to think some more about this.

@artagnon artagnon force-pushed the hr-phisintertwined branch from 07b6276 to 40b1041 Compare July 1, 2025 17:25
@artagnon
Copy link
Contributor Author

artagnon commented Jul 1, 2025

My concern is that we could miscompile this specially crafted code:

Thank you for finding yet another bug! Fixed now by matching the first XOR (I think I had equivalent logic before the rewrite using count_if == 1, but missed it in the rewrite).
Note that if the corrupt xor is anything else, the KnownBits propagation will fail.

Wait a second, it could be 'or'. Need to think some more about this.

I believe that I have the right fix now, with a known-bits computation.

@artagnon
Copy link
Contributor Author

artagnon commented Jul 1, 2025

My concern is that we could miscompile this specially crafted code:

Thank you for finding yet another bug! Fixed now by matching the first XOR (I think I had equivalent logic before the rewrite using count_if == 1, but missed it in the rewrite).
Note that if the corrupt xor is anything else, the KnownBits propagation will fail.

Wait a second, it could be 'or'. Need to think some more about this.

I believe that I have the right fix now, with a known-bits computation.

This is not a fool-proof solution, as we could perform computations eventually leading to zero, but that exceed the compute-known-bits depth. Have to think some more.

@artagnon artagnon force-pushed the hr-phisintertwined branch from 40b1041 to a41485a Compare July 1, 2025 19:01
@artagnon
Copy link
Contributor Author

artagnon commented Jul 1, 2025

Okay, I found a much simpler solution. Hopefully, this is ready now.

Copy link
Contributor

@pfusik pfusik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@artagnon artagnon merged commit cbfd0d6 into llvm:main Jul 2, 2025
7 checks passed
@artagnon artagnon deleted the hr-phisintertwined branch July 2, 2025 14:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:analysis Includes value tracking, cost tables and constant folding llvm:ir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants