[CGP][PAC] Flip PHI and blends when all immediate modifiers are the same #150226
Conversation
@llvm/pr-subscribers-llvm-transforms

Author: Anatoly Trosinenko (atrosinenko)

Changes

GVN PRE, SimplifyCFG, and possibly other passes may hoist or sink the call to the `@llvm.ptrauth.blend` intrinsic, introducing multiple duplicate call instructions hidden behind a PHI node. This prevents the instruction selector from generating safer code by absorbing the address and immediate modifiers into separate operands of the AUT, PAC, etc. pseudo instructions.

This patch makes the CodeGenPrepare pass detect when a discriminator is computed as a PHI node with all incoming values being blends with the same immediate modifier. Each such discriminator value is replaced by a single blend, whose address argument is computed by a PHI node.

Full diff: https://github.com/llvm/llvm-project/pull/150226.diff

2 Files Affected:
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index dc8184394f74d..17eba342dc0e3 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -444,6 +444,7 @@ class CodeGenPrepare {
bool optimizeSwitchPhiConstants(SwitchInst *SI);
bool optimizeSwitchInst(SwitchInst *SI);
bool optimizeExtractElementInst(Instruction *Inst);
+ bool optimizePtrauthInst(Instruction *Inst, Value *Disc);
bool dupRetToEnableTailCallOpts(BasicBlock *BB, ModifyDT &ModifiedDT);
bool fixupDbgVariableRecord(DbgVariableRecord &I);
bool fixupDbgVariableRecordsOnInst(Instruction &I);
@@ -2765,6 +2766,12 @@ bool CodeGenPrepare::optimizeCallInst(CallInst *CI, ModifyDT &ModifiedDT) {
return optimizeGatherScatterInst(II, II->getArgOperand(0));
case Intrinsic::masked_scatter:
return optimizeGatherScatterInst(II, II->getArgOperand(1));
+ case Intrinsic::ptrauth_auth:
+ case Intrinsic::ptrauth_sign:
+ return optimizePtrauthInst(II, II->getArgOperand(2));
+ case Intrinsic::ptrauth_resign:
+ return optimizePtrauthInst(II, II->getArgOperand(2)) ||
+ optimizePtrauthInst(II, II->getArgOperand(4));
}
SmallVector<Value *, 2> PtrOps;
@@ -8311,6 +8318,74 @@ bool CodeGenPrepare::optimizeExtractElementInst(Instruction *Inst) {
return false;
}
+// Given the instruction Inst, rewrite its discriminator operand Disc if it is
+// a PHI node with all incoming values being @llvm.ptrauth.blend(addr, imm)
+// with the same immediate modifier.
+bool CodeGenPrepare::optimizePtrauthInst(Instruction *Inst, Value *Disc) {
+ // GVN PRE, SimplifyCFG and possibly other passes may hoist or sink the call
+ // to @llvm.ptrauth.blend intrinsic, introducing multiple duplicate call
+ // instructions hidden behind a PHI node.
+ //
+  // To enforce that the specific immediate modifier is blended into the
+  // discriminator even if the other (address) component was reloaded from the
+  // stack, the AArch64 backend defines a number of pseudo instructions
+ // representing auth, sign and other operations. These pseudo instructions
+ // have separate register and immediate operands absorbing the arguments of
+ // blend intrinsic in case of a common `op(ptr, key_id, blend(addr, imm))`
+ // code pattern.
+ //
+ // To help the instruction selector, this function detects the discriminators
+ // represented by a PHI node with all incoming values being `blend`s with
+ // the same integer operand - each such discriminator is rewritten as a single
+ // blend, whose address operand is a PHI node.
+
+ PHINode *P = dyn_cast<PHINode>(Disc);
+ if (!P)
+ return false;
+
+ // Checks if V is blend(something, imm), returns imm if it is.
+ auto GetImmModifier = [](Value *V) -> std::optional<uint64_t> {
+ auto *II = dyn_cast<IntrinsicInst>(V);
+ if (!II || II->getIntrinsicID() != Intrinsic::ptrauth_blend)
+ return std::nullopt;
+ auto *ImmModifier = dyn_cast<ConstantInt>(II->getArgOperand(1));
+ if (!ImmModifier)
+ return std::nullopt;
+ return ImmModifier->getZExtValue();
+ };
+
+ // Check that all incoming values of P are blends with the same immediate
+ // modifier.
+ std::optional<uint64_t> ImmModifier = GetImmModifier(*P->op_begin());
+ if (!ImmModifier ||
+ !llvm::all_of(P->operands(), [&ImmModifier, GetImmModifier](Value *V) {
+ return ImmModifier == GetImmModifier(V);
+ }))
+ return false;
+
+  // At this point, P is a PHI node with all the incoming values being blend
+  // operations with the same immediate modifier, but possibly different address
+  // modifiers. Replace the Disc argument of Inst with a single ptrauth_blend
+  // whose address modifier is provided by a PHI node.
+
+ IRBuilder<> Builder(Inst->getContext());
+ Type *Int64Ty = Builder.getInt64Ty();
+
+ Builder.SetInsertPoint(P);
+ PHINode *AddrDiscPhi = Builder.CreatePHI(Int64Ty, P->getNumIncomingValues());
+ for (auto [Val, BB] : llvm::zip_equal(P->operands(), P->blocks())) {
+ Value *AddrDisc = cast<IntrinsicInst>(Val)->getArgOperand(0);
+ AddrDiscPhi->addIncoming(AddrDisc, BB);
+ }
+
+ Builder.SetInsertPoint(Inst);
+ Value *ImmModifValue = ConstantInt::get(Int64Ty, *ImmModifier);
+ Value *Blend = Builder.CreateIntrinsic(Intrinsic::ptrauth_blend,
+ {AddrDiscPhi, ImmModifValue});
+ Inst->replaceUsesOfWith(Disc, Blend);
+ return true;
+}
+
/// For the instruction sequence of store below, F and I values
/// are bundled together as an i64 value before being stored into memory.
/// Sometimes it is more efficient to generate separate stores for F and I,
diff --git a/llvm/test/Transforms/CodeGenPrepare/AArch64/ptrauth.ll b/llvm/test/Transforms/CodeGenPrepare/AArch64/ptrauth.ll
new file mode 100644
index 0000000000000..7f06137c891de
--- /dev/null
+++ b/llvm/test/Transforms/CodeGenPrepare/AArch64/ptrauth.ll
@@ -0,0 +1,142 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: -p --version 5
+; RUN: opt -mtriple=aarch64-linux-pauthtest -S -passes='require<profile-summary>,function(codegenprepare)' < %s | FileCheck %s
+; RUN: opt -mtriple=arm64e-apple-darwin -S -passes='require<profile-summary>,function(codegenprepare)' < %s | FileCheck %s
+
+define i64 @choose_disc(i1 %cond, i64 %addr.1, i64 %addr.2, i64 %ptr) {
+; CHECK-LABEL: define i64 @choose_disc(i1 %cond, i64 %addr.1, i64 %addr.2, i64 %ptr) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br i1 %cond, label %bb.blend.1, label %bb.blend.2
+; CHECK: bb.blend.1:
+; CHECK-NEXT: %disc.1 = call i64 @llvm.ptrauth.blend(i64 %addr.1, i64 1234)
+; CHECK-NEXT: br label %bb.user
+; CHECK: bb.blend.2:
+; CHECK-NEXT: %disc.2 = call i64 @llvm.ptrauth.blend(i64 %addr.2, i64 1234)
+; CHECK-NEXT: br label %bb.user
+; CHECK: bb.user:
+; CHECK-NEXT: %0 = phi i64 [ %addr.1, %bb.blend.1 ], [ %addr.2, %bb.blend.2 ]
+; CHECK-NEXT: %disc = phi i64 [ %disc.1, %bb.blend.1 ], [ %disc.2, %bb.blend.2 ]
+; CHECK-NEXT: %1 = call i64 @llvm.ptrauth.blend(i64 %0, i64 1234)
+; CHECK-NEXT: %result = call i64 @llvm.ptrauth.sign(i64 %ptr, i32 0, i64 %1)
+; CHECK-NEXT: ret i64 %result
+;
+entry:
+ br i1 %cond, label %bb.blend.1, label %bb.blend.2
+bb.blend.1:
+ %disc.1 = call i64 @llvm.ptrauth.blend(i64 %addr.1, i64 1234)
+ br label %bb.user
+bb.blend.2:
+ %disc.2 = call i64 @llvm.ptrauth.blend(i64 %addr.2, i64 1234)
+ br label %bb.user
+bb.user:
+ %disc = phi i64 [ %disc.1, %bb.blend.1 ], [ %disc.2, %bb.blend.2 ]
+ %result = call i64 @llvm.ptrauth.sign(i64 %ptr, i32 0, i64 %disc)
+ ret i64 %result
+}
+
+define i64 @choose_disc_different_imm(i1 %cond, i64 %addr.1, i64 %addr.2, i64 %ptr) {
+; CHECK-LABEL: define i64 @choose_disc_different_imm(i1 %cond, i64 %addr.1, i64 %addr.2, i64 %ptr) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br i1 %cond, label %bb.blend.1, label %bb.blend.2
+; CHECK: bb.blend.1:
+; CHECK-NEXT: %disc.1 = call i64 @llvm.ptrauth.blend(i64 %addr.1, i64 1234)
+; CHECK-NEXT: br label %bb.user
+; CHECK: bb.blend.2:
+; CHECK-NEXT: %disc.2 = call i64 @llvm.ptrauth.blend(i64 %addr.2, i64 42)
+; CHECK-NEXT: br label %bb.user
+; CHECK: bb.user:
+; CHECK-NEXT: %disc = phi i64 [ %disc.1, %bb.blend.1 ], [ %disc.2, %bb.blend.2 ]
+; CHECK-NEXT: %result = call i64 @llvm.ptrauth.sign(i64 %ptr, i32 0, i64 %disc)
+; CHECK-NEXT: ret i64 %result
+;
+entry:
+ br i1 %cond, label %bb.blend.1, label %bb.blend.2
+bb.blend.1:
+ %disc.1 = call i64 @llvm.ptrauth.blend(i64 %addr.1, i64 1234)
+ br label %bb.user
+bb.blend.2:
+ %disc.2 = call i64 @llvm.ptrauth.blend(i64 %addr.2, i64 42)
+ br label %bb.user
+bb.user:
+ %disc = phi i64 [ %disc.1, %bb.blend.1 ], [ %disc.2, %bb.blend.2 ]
+ %result = call i64 @llvm.ptrauth.sign(i64 %ptr, i32 0, i64 %disc)
+ ret i64 %result
+}
+
+define i64 @choose_disc_multi_edge(i32 %arg, i64 %addr.1, i64 %addr.2, i64 %ptr) {
+; CHECK-LABEL: define i64 @choose_disc_multi_edge(i32 %arg, i64 %addr.1, i64 %addr.2, i64 %ptr) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: %cond = icmp eq i32 %arg, 0
+; CHECK-NEXT: br i1 %cond, label %bb.blend.1, label %bb.blend.2
+; CHECK: bb.blend.1:
+; CHECK-NEXT: %disc.1 = call i64 @llvm.ptrauth.blend(i64 %addr.1, i64 1234)
+; CHECK-NEXT: switch i32 %arg, label %bb.blend.2 [
+; CHECK-NEXT: i32 0, label %bb.user
+; CHECK-NEXT: i32 3, label %bb.user
+; CHECK-NEXT: ]
+; CHECK: bb.blend.2:
+; CHECK-NEXT: %disc.2 = call i64 @llvm.ptrauth.blend(i64 %addr.2, i64 1234)
+; CHECK-NEXT: br label %bb.user
+; CHECK: bb.user:
+; CHECK-NEXT: %0 = phi i64 [ %addr.1, %bb.blend.1 ], [ %addr.1, %bb.blend.1 ], [ %addr.2, %bb.blend.2 ]
+; CHECK-NEXT: %disc = phi i64 [ %disc.1, %bb.blend.1 ], [ %disc.1, %bb.blend.1 ], [ %disc.2, %bb.blend.2 ]
+; CHECK-NEXT: %1 = call i64 @llvm.ptrauth.blend(i64 %0, i64 1234)
+; CHECK-NEXT: %result = call i64 @llvm.ptrauth.sign(i64 %ptr, i32 0, i64 %1)
+; CHECK-NEXT: ret i64 %result
+;
+entry:
+ %cond = icmp eq i32 %arg, 0
+ br i1 %cond, label %bb.blend.1, label %bb.blend.2
+bb.blend.1:
+ %disc.1 = call i64 @llvm.ptrauth.blend(i64 %addr.1, i64 1234)
+ switch i32 %arg, label %bb.blend.2 [
+ i32 0, label %bb.user
+ i32 2, label %bb.blend.2
+ i32 3, label %bb.user
+ ]
+bb.blend.2:
+ %disc.2 = call i64 @llvm.ptrauth.blend(i64 %addr.2, i64 1234)
+ br label %bb.user
+bb.user:
+ %disc = phi i64 [ %disc.1, %bb.blend.1 ], [ %disc.1, %bb.blend.1 ], [ %disc.2, %bb.blend.2 ]
+ %result = call i64 @llvm.ptrauth.sign(i64 %ptr, i32 0, i64 %disc)
+ ret i64 %result
+}
+
+define i64 @choose_disc_phi_placement(i1 %cond, i64 %addr.1, i64 %addr.2, i64 %ptr) {
+; CHECK-LABEL: define i64 @choose_disc_phi_placement(i1 %cond, i64 %addr.1, i64 %addr.2, i64 %ptr) {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: br i1 %cond, label %bb.blend.1, label %bb.blend.2
+; CHECK: bb.blend.1:
+; CHECK-NEXT: %disc.1 = call i64 @llvm.ptrauth.blend(i64 %addr.1, i64 1234)
+; CHECK-NEXT: br label %bb.phi
+; CHECK: bb.blend.2:
+; CHECK-NEXT: %disc.2 = call i64 @llvm.ptrauth.blend(i64 %addr.2, i64 1234)
+; CHECK-NEXT: br label %bb.phi
+; CHECK: bb.phi:
+; CHECK-NEXT: %0 = phi i64 [ %addr.1, %bb.blend.1 ], [ %addr.2, %bb.blend.2 ]
+; CHECK-NEXT: %disc = phi i64 [ %disc.1, %bb.blend.1 ], [ %disc.2, %bb.blend.2 ]
+; CHECK-NEXT: br i1 %cond, label %bb.user, label %bb.ret0
+; CHECK: bb.user:
+; CHECK-NEXT: %1 = call i64 @llvm.ptrauth.blend(i64 %0, i64 1234)
+; CHECK-NEXT: %result = call i64 @llvm.ptrauth.sign(i64 %ptr, i32 0, i64 %1)
+; CHECK-NEXT: ret i64 %result
+; CHECK: bb.ret0:
+; CHECK-NEXT: ret i64 0
+;
+entry:
+ br i1 %cond, label %bb.blend.1, label %bb.blend.2
+bb.blend.1:
+ %disc.1 = call i64 @llvm.ptrauth.blend(i64 %addr.1, i64 1234)
+ br label %bb.phi
+bb.blend.2:
+ %disc.2 = call i64 @llvm.ptrauth.blend(i64 %addr.2, i64 1234)
+ br label %bb.phi
+bb.phi:
+ %disc = phi i64 [ %disc.1, %bb.blend.1 ], [ %disc.2, %bb.blend.2 ]
+ br i1 %cond, label %bb.user, label %bb.ret0
+bb.user:
+ %result = call i64 @llvm.ptrauth.sign(i64 %ptr, i32 0, i64 %disc)
+ ret i64 %result
+bb.ret0:
+ ret i64 0
+}
Out of curiosity, what's an example of code this impacts?
r=me with additional tests for variable discriminators
if (!II || II->getIntrinsicID() != Intrinsic::ptrauth_blend)
  return std::nullopt;
auto *ImmModifier = dyn_cast<ConstantInt>(II->getArgOperand(1));
if (!ImmModifier)
I was about to ask "when is there no second argument" because apparently reading dyn_cast
was too hard for me :D
but is this limitation strictly required? Is the concern spilling a variable extra discriminator, along the lines of the following?
(pseudo ir)
...
%a = call i64 @llvm.ptrauth.blend(i64 %addr.1, i64 %v1)
...
%b = call i64 @llvm.ptrauth.blend(i64 %addr.2, i64 %v1)
...
%r = phi(%a,%b)
getting turned into
%v1 = ...
; spill %v1
...
; load %v1
%x = phi(%a, %b)
%r = call i64 @llvm.ptrauth.blend(i64 %x, i64 %v1)
// Given the instruction Inst, rewrite its discriminator operand Disc if it is
// a PHI node with all incoming values being @llvm.ptrauth.blend(addr, imm)
// with the same immediate modifier.
bool CodeGenPrepare::optimizePtrauthInst(Instruction *Inst, Value *Disc) {
This matching is fine for now, but the llvm intrinsic does not require the parameter to be a pointer. I think the best path forward would be to simply make the intrinsic match the expected interface? In principle no one should be emitting @llvm.ptrauth.blend(i64, (i64)pointer)
(reversing the normal blend parameters), but maybe we should simply make it not an option to construct the intrinsic in that order?
That will require clang fixes to remove casts in the codegen, but also would presumably simplify things as it's removing those casts?
Maybe @ahmedbougacha or @rjmccall know why it is set up like this?
@@ -0,0 +1,142 @@
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: -p --version 5 |
Tests look fine but I'd like a counter example with non-constant discriminator to track that behavior.