Skip to content

AArch64: Relax x16/x17 constraint on AUT in certain cases. #132857

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: users/pcc/spr/main.aarch64-relax-x16x17-constraint-on-aut-in-certain-cases
Choose a base branch
from

Conversation

pcc
Copy link
Contributor

@pcc pcc commented Mar 25, 2025

On most operating systems, the x16 and x17 registers are not special,
so there is no benefit, and only a code size cost, to constraining AUT to
only using them. Therefore, adjust the backend to only use the AUT pseudo
(renamed AUTx16x17 for clarity) on Darwin platforms. All other platforms
use an unconstrained variant of the pseudo, AUTxMxN, for selection.

Created using spr 1.3.6-beta.1
@llvmbot
Copy link
Member

llvmbot commented Mar 25, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Peter Collingbourne (pcc)

Changes

On most operating systems, the x16 and x17 registers are not special,
so there is no benefit, and only a code size cost, to constraining AUT
to only using them. Therefore, adjust the backend to only use the AUT
pseudo (renamed AUTx16x17 for clarity) on Darwin platforms, or if traps
are requested. All other platforms use the unconstrained variant of the
instruction for selection.


Patch is 35.18 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/132857.diff

10 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+1-10)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (+6-4)
  • (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.td (+18-7)
  • (modified) llvm/lib/Target/AArch64/AArch64Subtarget.cpp (+23)
  • (modified) llvm/lib/Target/AArch64/AArch64Subtarget.h (+8)
  • (modified) llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp (+1-1)
  • (modified) llvm/test/CodeGen/AArch64/ptrauth-call.ll (+37-16)
  • (modified) llvm/test/CodeGen/AArch64/ptrauth-fpac.ll (+47-34)
  • (modified) llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll (+27-21)
  • (modified) llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign.ll (+138-118)
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 79b190388eb75..73d3fb575897e 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -69,15 +69,6 @@
 
 using namespace llvm;
 
-enum PtrauthCheckMode { Default, Unchecked, Poison, Trap };
-static cl::opt<PtrauthCheckMode> PtrauthAuthChecks(
-    "aarch64-ptrauth-auth-checks", cl::Hidden,
-    cl::values(clEnumValN(Unchecked, "none", "don't test for failure"),
-               clEnumValN(Poison, "poison", "poison on failure"),
-               clEnumValN(Trap, "trap", "trap on failure")),
-    cl::desc("Check pointer authentication auth/resign failures"),
-    cl::init(Default));
-
 #define DEBUG_TYPE "asm-printer"
 
 namespace {
@@ -2868,7 +2859,7 @@ void AArch64AsmPrinter::emitInstruction(const MachineInstr *MI) {
     return;
   }
 
-  case AArch64::AUT:
+  case AArch64::AUTx16x17:
   case AArch64::AUTPAC:
     emitPtrauthAuthResign(MI);
     return;
diff --git a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
index 22083460b400a..2eafb8eae2082 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
@@ -361,7 +361,7 @@ class AArch64DAGToDAGISel : public SelectionDAGISel {
 
   bool tryIndexedLoad(SDNode *N);
 
-  void SelectPtrauthAuth(SDNode *N);
+  void SelectPtrauthAuthX16X17(SDNode *N);
   void SelectPtrauthResign(SDNode *N);
 
   bool trySelectStackSlotTagP(SDNode *N);
@@ -1521,7 +1521,7 @@ extractPtrauthBlendDiscriminators(SDValue Disc, SelectionDAG *DAG) {
       AddrDisc);
 }
 
-void AArch64DAGToDAGISel::SelectPtrauthAuth(SDNode *N) {
+void AArch64DAGToDAGISel::SelectPtrauthAuthX16X17(SDNode *N) {
   SDLoc DL(N);
   // IntrinsicID is operand #0
   SDValue Val = N->getOperand(1);
@@ -1539,7 +1539,7 @@ void AArch64DAGToDAGISel::SelectPtrauthAuth(SDNode *N) {
                                          AArch64::X16, Val, SDValue());
   SDValue Ops[] = {AUTKey, AUTConstDisc, AUTAddrDisc, X16Copy.getValue(1)};
 
-  SDNode *AUT = CurDAG->getMachineNode(AArch64::AUT, DL, MVT::i64, Ops);
+  SDNode *AUT = CurDAG->getMachineNode(AArch64::AUTx16x17, DL, MVT::i64, Ops);
   ReplaceNode(N, AUT);
 }
 
@@ -5613,7 +5613,9 @@ void AArch64DAGToDAGISel::Select(SDNode *Node) {
       return;
 
     case Intrinsic::ptrauth_auth:
-      SelectPtrauthAuth(Node);
+      if (!Subtarget->isX16X17Safer(CurDAG->getMachineFunction()))
+        break;
+      SelectPtrauthAuthX16X17(Node);
       return;
 
     case Intrinsic::ptrauth_resign:
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index 6c61e3a613f6f..98a35b3840771 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -66,6 +66,14 @@ def HasLOR           : Predicate<"Subtarget->hasLOR()">,
 def HasPAuth         : Predicate<"Subtarget->hasPAuth()">,
                        AssemblerPredicateWithAll<(all_of FeaturePAuth), "pauth">;
 
+// On most operating systems, the x16 and x17 registers are not special, so
+// there is no benefit, and only a code size cost, to constraining PAC
+// instructions to only using them. This predicate may be used to guard patterns
+// that allow PAC instructions to be used with any register.
+let RecomputePerFunction = 1 in {
+  def X16X17NotSafer   : Predicate<"!Subtarget->isX16X17Safer(*MF)">;
+}
+
 def HasPAuthLR       : Predicate<"Subtarget->hasPAuthLR()">,
                        AssemblerPredicateWithAll<(all_of FeaturePAuthLR), "pauth-lr">;
 
@@ -1820,7 +1828,9 @@ let Predicates = [HasPAuth] in {
   }
 
   defm PAC : SignAuth<0b000, 0b010, "pac", int_ptrauth_sign>;
-  defm AUT : SignAuth<0b001, 0b011, "aut", null_frag>;
+  let Predicates = [HasPAuth, X16X17NotSafer] in {
+    defm AUT : SignAuth<0b001, 0b011, "aut", int_ptrauth_auth>;
+  }
 
   def XPACI : ClearAuth<0, "xpaci">;
   def : Pat<(int_ptrauth_strip GPR64:$Rd, 0), (XPACI GPR64:$Rd)>;
@@ -1912,10 +1922,11 @@ let Predicates = [HasPAuth] in {
   defm LDRAB  : AuthLoad<1, "ldrab", simm10Scaled>;
 
   // AUT pseudo.
-  // This directly manipulates x16/x17, which are the only registers the OS
-  // guarantees are safe to use for sensitive operations.
-  def AUT : Pseudo<(outs), (ins i32imm:$Key, i64imm:$Disc, GPR64noip:$AddrDisc),
-                   []>, Sched<[WriteI, ReadI]> {
+  // This directly manipulates x16/x17, which are the only registers that
+  // certain OSs guarantee are safe to use for sensitive operations.
+  def AUTx16x17 : Pseudo<(outs), (ins i32imm:$Key, i64imm:$Disc,
+                                      GPR64noip:$AddrDisc),
+                      []>, Sched<[WriteI, ReadI]> {
     let isCodeGenOnly = 1;
     let hasSideEffects = 1;
     let mayStore = 0;
@@ -1926,8 +1937,8 @@ let Predicates = [HasPAuth] in {
   }
 
   // AUT and re-PAC a value, using different keys/data.
-  // This directly manipulates x16/x17, which are the only registers the OS
-  // guarantees are safe to use for sensitive operations.
+  // This directly manipulates x16/x17, which are the only registers that
+  // certain OSs guarantee are safe to use for sensitive operations.
   def AUTPAC
       : Pseudo<(outs),
                (ins i32imm:$AUTKey, i64imm:$AUTDisc, GPR64noip:$AUTAddrDisc,
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
index 0c2a4eb7374e0..a2515731a8f11 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.cpp
@@ -102,6 +102,16 @@ static cl::opt<bool>
     UseScalarIncVL("sve-use-scalar-inc-vl", cl::init(false), cl::Hidden,
                    cl::desc("Prefer add+cnt over addvl/inc/dec"));
 
+cl::opt<PtrauthCheckMode> llvm::PtrauthAuthChecks(
+    "aarch64-ptrauth-auth-checks", cl::Hidden,
+    cl::values(clEnumValN(PtrauthCheckMode::Unchecked, "none",
+                          "don't test for failure"),
+               clEnumValN(PtrauthCheckMode::Poison, "poison",
+                          "poison on failure"),
+               clEnumValN(PtrauthCheckMode::Trap, "trap", "trap on failure")),
+    cl::desc("Check pointer authentication auth/resign failures"),
+    cl::init(PtrauthCheckMode::Default));
+
 unsigned AArch64Subtarget::getVectorInsertExtractBaseCost() const {
   if (OverrideVectorInsertExtractBaseCost.getNumOccurrences() > 0)
     return OverrideVectorInsertExtractBaseCost;
@@ -654,6 +664,19 @@ AArch64Subtarget::getPtrAuthBlockAddressDiscriminatorIfEnabled(
       (Twine(ParentFn.getName()) + " blockaddress").str());
 }
 
+bool AArch64Subtarget::isX16X17Safer(const MachineFunction &MF) const {
+  // The Darwin kernel implements special protections for x16 and x17 so we
+  // should prefer to use those registers on that platform.
+  if (isTargetDarwin())
+    return true;
+  // Traps are only implemented for the pseudo instructions, but are only
+  // necessary if FEAT_FPAC is not implemented.
+  if (hasFPAC())
+    return false;
+  return MF.getFunction().hasFnAttribute("ptrauth-auth-traps") ||
+         PtrauthAuthChecks == PtrauthCheckMode::Trap;
+}
+
 bool AArch64Subtarget::enableMachinePipeliner() const {
   return getSchedModel().hasInstrSchedModel();
 }
diff --git a/llvm/lib/Target/AArch64/AArch64Subtarget.h b/llvm/lib/Target/AArch64/AArch64Subtarget.h
index f5ffc72cae537..4e8f5f85146bd 100644
--- a/llvm/lib/Target/AArch64/AArch64Subtarget.h
+++ b/llvm/lib/Target/AArch64/AArch64Subtarget.h
@@ -26,6 +26,7 @@
 #include "llvm/CodeGen/RegisterBankInfo.h"
 #include "llvm/CodeGen/TargetSubtargetInfo.h"
 #include "llvm/IR/DataLayout.h"
+#include "llvm/Support/CommandLine.h"
 
 #define GET_SUBTARGETINFO_HEADER
 #include "AArch64GenSubtargetInfo.inc"
@@ -35,6 +36,9 @@ class GlobalValue;
 class StringRef;
 class Triple;
 
+enum class PtrauthCheckMode { Default, Unchecked, Poison, Trap };
+extern cl::opt<PtrauthCheckMode> PtrauthAuthChecks;
+
 class AArch64Subtarget final : public AArch64GenSubtargetInfo {
 public:
   enum ARMProcFamilyEnum : uint8_t {
@@ -318,6 +322,10 @@ class AArch64Subtarget final : public AArch64GenSubtargetInfo {
     }
   }
 
+  /// Returns whether the operating system makes it safer to store sensitive
+  /// values in x16 and x17 as opposed to other registers.
+  bool isX16X17Safer(const MachineFunction &MF) const;
+
   /// ParseSubtargetFeatures - Parses features string setting specified
   /// subtarget options.  Definition of function is auto generated by tblgen.
   void ParseSubtargetFeatures(StringRef CPU, StringRef TuneCPU, StringRef FS);
diff --git a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
index 67a08e39fe879..6164f6e1e7d12 100644
--- a/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
+++ b/llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
@@ -6735,7 +6735,7 @@ bool AArch64InstructionSelector::selectIntrinsic(MachineInstr &I,
 
     MIB.buildCopy({AArch64::X16}, {ValReg});
     MIB.buildInstr(TargetOpcode::IMPLICIT_DEF, {AArch64::X17}, {});
-    MIB.buildInstr(AArch64::AUT)
+    MIB.buildInstr(AArch64::AUTx16x17)
         .addImm(AUTKey)
         .addImm(AUTConstDiscC)
         .addUse(AUTAddrDisc)
diff --git a/llvm/test/CodeGen/AArch64/ptrauth-call.ll b/llvm/test/CodeGen/AArch64/ptrauth-call.ll
index bf35cf8fecbdb..7eb4cfca40f09 100644
--- a/llvm/test/CodeGen/AArch64/ptrauth-call.ll
+++ b/llvm/test/CodeGen/AArch64/ptrauth-call.ll
@@ -169,13 +169,23 @@ define i32 @test_tailcall_ib_var(ptr %arg0, ptr %arg1) #0 {
 
 define void @test_tailcall_omit_mov_x16_x16(ptr %objptr) #0 {
 ; CHECK-LABEL: test_tailcall_omit_mov_x16_x16:
-; CHECK-NEXT:    ldr     x16, [x0]
-; CHECK-NEXT:    mov     x17, x0
-; CHECK-NEXT:    movk    x17, #6503, lsl #48
-; CHECK-NEXT:    autda   x16, x17
-; CHECK-NEXT:    ldr     x1, [x16]
-; CHECK-NEXT:    movk    x16, #54167, lsl #48
-; CHECK-NEXT:    braa    x1, x16
+; DARWIN-NEXT:    ldr     x16, [x0]
+; DARWIN-NEXT:    mov     x17, x0
+; DARWIN-NEXT:    movk    x17, #6503, lsl #48
+; DARWIN-NEXT:    autda   x16, x17
+; DARWIN-NEXT:    ldr     x1, [x16]
+; DARWIN-NEXT:    movk    x16, #54167, lsl #48
+; DARWIN-NEXT:    braa    x1, x16
+; ELF-NEXT:       ldr     x1, [x0]
+; ELF-NEXT:       mov     x8, x0
+; ELF-NEXT:       movk    x8, #6503, lsl #48
+; ELF-NEXT:       autda   x1, x8
+; ELF-NEXT:       ldr     x2, [x1]
+; FIXME: Get rid of the x16/x17 constraint on non-Darwin so we can eliminate
+; this mov.
+; ELF-NEXT:       mov     x16, x1
+; ELF-NEXT:       movk    x16, #54167, lsl #48
+; ELF-NEXT:       braa    x2, x16
   %vtable.signed = load ptr, ptr %objptr, align 8
   %objptr.int = ptrtoint ptr %objptr to i64
   %vtable.discr = tail call i64 @llvm.ptrauth.blend(i64 %objptr.int, i64 6503)
@@ -191,16 +201,27 @@ define void @test_tailcall_omit_mov_x16_x16(ptr %objptr) #0 {
 define i32 @test_call_omit_extra_moves(ptr %objptr) #0 {
 ; CHECK-LABEL: test_call_omit_extra_moves:
 ; DARWIN-NEXT:   stp     x29, x30, [sp, #-16]!
-; ELF-NEXT:      str     x30, [sp, #-16]!
-; CHECK-NEXT:    ldr     x16, [x0]
-; CHECK-NEXT:    mov     x17, x0
-; CHECK-NEXT:    movk    x17, #6503, lsl #48
-; CHECK-NEXT:    autda   x16, x17
-; CHECK-NEXT:    ldr     x8, [x16]
-; CHECK-NEXT:    movk    x16, #34646, lsl #48
-; CHECK-NEXT:    blraa   x8, x16
-; CHECK-NEXT:    mov     w0, #42
+; DARWIN-NEXT:   ldr     x16, [x0]
+; DARWIN-NEXT:   mov     x17, x0
+; DARWIN-NEXT:   movk    x17, #6503, lsl #48
+; DARWIN-NEXT:   autda   x16, x17
+; DARWIN-NEXT:   ldr     x8, [x16]
+; DARWIN-NEXT:   movk    x16, #34646, lsl #48
+; DARWIN-NEXT:   blraa   x8, x16
+; DARWIN-NEXT:   mov     w0, #42
 ; DARWIN-NEXT:   ldp     x29, x30, [sp], #16
+; ELF-NEXT:      str     x30, [sp, #-16]!
+; ELF-NEXT:      ldr     x8, [x0]
+; ELF-NEXT:      mov     x9, x0
+; ELF-NEXT:      movk    x9, #6503, lsl #48
+; ELF-NEXT:      autda   x8, x9
+; ELF-NEXT:      ldr     x9, [x8]
+; FIXME: Get rid of the x16/x17 constraint on non-Darwin so we can eliminate
+; this mov.
+; ELF-NEXT:      mov     x17, x8
+; ELF-NEXT:      movk    x17, #34646, lsl #48
+; ELF-NEXT:      blraa   x9, x17
+; ELF-NEXT:      mov     w0, #42
 ; ELF-NEXT:      ldr     x30, [sp], #16
 ; CHECK-NEXT:    ret
   %vtable.signed = load ptr, ptr %objptr
diff --git a/llvm/test/CodeGen/AArch64/ptrauth-fpac.ll b/llvm/test/CodeGen/AArch64/ptrauth-fpac.ll
index d5340dcebad57..41d3ffe32c31e 100644
--- a/llvm/test/CodeGen/AArch64/ptrauth-fpac.ll
+++ b/llvm/test/CodeGen/AArch64/ptrauth-fpac.ll
@@ -1,17 +1,18 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc < %s -mtriple arm64e-apple-darwin                          -verify-machineinstrs | FileCheck %s -DL="L"  --check-prefixes=ALL,NOFPAC
-; RUN: llc < %s -mtriple arm64e-apple-darwin             -mattr=+fpac -verify-machineinstrs | FileCheck %s -DL="L"  --check-prefixes=ALL,FPAC
-; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth              -verify-machineinstrs | FileCheck %s -DL=".L" --check-prefixes=ALL,NOFPAC
-; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -mattr=+fpac -verify-machineinstrs | FileCheck %s -DL=".L" --check-prefixes=ALL,FPAC
+; RUN: llc < %s -mtriple arm64e-apple-darwin                          -verify-machineinstrs | FileCheck %s -DL="L"  --check-prefixes=ALL,DARWIN,NOFPAC
+; RUN: llc < %s -mtriple arm64e-apple-darwin             -mattr=+fpac -verify-machineinstrs | FileCheck %s -DL="L"  --check-prefixes=ALL,DARWIN,FPAC,DARWIN-FPAC
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth              -verify-machineinstrs | FileCheck %s -DL=".L" --check-prefixes=ALL,ELF,NOFPAC
+; RUN: llc < %s -mtriple aarch64-linux-gnu -mattr=+pauth -mattr=+fpac -verify-machineinstrs | FileCheck %s -DL=".L" --check-prefixes=ALL,ELF,FPAC,ELF-FPAC
 
 target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
 
 define i64 @test_auth_ia(i64 %arg, i64 %arg1) {
 ; ALL-LABEL: test_auth_ia:
 ; ALL:       %bb.0:
-; ALL-NEXT:    mov x16, x0
-; ALL-NEXT:    autia x16, x1
-; ALL-NEXT:    mov x0, x16
+; DARWIN-NEXT: mov x16, x0
+; DARWIN-NEXT: autia x16, x1
+; DARWIN-NEXT: mov x0, x16
+; ELF-NEXT:    autia x0, x1
 ; ALL-NEXT:    ret
   %tmp = call i64 @llvm.ptrauth.auth(i64 %arg, i32 0, i64 %arg1)
   ret i64 %tmp
@@ -20,9 +21,10 @@ define i64 @test_auth_ia(i64 %arg, i64 %arg1) {
 define i64 @test_auth_ia_zero(i64 %arg) {
 ; ALL-LABEL: test_auth_ia_zero:
 ; ALL:       %bb.0:
-; ALL-NEXT:    mov x16, x0
-; ALL-NEXT:    autiza x16
-; ALL-NEXT:    mov x0, x16
+; DARWIN-NEXT: mov x16, x0
+; DARWIN-NEXT: autiza x16
+; DARWIN-NEXT: mov x0, x16
+; ELF-NEXT:    autiza x0
 ; ALL-NEXT:    ret
   %tmp = call i64 @llvm.ptrauth.auth(i64 %arg, i32 0, i64 0)
   ret i64 %tmp
@@ -31,9 +33,10 @@ define i64 @test_auth_ia_zero(i64 %arg) {
 define i64 @test_auth_ib(i64 %arg, i64 %arg1) {
 ; ALL-LABEL: test_auth_ib:
 ; ALL:       %bb.0:
-; ALL-NEXT:    mov x16, x0
-; ALL-NEXT:    autib x16, x1
-; ALL-NEXT:    mov x0, x16
+; DARWIN-NEXT: mov x16, x0
+; DARWIN-NEXT: autib x16, x1
+; DARWIN-NEXT: mov x0, x16
+; ELF-NEXT:    autib x0, x1
 ; ALL-NEXT:    ret
   %tmp = call i64 @llvm.ptrauth.auth(i64 %arg, i32 1, i64 %arg1)
   ret i64 %tmp
@@ -42,9 +45,10 @@ define i64 @test_auth_ib(i64 %arg, i64 %arg1) {
 define i64 @test_auth_ib_zero(i64 %arg) {
 ; ALL-LABEL: test_auth_ib_zero:
 ; ALL:       %bb.0:
-; ALL-NEXT:    mov x16, x0
-; ALL-NEXT:    autizb x16
-; ALL-NEXT:    mov x0, x16
+; DARWIN-NEXT: mov x16, x0
+; DARWIN-NEXT: autizb x16
+; DARWIN-NEXT: mov x0, x16
+; ELF-NEXT:    autizb x0
 ; ALL-NEXT:    ret
   %tmp = call i64 @llvm.ptrauth.auth(i64 %arg, i32 1, i64 0)
   ret i64 %tmp
@@ -53,9 +57,10 @@ define i64 @test_auth_ib_zero(i64 %arg) {
 define i64 @test_auth_da(i64 %arg, i64 %arg1) {
 ; ALL-LABEL: test_auth_da:
 ; ALL:       %bb.0:
-; ALL-NEXT:    mov x16, x0
-; ALL-NEXT:    autda x16, x1
-; ALL-NEXT:    mov x0, x16
+; DARWIN-NEXT: mov x16, x0
+; DARWIN-NEXT: autda x16, x1
+; DARWIN-NEXT: mov x0, x16
+; ELF-NEXT:    autda x0, x1
 ; ALL-NEXT:    ret
   %tmp = call i64 @llvm.ptrauth.auth(i64 %arg, i32 2, i64 %arg1)
   ret i64 %tmp
@@ -64,9 +69,10 @@ define i64 @test_auth_da(i64 %arg, i64 %arg1) {
 define i64 @test_auth_da_zero(i64 %arg) {
 ; ALL-LABEL: test_auth_da_zero:
 ; ALL:       %bb.0:
-; ALL-NEXT:    mov x16, x0
-; ALL-NEXT:    autdza x16
-; ALL-NEXT:    mov x0, x16
+; DARWIN-NEXT: mov x16, x0
+; DARWIN-NEXT: autdza x16
+; DARWIN-NEXT: mov x0, x16
+; ELF-NEXT:    autdza x0
 ; ALL-NEXT:    ret
   %tmp = call i64 @llvm.ptrauth.auth(i64 %arg, i32 2, i64 0)
   ret i64 %tmp
@@ -75,9 +81,10 @@ define i64 @test_auth_da_zero(i64 %arg) {
 define i64 @test_auth_db(i64 %arg, i64 %arg1) {
 ; ALL-LABEL: test_auth_db:
 ; ALL:       %bb.0:
-; ALL-NEXT:    mov x16, x0
-; ALL-NEXT:    autdb x16, x1
-; ALL-NEXT:    mov x0, x16
+; DARWIN-NEXT: mov x16, x0
+; DARWIN-NEXT: autdb x16, x1
+; DARWIN-NEXT: mov x0, x16
+; ELF-NEXT:    autdb x0, x1
 ; ALL-NEXT:    ret
   %tmp = call i64 @llvm.ptrauth.auth(i64 %arg, i32 3, i64 %arg1)
   ret i64 %tmp
@@ -86,9 +93,10 @@ define i64 @test_auth_db(i64 %arg, i64 %arg1) {
 define i64 @test_auth_db_zero(i64 %arg) {
 ; ALL-LABEL: test_auth_db_zero:
 ; ALL:       %bb.0:
-; ALL-NEXT:    mov x16, x0
-; ALL-NEXT:    autdzb x16
-; ALL-NEXT:    mov x0, x16
+; DARWIN-NEXT: mov x16, x0
+; DARWIN-NEXT: autdzb x16
+; DARWIN-NEXT: mov x0, x16
+; ELF-NEXT:    autdzb x0
 ; ALL-NEXT:    ret
   %tmp = call i64 @llvm.ptrauth.auth(i64 %arg, i32 3, i64 0)
   ret i64 %tmp
@@ -362,12 +370,17 @@ define i64 @test_auth_trap_attribute(i64 %arg, i64 %arg1) "ptrauth-auth-traps" {
 ; NOFPAC-NEXT:    mov x0, x16
 ; NOFPAC-NEXT:    ret
 ;
-; FPAC-LABEL: test_auth_trap_attribute:
-; FPAC:       %bb.0:
-; FPAC-NEXT:    mov x16, x0
-; FPAC-NEXT:    autia x16, x1
-; FPAC-NEXT:    mov x0, x16
-; FPAC-NEXT:    ret
+; DARWIN-FPAC-LABEL: test_auth_trap_attribute:
+; DARWIN-FPAC:       %bb.0:
+; DARWIN-FPAC-NEXT:    mov x16, x0
+; DARWIN-FPAC-NEXT:    autia x16, x1
+; DARWIN-FPAC-NEXT:    mov x0, x16
+; DARWIN-FPAC-NEXT:    ret
+;
+; ELF-FPAC-LABEL: test_auth_trap_attribute:
+; ELF-FPAC:       %bb.0:
+; ELF-FPAC-NEXT:    autia x0, x1
+; ELF-FPAC-NEXT:    ret
   %tmp = call i64 @llvm.ptrauth.auth(i64 %arg, i32 0, i64 %arg1)
   ret i64 %tmp
 }
diff --git a/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll b/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll
index 74d2370c74c54..ab8ce04b4816d 100644
--- a/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll
+++ b/llvm/test/CodeGen/AArch64/ptrauth-intrinsic-auth-resign-with-blend.ll
@@ -33,21 +33,27 @@ target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
 
 define i64 @test_auth_blend(i64 %arg, i64 %arg1) {
 ; UNCHECKED-LABEL: test_auth_blend:
-; UNCHECKED:       %bb.0:
-; UNCHECKED-NEXT:    mov x16, x0
-; UNCHECKED-NEXT:    mov x17, x1
-; UNCHECKED-NEXT:    movk x17, #65535, lsl #48
-; UNCHECKED-NEXT:    autda x16, x17
-; UNCHECKED-NEXT:    mov x0, x16
-; UNCHECKED-NEXT:    ret
+; UNCHECKED:          %bb.0:
+; UNCHECKED-DARWIN-NEXT: mov x16, x0
+; UNCHECKED-DARWIN-NEXT: mov x17, x1
+; UNCHECKED-DARWIN-NEXT: movk x17, #65535, lsl #48
+; UNCHECKED-DARWIN-NEXT: autda x16, x17
+; UNCHECKED-DARWIN-NEXT: mov x0, x16
+; UNCHECKED-ELF-NEXT:    mov x8, x1
+; UNCHECKED-ELF-NEXT:    movk x8, #65535, lsl #48
+; UNCHECKED-ELF-NEXT:    autda x0, x8
+; UNCHECKED-NEXT:        ret
 ;
 ; CHECKED-LABEL: test_auth_blend:
-; CHECKED:       %bb.0:
-; CHECKED-NEXT:    mov x16, x0
-; CHECKED-NEXT:    mov x17, x1
-; CHECKED-NEXT:    movk x17, #65535, lsl #48
-; CHECKED-NEXT:    autda x16, x17
-; CHECKED-NEXT:    mov x0, x16
+; CHECKED:           %bb.0:
+; CHECKED-DARWIN-NEXT: mov x16, x0
+; CHECKED-DARWIN-NEXT: mov x17, x1
+; CHECKED-DARWIN-NEXT: movk x17, #65535, lsl #48
+; CHECKED-DARWIN-NEXT: autda x16, x17
+; CHECKED-DARWIN-NEXT: mov x0, x16
+; CHECKED-ELF-NEXT:    mov x8, x1
+; CHECKED-ELF-NEXT:    movk x8, #65535, lsl #48
+; CHECKED-ELF-NEXT:    autda x0, x8
 ; CHECKED-NEXT:    ret
 ;
 ; TRAP-LABEL: test_auth_blend:
@@ -146,10 +152,10 @@ define i64 @test_resign_blend_and_const(i64 %arg, i64 %arg1) {
 ; CHECKED-NEXT:    mov x17, x16
 ; CHECKED-NEXT:    xpacd x17
 ; CHECKED-NEXT:    cmp x16, x17
-; CHECKED-NEXT:    b.eq [[L]]auth_success_1
+; CHECKED-NEXT:    b.eq [[L]]auth_success_[[N2:[0-9]]]
 ; CHECKED-NEXT:    mov x16, x17
 ; CHECKED-NEXT:    b ...
[truncated]

@pcc pcc requested review from atrosinenko and kovdan01 March 26, 2025 01:31
Created using spr 1.3.6-beta.1
Comment on lines 214 to 226
; ELF-NEXT: ldr x8, [x0]
; ELF-NEXT: mov x9, x0
; ELF-NEXT: movk x9, #6503, lsl #48
; ELF-NEXT: autda x8, x9
; ELF-NEXT: ldr x9, [x8]
; FIXME: Get rid of the x16/x17 constraint on non-Darwin so we can eliminate
; this mov.
; ELF-NEXT: mov x17, x8
; ELF-NEXT: movk x17, #34646, lsl #48
; ELF-NEXT: blraa x9, x17
; ELF-NEXT: mov w0, #42
; ELF-NEXT: ldr x30, [sp], #16
; CHECK-NEXT: ret
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like x0 could be updated in-place, as its value doesn't seem to be used after the blend operation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x0 is used as an argument to the function called by BLRAA, so I don't think it can be updated in-place.

That being said, I did notice that we don't appear to need the PAUTH_BLEND pseudo-instruction because we can just use MOVK instead, and I found that this allowed the register to be updated in-place in a couple of cases in ptrauth-intrinsic-auth-resign-with-blend.ll where it is not live after the blend. I uploaded #134765 with that change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, didn't notice that one of the instructions is a call.

@atrosinenko
Copy link
Contributor

Tagging @asl

pcc added 4 commits April 7, 2025 18:27
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
Created using spr 1.3.6-beta.1
@pcc
Copy link
Contributor Author

pcc commented Apr 24, 2025

@asl Ping

@atrosinenko
Copy link
Contributor

I wonder if this PR could be streamlined - here are my thoughts that are in no way a change request, but merely an idea that could hopefully simplify the patch.

This patch has a couple of tricky parts:

  • it exposes an X16X17NotSafer predicate to TableGen code and uses it to conditionalize the tablegenerated instruction selection code in some clever way
  • the AArch64Subtarget::isX16X17Safer function computes its decision according to both the target (which is by design) and to the extra features requested (which are simply not implemented for non-default register choices, if I understand the reason correctly)

What if keep defm AUT : SignAuth<0b001, 0b011, "aut", null_frag> and teach AUT pseudo to handle non-standard scratch registers instead? More precisely, replace AUT with AUTx16x17 and, say, AUTxNxM:

  • AUTxNxM specifies most of the register operands explicitly:
    • implicit outputs: NZCV
    • explicit outputs: reg:Result (tied to InputPointer), reg:Scratch
    • implicit inputs: none
    • explicit inputs: imm:Key, imm:Disc, reg:AddrDisc, reg:InputPointer
  • AUTx16x17 is merely the renamed AUT pseudo, as it is already done in this PR:
    • implicit outputs: X16, X17, NZCV
    • explicit outputs: none
    • implicit inputs: X16
    • explicit inputs: imm:Key, imm:Disc, reg:AddrDisc
    • register allocator is likely to expect very restricted usage of physical registers (something like only by copy to/from physreg glued to the corresponding instruction with implicit operands), so selecting AUTxNxM with physical registers X16 and X17 would likely not work

This way, the changes to both DAGISel and GlobalISel custom instruction selection hooks would not be restricted to changing a single identifier. Though, these hooks are both rather short and the changes are still rather simple: wrap a few lines of linear code with if (isX16X17Safer) { ... } and emit AUTxNxM in the false branch. It looks like changes to the AsmPrinter would be very straightforward: add two Register operands to emitPtrauthAuthResign to be used instead of hardcoded X16 and X17 - in the caller either use Result in place of X16 and Scratch in place of X17 or just pass literal X16 and X17.

Created using spr 1.3.6-beta.1
@pcc
Copy link
Contributor Author

pcc commented May 14, 2025

Thanks for your suggestion, I tried it. This is the patch: https://github.com/pcc/llvm-project/tree/pfp-autxmxn

I found that it was more code (it adds 102 insertions(+), 66 deletions(-) on top of this patch). And I only implemented the SDAG side so far, not GISel. So I'm inclined not to go with it and instead try to improve the naming of the X16X17Safer stuff in this patch.

Created using spr 1.3.6-beta.1
@pcc
Copy link
Contributor Author

pcc commented May 24, 2025

I thought about it some more and decided to go with your suggested approach anyway. Among other things, #133536 will mean that deactivation symbol support will be needed for the pseudo instruction in order to avoid deactivation symbol relocations being silently dropped, so it's best if everything goes through the same code path.

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions cpp,h -- llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp llvm/lib/Target/AArch64/AArch64Subtarget.cpp llvm/lib/Target/AArch64/AArch64Subtarget.h llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index 80924c485..d8d029928 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -2153,8 +2153,7 @@ void AArch64AsmPrinter::emitPtrauthAuthResign(
 
   // Compute pac discriminator into x17
   assert(isUInt<16>(PACDisc));
-  Register PACDiscReg =
-      emitPtrauthDiscriminator(PACDisc, PACAddrDisc, Scratch);
+  Register PACDiscReg = emitPtrauthDiscriminator(PACDisc, PACAddrDisc, Scratch);
   bool PACZero = PACDiscReg == AArch64::XZR;
   unsigned PACOpc = getPACOpcodeForKey(*PACKey, PACZero);
 

pcc added a commit to pcc/llvm-project that referenced this pull request May 24, 2025
On most operating systems, the x16 and x17 registers are not special,
so there is no benefit, and only a code size cost, to constraining AUT to
only using them. Therefore, adjust the backend to only use the AUT pseudo
(renamed AUTx16x17 for clarity) on Darwin platforms. All other platforms
use an unconstrained variant of the pseudo, AUTxMxN, for selection.

Pull Request: llvm#132857
[]>, Sched<[WriteI, ReadI]> {
let Constraints = "$AuthVal = $Val";
let isCodeGenOnly = 1;
let hasSideEffects = 0;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reflects proposed change in #141330

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

3 participants