Skip to content

[ElimAvailExtern] Add an option to allow to convert global variables in a specified address space to local #144287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

shiltian
Copy link
Contributor

Currently, the EliminateAvailableExternallyPass only converts certain
available externally functions to local. For global variables, it only drops
their initializers.

This PR adds an option to allow the pass to convert global variables in a
specified address space to local. The motivation for this change is to correctly
support lowering of LDS variables (__shared__ variables, in more generic
terminology) when ThinLTO is enabled for AMDGPU.

A __shared__ variable is lowered to a hidden global variable in a particular
address space by the frontend, which is roughly same as a static local
variable. To properly lower it in the backend, the compiler needs to check all
its uses. Enabling ThinLTO currently breaks this when a function containing a
__shared__ variable is imported from another module. Even though the global
variable is imported along with its associated function, and the function is
privatized by the EliminateAvailableExternallyPass, the global variable itself
is not.

It's safe to privatize such global variables, because they're local to their
associated functions. If the function itself is privatized, its associated
global variables should also be privatized accordingly.

Copy link
Contributor Author

@llvmbot
Copy link
Member

llvmbot commented Jun 16, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Shilei Tian (shiltian)

Changes

Currently, the EliminateAvailableExternallyPass only converts certain
available externally functions to local. For global variables, it only drops
their initializers.

This PR adds an option to allow the pass to convert global variables in a
specified address space to local. The motivation for this change is to correctly
support lowering of LDS variables (__shared__ variables, in more generic
terminology) when ThinLTO is enabled for AMDGPU.

A __shared__ variable is lowered to a hidden global variable in a particular
address space by the frontend, which is roughly same as a static local
variable. To properly lower it in the backend, the compiler needs to check all
its uses. Enabling ThinLTO currently breaks this when a function containing a
__shared__ variable is imported from another module. Even though the global
variable is imported along with its associated function, and the function is
privatized by the EliminateAvailableExternallyPass, the global variable itself
is not.

It's safe to privatize such global variables, because they're local to their
associated functions. If the function itself is privatized, its associated
global variables should also be privatized accordingly.


Full diff: https://github.com/llvm/llvm-project/pull/144287.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/ElimAvailExtern.cpp (+30-1)
  • (added) llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll (+21)
diff --git a/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp b/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp
index 718452fc02764..a015ba8ccfd4a 100644
--- a/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp
+++ b/llvm/lib/Transforms/IPO/ElimAvailExtern.cpp
@@ -35,8 +35,14 @@ static cl::opt<bool> ConvertToLocal(
     cl::desc("Convert available_externally into locals, renaming them "
              "to avoid link-time clashes."));
 
+static cl::opt<unsigned> ConvertGlobalVariableInAddrSpace(
+    "avail-extern-gv-in-addrspace-to-local", cl::Hidden,
+    cl::desc(
+        "Convert available_externally global variables into locals if they are "
+        "in specificed addrspace, renaming them to avoid link-time clashes."));
+
 STATISTIC(NumRemovals, "Number of functions removed");
-STATISTIC(NumConversions, "Number of functions converted");
+STATISTIC(NumConversions, "Number of functions and globalbs converted");
 STATISTIC(NumVariables, "Number of global variables removed");
 
 void deleteFunction(Function &F) {
@@ -88,9 +94,32 @@ static void convertToLocalCopy(Module &M, Function &F) {
   ++NumConversions;
 }
 
+static void convertToLocalCopy(Module &M, GlobalValue &GV) {
+  assert(GV.hasAvailableExternallyLinkage());
+  std::string OrigName = GV.getName().str();
+  std::string NewName = OrigName + ".__uniq" + getUniqueModuleId(&M);
+  GV.setName(NewName);
+  GV.setLinkage(GlobalValue::InternalLinkage);
+  ++NumConversions;
+}
+
 static bool eliminateAvailableExternally(Module &M, bool Convert) {
   bool Changed = false;
 
+  // Convert global variables in specified address space before changing it to
+  // external linkage below.
+  if (ConvertGlobalVariableInAddrSpace.getNumOccurrences()) {
+    for (GlobalVariable &GV : M.globals()) {
+      if (!GV.hasAvailableExternallyLinkage() || GV.use_empty())
+        continue;
+
+      if (GV.getAddressSpace() == ConvertGlobalVariableInAddrSpace)
+        convertToLocalCopy(M, GV);
+
+      Changed = true;
+    }
+  }
+
   // Drop initializers of available externally global variables.
   for (GlobalVariable &GV : M.globals()) {
     if (!GV.hasAvailableExternallyLinkage())
diff --git a/llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll b/llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll
new file mode 100644
index 0000000000000..6995b97e79887
--- /dev/null
+++ b/llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll
@@ -0,0 +1,21 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --version 5
+; RUN: opt -S -passes=elim-avail-extern -avail-extern-gv-in-addrspace-to-local=3 %s -o - | FileCheck %s
+
+@shared = internal addrspace(3) global i32 undef, align 4
+@shared.imported = available_externally hidden unnamed_addr addrspace(3) global i32 undef, align 4
+
+;.
+; CHECK: @shared = internal addrspace(3) global i32 undef, align 4
+; CHECK: @shared.imported.__uniq.[[UUID:.*]] = internal unnamed_addr addrspace(3) global i32 undef, align 4
+;.
+define void @foo(i32 %v) {
+; CHECK-LABEL: define void @foo(
+; CHECK-SAME: i32 [[V:%.*]]) {
+; CHECK-NEXT:    store i32 [[V]], ptr addrspace(3) @shared, align 4
+; CHECK-NEXT:    store i32 [[V]], ptr addrspace(3) @shared.imported.__uniq.[[UUID]], align 4
+; CHECK-NEXT:    ret void
+;
+  store i32 %v, ptr addrspace(3) @shared, align 4
+  store i32 %v, ptr addrspace(3) @shared.imported, align 4
+  ret void
+}

@shiltian shiltian force-pushed the users/shiltian/convert-to-local-in-as branch from 3127751 to b633297 Compare June 16, 2025 02:19
Copy link

github-actions bot commented Jun 16, 2025

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll llvm/lib/Transforms/IPO/ElimAvailExtern.cpp llvm/test/Transforms/EliminateAvailableExternally/transform-to-local.ll

The following files introduce new uses of undef:

  • llvm/test/Transforms/EliminateAvailableExternally/convert-global-variables-to-local.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@shiltian shiltian force-pushed the users/shiltian/convert-to-local-in-as branch from b633297 to 1cfc903 Compare June 16, 2025 04:07
…in a specified address space to local

Currently, the `EliminateAvailableExternallyPass` only converts certain
available externally functions to local. For global variables, it only drops
their initializers.

This PR adds an option to allow the pass to convert global variables in a
specified address space to local. The motivation for this change is to correctly
support lowering of LDS variables (`__shared__` variables, in more generic
terminology) when ThinLTO is enabled for AMDGPU.

A `__shared__` variable is lowered to a hidden global variable in a particular
address space by the frontend, which is roughly same as a `static` local
variable. To properly lower it in the backend, the compiler needs to check all
its uses. Enabling ThinLTO currently breaks this when a function containing a
`__shared__` variable is imported from another module. Even though the global
variable is imported along with its associated function, and the function is
privatized by the `EliminateAvailableExternallyPass`, the global variable itself
is not.

It's safe to privatize such global variables, because they're _local_ to their
associated functions. If the function itself is privatized, its associated
global variables should also be privatized accordingly.
@shiltian shiltian force-pushed the users/shiltian/convert-to-local-in-as branch from 1cfc903 to 9c643a2 Compare June 16, 2025 14:56
STATISTIC(NumRemovals, "Number of functions removed");
STATISTIC(NumConversions, "Number of functions converted");
STATISTIC(NumConversions, "Number of functions and globals converted");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it help if we split this statistic in NumFunctionsConverted and NumGlobalVariablesConverted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do

static bool eliminateAvailableExternally(Module &M, bool Convert) {
bool Changed = false;

// Convert global variables in specified address space before changing it to
// external linkage below.
if (ConvertGlobalVariableInAddrSpace.getNumOccurrences()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could this rather be done as part of the existing loop - for readability (the policies applied to GlobalVariables are in one place)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, although that means we will have to have the if-statement in each iteration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't see why that would be a problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just thought it might be less efficient.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt it - likely some loop optimization would handle that; if it appears on some compile-time benchmark, we can revisit.

@@ -88,9 +94,32 @@ static void convertToLocalCopy(Module &M, Function &F) {
++NumConversions;
}

static void convertToLocalCopy(Module &M, GlobalValue &GV) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe factor out the calculation and setting of the new name and linkage and reuse it at line 77

Copy link
Contributor

@teresajohnson teresajohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, the EliminateAvailableExternallyPass only converts certain available externally functions to local.

This is not the default behavior. This pass mostly exists to drop definitions, not to convert to local. The conversion was added for contextual profiling. So the reference to this behavior in the summary is a bit odd. Do you also want to convert functions to locals? If so, should this be under the same option(s)?

@shiltian
Copy link
Contributor Author

This is not the default behavior. This pass mostly exists to drop definitions, not to convert to local. The conversion was added for contextual profiling. So the reference to this behavior in the summary is a bit odd.

Thanks for the information. I'll correct the description.

Do you also want to convert functions to locals?

Yes but that is controlled by the existing option avail-extern-to-local.

If so, should this be under the same option(s)?

My understanding is avail-extern-to-local is for function and the new option is for global variable. Do you think it's a better idea to check both options for global variables?

@teresajohnson
Copy link
Contributor

This is not the default behavior. This pass mostly exists to drop definitions, not to convert to local. The conversion was added for contextual profiling. So the reference to this behavior in the summary is a bit odd.

Thanks for the information. I'll correct the description.

Do you also want to convert functions to locals?

Yes but that is controlled by the existing option avail-extern-to-local.

I guess there are 2 cases that we convert functions to locals, one is for contextual profiling (sets the Convert bool) and one is if the avail-extern-to-local flag is given.

If so, should this be under the same option(s)?

My understanding is avail-extern-to-local is for function and the new option is for global variable. Do you think it's a better idea to check both options for global variables?

I think it is fine to keep them separate for now, please just update the description.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants