Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module clone pass. #202

Merged
merged 1 commit into from
Oct 15, 2024
Merged

Module clone pass. #202

merged 1 commit into from
Oct 15, 2024

Conversation

Pavel-Durov
Copy link

@Pavel-Durov Pavel-Durov commented Sep 14, 2024

YkModuleClone LLVM pass, which creates duplicate versions of functions within a module.

Key changes

Function Cloning

  • Functions that do not have their addresses taken are cloned. This results in two versions of the same function within the module: the original and the cloned version. Cloned functions are renamed by appending a YK_CLONE_PREFIX to their original names. It is assumed that each function has unique name within the module.

Example of original and cloned functions in a module

func func_example(%arg0: i32) -> i32 {
  bb0:
....
func __yk_clone_func_example(%arg0: i32) -> i32 {
  bb0:
...
  • Functions whose addresses are taken are left unchanged and are not cloned, ensuring their original version remains in the module.

Module Linking

After cloning, the pass uses llvm::Linker::linkModules() to link the cloned module back into the original. It uses the OverrideFromSrc flag to prefer function definitions from the cloned module if there are conflicts.

@Pavel-Durov Pavel-Durov mentioned this pull request Sep 14, 2024
@ltratt
Copy link

ltratt commented Sep 14, 2024

We will need some (at least basic) tests before we can think about merging this: without those, it's actually a bit hard to know what's going on!

@ltratt
Copy link

ltratt commented Sep 14, 2024

Do we need to close this? Can't we just add more commits?

@Pavel-Durov
Copy link
Author

Do we need to close this? Can't we just add more commits?

no... I clicked on the the close button by mistake...

@Pavel-Durov
Copy link
Author

We will need some (at least basic) tests before we can think about merging this: without those, it's actually a bit hard to know what's going on!

Ok, will add similar tests to llvm-lit llvm/test/Transforms/Yk/BasicBlockTracer.ll

@Pavel-Durov Pavel-Durov changed the title Module clone Pass. Module clone pass. Sep 15, 2024
@Pavel-Durov
Copy link
Author

Pavel-Durov commented Sep 15, 2024

Added ModuleClone.ll test 👉 78a13d1

llvm-lit ./llvm/test/Transforms/Yk/*
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/150768)   
-- Testing: 3 tests, 3 workers --
PASS: LLVM :: Transforms/Yk/ModuleClone.ll (1 of 3)
PASS: LLVM :: Transforms/Yk/BasicBlockTracer.ll (2 of 3)
PASS: LLVM :: Transforms/Yk/Linkage/YkLinkage.ll (3 of 3)

@Pavel-Durov Pavel-Durov marked this pull request as ready for review September 15, 2024 13:32
@ltratt
Copy link

ltratt commented Sep 15, 2024

Is there any way in LLVM to generate "fresh" names (maybe using name mangling?) rather than __yk_clone? Or making the name an argument rather than hard coded? [Maybe this is hard, I dunno.]

@vext01
Copy link

vext01 commented Sep 16, 2024

I think it will do it automatically. If I recall it will start suffixing things .0, .1 etc.

From the top of my head it already does this kind of thing by itself during optimisation passes that require symbols to be copied, so I'd stick with a "well known yk-specific prefix" to avoid accidentally calling things we don't mean to.

}

bool runOnModule(Module &M) override {
if (M.getName() == "src/perf/collect.c") {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the story with this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it was an issue, but I can't reproduce it anymore
removed 👉 d223b64

updateClonedGlobals(*Cloned);

llvm::Linker TheLinker(M);
if (TheLinker.linkInModule(std::move(Cloned))) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah hah! So you found a bit to do the merging for you. What exactly does this call do?

Copy link
Author

@Pavel-Durov Pavel-Durov Sep 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call attempts to link/merge the both modules std::move(Cloned) moves ownership of the cloned module to the linker, meaning that after this call the cloned module is no longer valid in the original context.
If the modules are successfully linked, the function returns false (curious choise). If there's an error (such as conflicting function or global names that cannot be resolved), it returns true.

https://llvm.org/doxygen/classllvm_1_1Linker.html#a5a864cc4d71e0234e9f8d786f8716804

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to be using the static linkModules function, they do the same thing but linkModules feels more intuitive to me :)
👉 4a77e95

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the linkInModule function to use Linker::Flags::OverrideFromSrc it helps with resolving conflicts, for example when we don't want to clone the functions since its address has been taken.
Additionally, it removed the need to handle globals as they will be overridden based on source definition.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which commit?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one 👉 54bc3b9

Tests updated here 👉 6930afa

@vext01
Copy link

vext01 commented Sep 16, 2024

I'd stick with a "well known yk-specific prefix"

I imagine this will also aid debugging: knowing if we cloned it, or something else did.

@Pavel-Durov
Copy link
Author

I think it will do it automatically. If I recall it will start suffixing things .0, .1 etc.

From the top of my head it already does this kind of thing by itself during optimisation passes that require symbols to be copied, so I'd stick with a "well known yk-specific prefix" to avoid accidentally calling things we don't mean to.

Yes, LLVM automatically appends suffixes to functions with the same name, such as in cases of overloading. This ensures that function names within an LLVM module are unique. Therefore, cloning and renaming functions with a "static" yk suffix should be safe as well.

@Pavel-Durov
Copy link
Author

@vext01 as discussed, I removed the stackmaps from the cloned module (this commit)@vext01 as discussed, I removed the stackmaps from the cloned module (this commit 👉 5c5740b).

Example of the original basicblock:

20:                                               ; preds = %18
  %21 = load i32, ptr %3, align 4, !dbg !60
  call void @llvm.dbg.value(metadata i32 %21, metadata !48, metadata !DIExpression()), !dbg !27
  %22 = getelementptr i8, ptr %3, i32 16
  store ptr %22, ptr @shadowstack_0, align 8, !dbg !61
  %23 = call i32 @decrement(i32 noundef %21), !dbg !61
  call void (i64, i32, ...) @llvm.experimental.stackmap(i64 2, i32 0, ptr %3, ptr %4, ptr %5, i32 %21), !dbg !27
  br label %24, !dbg !27

Example of the cloned basicblock:

20:                                               ; preds = %18
  %21 = load i32, ptr %3, align 4, !dbg !100
  call void @llvm.dbg.value(metadata i32 %21, metadata !90, metadata !DIExpression()), !dbg !76
  %22 = getelementptr i8, ptr %3, i32 16
  store ptr %22, ptr @shadowstack_0, align 8, !dbg !101
  %23 = call i32 @__yk_clone_decrement(i32 noundef %21), !dbg !101
  br label %24, !dbg !76

As expected I get assertion error in YkIRWriter.cpp:

llvm/lib/YkIR/YkIRWriter.cpp:585: void {anonymous}::YkIRWriter::serialiseStackmapCall(llvm::CallInst*, {anonymous}::FuncLowerCtxt&): Assertion `I' failed.

@vext01
Copy link

vext01 commented Sep 27, 2024

I think this is because the serialiser expects to find stackmaps for branches and calls, which for the cloned module, there are none.

The quick fix is to not serialise IR for the cloned modules.

// `YkModuleClone` needs to run before `YkBasicBlockTracerPass` and
// `YkInsertStackMaps` because they both skip cloned functions.
if (YkModuleClone) {
addPass(createYkModuleClonePass());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a bad spot for the cloning pass. But I think we might want to run it even earlier before the block-splitting and no calls in entry blocks pass (and maybe also before the blockdisambiguate pass). Though that means we need to manually add the control point and a single stackmap for that control point into the cloned module. We need those in order to jump between the two modules.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it.
Made changes here 👉 c1225df.
I think it actually works by reusing the same YkStackmaps pass with different configurations for stackmapIDStart - 1 by default, 2 in case of YkModuleClone is active.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent. Does this still insert stackmaps into the cloned module? The cloned module only needs a single stackmap at the control point. Though, while important, it's kind of an optimisation, so you may choose to do this in another PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, the the patchpoint emits stackmaps.
For the cloned module there's a patchpoint in the original main function

18:                                               ; preds = %15
  call void @__yk_trace_basicblock(i32 2, i32 8), !dbg !57
  %19 = load ptr, ptr %4, align 8, !dbg !57
  call void @llvm.dbg.value(metadata ptr %19, metadata !32, metadata !DIExpression()), !dbg !27
  call void (i64, i32, ptr, i32, ...) @llvm.experimental.patchpoint.void(i64 0, i32 13, ptr @__ykrt_control_point, i32 3, ptr %19, ptr %3, i64 0, ptr %3, ptr %4, ptr %5, ptr %19), !dbg !59
  br label %20, !dbg !60

and cloned main function:

18:                                               ; preds = %15
  %19 = load ptr, ptr %4, align 8, !dbg !97
  call void @llvm.dbg.value(metadata ptr %19, metadata !80, metadata !DIExpression()), !dbg !76
  call void (i64, i32, ptr, i32, ...) @llvm.experimental.patchpoint.void(i64 1, i32 13, ptr @__ykrt_control_point, i32 3, ptr %19, ptr %3, i64 1, ptr %3, ptr %4, ptr %5, ptr %19), !dbg !99
  br label %20, !dbg !100

@Pavel-Durov
Copy link
Author

I think this is because the serialiser expects to find stackmaps for branches and calls, which for the cloned module, there are none.

The quick fix is to not serialise IR for the cloned modules.

Done here 👉 d73b87c, (and later patched with e5c2e43, 2e837df)

CallInst *findControlPointCall(Module &M) {
/// Find the calls to the dummy control point that we want to patch.
/// Returns either a pointer the a vector of instructions
std::vector<CallInst *> findControlPointCalls(Module &M) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now we have two control points in both original and the cloned main functions.

main (stack map id: 0):

21:                                               ; preds = %18
  call void @__yk_trace_basicblock(i32 0, i32 9), !dbg !70
  %22 = load ptr, ptr %4, align 8, !dbg !70
  call void @llvm.dbg.value(metadata ptr %22, metadata !41, metadata !DIExpression()), !dbg !36
  call void (i64, i32, ptr, i32, ...) @llvm.experimental.patchpoint.void(i64 0, i32 13, ptr @__ykrt_control_point, i32 3, ptr %22, ptr %3, i64 0, ptr %3, ptr %4, ptr %5, ptr %6, ptr %22), !dbg !72
  br label %23, !dbg !73

__yk_clone_main (stack map id: 1):

21:                                               ; preds = %18
  %22 = load ptr, ptr %4, align 8, !dbg !113
  call void @llvm.dbg.value(metadata ptr %22, metadata !93, metadata !DIExpression()), !dbg !89
  call void (i64, i32, ptr, i32, ...) @llvm.experimental.patchpoint.void(i64 1, i32 13, ptr @__ykrt_control_point, i32 3, ptr %22, ptr %3, i64 1, ptr %3, ptr %4, ptr %5, ptr %6, ptr %22), !dbg !115
  br label %23, !dbg !116

****

@Pavel-Durov
Copy link
Author

I think this is because the serialiser expects to find stackmaps for branches and calls, which for the cloned module, there are none.

I had to update the header as well 👉 934af15

@@ -1742,7 +1743,10 @@ class YkIRWriter {
OutStreamer.emitSizeT(M.size());
// funcs:
for (llvm::Function &F : M) {
serialiseFunc(F);
// Skip cloned functions

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it enough to just skip the cloned functions? What about the rest of the module, e.g. M.size() above. Is this still serialised somewhere? Do we not need to skip this in the de-serialiser?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that in a later commit you updated how M.size() is calculated. But now I'm confused. Do the cloned functions live in the same module as the original? Or do we have two separate modules?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I got confused here, but I thought that we don't need the cloned functions in the IR?
#202 (comment)

In the current version of this PR both function versions (original and cloned, excluding the functions that have reference taken to them) live in the same module, but we serialize only the non-cloned version in YkIRWriter.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see. I think that answers my question. Though now I'm curious how we are going to optimise one version of the function while not optimising the other. That's something we need to think about soon.

@Pavel-Durov
Copy link
Author

Finally, I figured out that the serialisation/deserialisation issue was caused by a misalignment of function indexes (actually noted in the FIXME comment) 👉 bd74adf

With this fix all tests are passing in the yk using SWT and the Clone Pass 💪

@Pavel-Durov
Copy link
Author

This PR is ready for another review.
Summary of changes:

  • Clone module functions that have no address taken to them
  • Insert control point + stackmap in the clone - when this pass is enabled we will have two control points.
  • Skip cloned functions serialisation in YkIRWriter.


#include "llvm/Pass.h"

#define YK_CLONE_PREFIX "__yk_clone_"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we document somewhere (maybe in a doc comment at the top of the pass cpp file?) what the cloned versions are? Are they intended to be the optimised versions, or the unoptimisaed versions?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, will add.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added doc comment 👉 71f1973

// definitions from the source module (the second argument) when
// conflicts arise. This means that if two functions have the same
// name in both the original and cloned modules, the version from
// the cloned module will overwrite the original.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that the "deleting the initialiser" trick is no longer needed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, we don't need it, linkModules resolves these conflicts for us. #202 (comment)
Updated comment to make it clearer 👉 9d19ee9

@vext01
Copy link

vext01 commented Oct 11, 2024

Looking good! Just a handful of comments from me.

@Pavel-Durov
Copy link
Author

Looking good! Just a handful of comments from me.

Thanks, will address 👍

@vext01
Copy link

vext01 commented Oct 14, 2024

I think this looks good, bar some minor bits you can fix at squash-time.

@ptersilie would you mind glancing over this in case I've missed anything?

// original names. This distinguishes them from the original functions.
//
// - **Clone Process:**
// - 1. The orignal model is cloned, creating two copies of the module.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be s/orignal model/original module/ ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated 👉 86efe30

}

// This pass duplicates functions within the module:
// the original and the cloned version.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line seems redundant?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated 👉 5f2716d

@ptersilie
Copy link

Apart from some typos and odd formatting in the comments (which hopefully yk_format_new_files.sh will sort out), this looks good to me.

@Pavel-Durov
Copy link
Author

Apart from some typos and odd formatting in the comments (which hopefully yk_format_new_files.sh will sort out), this looks good to me.

What is odd with the formatting?
When running yk_format_new_files.sh I get this diff 👉 5388c52

@vext01
Copy link

vext01 commented Oct 14, 2024

(Formatters often don't format comments)

@vext01
Copy link

vext01 commented Oct 14, 2024

I'm going to let @ptersilie take it from here, since his comments are the only ones remaining.

@Pavel-Durov
Copy link
Author

Comment format updated 👉 f43ee77

@ptersilie
Copy link

Let's squash this.

Module cloning allows us to execute multiple versions of the same
program. Cloned functions can be optimized without affecting the
original ones used for tracing. This approach enables us to run
optimised code when tracing is not needed, while still preserving
tracing functionality.
@Pavel-Durov Pavel-Durov force-pushed the clone-llvm-module-pass branch from f43ee77 to 64e9c76 Compare October 15, 2024 17:50
@Pavel-Durov
Copy link
Author

squashed 👉 64e9c76

@vext01 vext01 added this pull request to the merge queue Oct 15, 2024
Merged via the queue into ykjit:main with commit 3ff37cb Oct 15, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants