Basic example doesn't quite work #1

Open
carstenbauer opened this issue Jul 31, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@carstenbauer
Member Author

See https://github.com/JuliaGPU/ROCTX.jl/tree/main/examples/roctx_test. The Julia variant runs without errors but doesn't seem to work, i.e., no marker/range sections appear in results.json.

Note that a virtually identical C++ version produces the expected output on the same system.

@carstenbauer carstenbauer added the bug Something isn't working label Aug 1, 2023
@carstenbauer carstenbauer changed the title from "Examples" to "Basic example doesn't quite work" Aug 1, 2023
@carstenbauer
Member Author

carstenbauer commented Aug 1, 2023

FWIW, adding ROCTX.roctracer_start() and ROCTX.roctracer_stop() doesn't seem to change anything.

@luraess

luraess commented Aug 1, 2023

@matinraayai - may this be related to some LD_PRELOAD behaviour?

@carstenbauer
Member Author

carstenbauer commented Aug 1, 2023

FWIW, I tried setting LD_PRELOAD similar to what they do for the original C++ example. Same result/problem.

@carstenbauer
Member Author

carstenbauer commented Aug 2, 2023

FWIW, a colleague told me that he saw something similar (no traces in the output file) with C++ and MPI. He suggested I should try omnitrace.

@carstenbauer
Member Author

carstenbauer commented Aug 2, 2023

Update on omnitrace:

omnitrace-instrument works fine for the simple C++ example (logfile: omnitrace_Cpp.txt):

[Screenshot 2023-08-02 at 17 20 52]

Doesn't work for Julia 😢 (logfile: omnitrace_Julia.txt). The critical line in the log is probably:

omnitrace-instrument: /home/omnitrace/external/dyninst/dyninstAPI/src/BPatch.C:752: void BPatch::registerForkedProcess(PCProcess*, PCProcess*): Assertion `parent' failed.

(cc @vchuravy, not because I think you'll work on this much but maybe you can translate "complicated log file" → "this is the problem" statement 😄 )

@matinraayai

@carstenbauer I've never used roctx, but I've written my own AMD tool to capture HSA/HIP APIs, and it worked fine for @luraess's issue.
I suspect something goes wrong when rocprof dlsyms the roctx callback registration function, which ends up hidden from Julia. Unfortunately, roctracer has no good logging mechanism, so I have to build roctx/rocprof with debug information to confirm that this is indeed the case. I will probably get to it by the end of this week.

@matinraayai

matinraayai commented Aug 8, 2023

@carstenbauer just as I suspected, in the Julia version the RocTXLoader's dlsym handle is not loaded and returns nullptr. This means that RocTX's callbacks are not registered in the first place, making your calls to the library (which are called correctly, BTW) ineffective, even though you are storing a callback function for RocTX.
To give more background, rocprof itself is just a bash script that sets a few environment variables before launching your executable. In our case, they are:
HSA_TOOLS_LIB=/opt/rocm/lib/libroctracer64.so /opt/rocm/lib/libroctracer_tool.so
ROCTRACER_DOMAIN=roctx
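
For reference, a minimal sketch of launching a process with the same two variables from Python, without going through the rocprof script. The values are copied from the lines above and assume a default /opt/rocm install; `tracing_env` and `run_traced` are hypothetical helper names, and the real rocprof script may well set more than this.

```python
import os
import subprocess

# Values copied from the comment above; paths assume a default /opt/rocm install.
ROCPROF_ROCTX_ENV = {
    "HSA_TOOLS_LIB": "/opt/rocm/lib/libroctracer64.so /opt/rocm/lib/libroctracer_tool.so",
    "ROCTRACER_DOMAIN": "roctx",
}


def tracing_env(base=None):
    """Return a copy of `base` (default: os.environ) with the tracing variables added."""
    env = dict(os.environ if base is None else base)
    env.update(ROCPROF_ROCTX_ENV)
    return env


def run_traced(cmd):
    """Run `cmd` (a list of arguments) with the tracing variables set."""
    return subprocess.run(cmd, env=tracing_env(), check=False)
```

Something like `run_traced(["julia", "my_roctx_example.jl"])` would then start Julia with the tool libraries requested; the script name is a placeholder.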

For each tool loaded at startup, the roctracer tool library provides the HSA API table, so that the tool can capture the API entries, replace them with whatever it wants, and do additional initialization. During this initialization, the libroctracer64.so library is dlsymed by Roctracer to install callbacks in the different domains (HIP, HSA, RocTX, etc.).
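
To illustrate what a failed lookup looks like: dlsym returns a null pointer when the requested symbol is not visible through the handle it is given. Below is a small Python/ctypes sketch of that check against the symbols visible in the current process (POSIX only). The helper name `symbol_address` is made up, and `printf` stands in for the roctx registration function, whose exact name is not given in this thread.

```python
import ctypes


def symbol_address(name, library=None):
    """Return the address of `name` as an int, or None if the lookup fails.

    library=None searches the symbols already visible in the running
    process (the equivalent of dlopen(NULL)); this works on POSIX only.
    """
    handle = ctypes.CDLL(library)
    try:
        func = getattr(handle, name)  # ctypes performs the dlsym call here
    except AttributeError:
        # dlsym found nothing: this is the "returns nullptr" case
        return None
    return ctypes.cast(func, ctypes.c_void_p).value


# printf is a stand-in for a symbol that is visible in the process
print(symbol_address("printf") is not None)
# a made-up name that no library exports
print(symbol_address("no_such_symbol_for_this_sketch"))
```

A tool that only checks for null and silently skips registration would behave exactly like the Julia run: no error, but no callbacks either.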

This works without any issue for C++, but doesn't work for Julia. The exact place where this happens is here.

I think Julia's load of this library for ccalls is interfering somehow with the library's intended dlsym. @vchuravy @carstenbauer thoughts?
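
One way to probe this is to ask the dynamic loader whether a library is already resident in the process, using dlopen with RTLD_NOLOAD. A rough Python/ctypes sketch, assuming Linux with glibc; `is_resident` is a hypothetical helper, and it only tells you whether the library is loaded, not whether its symbols are globally visible. Whether this is actually related to the failure here is unconfirmed.

```python
import ctypes
import os


def is_resident(library):
    """Return True if `library` is already loaded into this process.

    Uses RTLD_NOLOAD, so the library is never loaded as a side effect.
    Assumes Linux/glibc, where os.RTLD_NOLOAD is available.
    """
    try:
        ctypes.CDLL(library, mode=os.RTLD_NOLOAD | os.RTLD_LAZY)
    except OSError:
        # dlopen returned NULL: the library is not resident
        return False
    return True
```

For example, `is_resident("/opt/rocm/lib/libroctracer64.so")` could be compared between a process started with and without the environment variables listed above.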
