Commit ea1a0ce

Add address sanitizer build options
1 parent a1a0560 commit ea1a0ce

3 files changed: +64 -0 lines changed

CMakeLists.txt

Lines changed: 18 additions & 0 deletions
@@ -39,6 +39,7 @@ endif()
 ###############################################################################
 option(DEBUG "Enable debug trace" OFF)
 option(PROFILE "Enable statistics and timing support" OFF)
+option(ASAN "Enable address sanitizer" OFF)
 option(USE_RO "Enable RO conduit." ON)
 option(USE_IPC "Enable IPC support (using HIP)" OFF)
 option(USE_THREADS "Enable workgroup threads to share network queues" OFF)
@@ -132,6 +133,11 @@ if (NOT BUILD_TESTS_ONLY)
 set(GPU_TARGETS "${DEFAULT_GPUS}" CACHE STRING
 "Target default GPUs if GPU_TARGETS is not defined.")
 
+if (ASAN)
+message(STATUS "Adding xnack+ to GPU_TARGET to enable ASAN address sanitizer")
+list(TRANSFORM GPU_TARGETS APPEND ":xnack+" REGEX "[^+]$")
+endif()
+
 if (COMMAND rocm_check_target_ids)
 message(STATUS "Checking for ROCm support for GPU targets: " "${GPU_TARGETS}")
 rocm_check_target_ids(SUPPORTED_GPUS TARGETS ${GPU_TARGETS})
@@ -182,6 +188,18 @@ if (NOT BUILD_TESTS_ONLY)
 hip::host
 hsa-runtime64::hsa-runtime64
 )
+
+target_link_options(
+${PROJECT_NAME}
+PUBLIC
+$<$<BOOL:${ASAN}>:-fsanitize=address -g -shared-libsan>
+)
+
+target_compile_options(
+${PROJECT_NAME}
+PUBLIC
+$<$<BOOL:${ASAN}>:-fsanitize=address -g>
+)
 endif()
 
 ###############################################################################
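
Note on the second hunk: the `list(TRANSFORM GPU_TARGETS APPEND ":xnack+" REGEX "[^+]$")` call appends `:xnack+` only to list elements whose last character is not `+`. As a rough illustration of that selection rule outside CMake, the shell sketch below applies the same regular expression with `sed`. The function name `append_xnack` and the sample target names are hypothetical and not part of the commit.

```shell
# Emulate the REGEX selector of list(TRANSFORM ... APPEND) on a
# semicolon-separated list, which is how CMake stores list values.
# append_xnack is a hypothetical helper used only for illustration.
append_xnack() {
  printf '%s\n' "$1" \
    | tr ';' '\n' \
    | sed 's/[^+]$/&:xnack+/' \
    | paste -sd ';' -
}

# Example with made-up target names:
#   append_xnack 'gfx90a;gfx942:xnack+'
```

If this emulation is faithful, one consequence of the pattern is worth keeping in mind: an entry ending in `-`, such as `gfx90a:xnack-`, also matches and would receive a second `:xnack+` suffix, while an entry that already ends in some other `+` feature is left untouched.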

DEBUG.md

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
Hacking and Debugging RocSHMEM
==============================

This documentation is mostly for core RocSHMEM developers. Power users may find it useful.

How to debug parallel programs
------------------------------

When using Open MPI as the launch mechanism, you can set `OMPI_MCA_mpi_abort_delay=-1` to keep parallel processes active after a crash. You can then use `ssh -t $nodename rocgdb -p $pid` to attach `rocgdb` to the failed process. Sometimes errors are caught by UCX instead; in such cases, setting `UCX_HANDLE_ERRORS=freeze` has the same effect.

See `scripts/functional_tests/GDB_README` for an alternative technique that opens multiple `xterm` windows to attach `gdb` to the parallel processes.
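
The workflow above can be sketched as two small shell helpers. This is a minimal sketch, not part of the repository: the function names `run_debuggable` and `attach_command`, the program name, and the node and pid values are all hypothetical. Setting both variables at once is a convenience assumption; the text only says that each one applies in its own situation.

```shell
# Run a launcher command with the "keep crashed ranks alive" settings.
# env scopes the variables to this one command, so nothing leaks into
# the calling shell.
run_debuggable() {
  env OMPI_MCA_mpi_abort_delay=-1 UCX_HANDLE_ERRORS=freeze "$@"
}

# Print the command that attaches rocgdb to a stalled process.
# Arguments: node name, process id (both taken from the crash output).
attach_command() {
  printf 'ssh -t %s rocgdb -p %s\n' "$1" "$2"
}

# Example (hypothetical program, node, and pid):
#   run_debuggable mpirun -np 2 ./my_rocshmem_app
#   attach_command node01 12345   # prints the ssh line to run
```

Because the variables are passed through `env`, later runs from the same shell are unaffected.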

How to use the address sanitizer (ASAN)
---------------------------------------

Refer to the [general documentation for ASAN on AMD GPUs][1].

### Compiling with ASAN

If this is a fresh build directory, simply add `-DASAN=ON` to the `cmake` invocation:
`cmake . <...> -DASAN=ON`

If you are enabling ASAN in a previously used build directory, use `ccmake` to alter the CMake cache:
`ccmake .`

In the `ccmake` interface:
1. Find and toggle `ASAN` to ON.
2. Find and delete `COMPILING_TARGETS` (keybind `d`).

Do not forget to delete `COMPILING_TARGETS` again when disabling ASAN (otherwise xnack will remain active, impacting performance).
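
A non-interactive alternative to the `ccmake` steps can be sketched with `cmake -U`, which removes matching entries from the cache, combined with `-D` to set the new value. This is an assumption-laden sketch: the helper name `toggle_asan` and the `CMAKE` override variable are hypothetical, and it takes the cache entry name `COMPILING_TARGETS` from the text above as given.

```shell
# Toggle ASAN in an existing build directory without the ccmake UI.
# Usage: toggle_asan ON|OFF [build_dir]
# CMAKE is a hypothetical override for the cmake executable (dry runs, tests).
toggle_asan() {
  state="$1"
  build_dir="${2:-.}"
  case "$state" in
    ON|OFF) ;;
    *) echo "usage: toggle_asan ON|OFF [build_dir]" >&2; return 2 ;;
  esac
  # -U drops the cached entry so it is recomputed on this configure run;
  # -D records the new ASAN setting in the cache.
  "${CMAKE:-cmake}" -U COMPILING_TARGETS -DASAN="$state" "$build_dir"
}

# Example:
#   toggle_asan ON .     # enable ASAN in the current build directory
#   toggle_asan OFF .    # disable it again, clearing the cached targets
```

Using the same helper for both directions makes it harder to forget the cache deletion when switching ASAN off.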

### Running with ASAN

You may need to add the directory containing `libclang_rt.asan-x86_64.so` to `LD_LIBRARY_PATH` by hand. Depending on the ROCm version, the library may be in an unusual place, e.g., `$ROCM_ROOT/lib/llvm/lib/clang/19/lib/linux/libclang_rt.asan-x86_64.so`; you may need to run `find /opt/rocm -name libclang_rt.asan-x86_64.so` to locate it.
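
The lookup described above can be wrapped in a small helper. This is a minimal sketch under assumptions: the function name `add_asan_runtime_dir` is hypothetical, and falling back to `$ROCM_ROOT` and then `/opt/rocm` simply mirrors the paths mentioned in the text.

```shell
# Find the ASAN runtime under a ROCm install and prepend its directory
# to LD_LIBRARY_PATH. Usage: add_asan_runtime_dir [rocm_root]
add_asan_runtime_dir() {
  rocm_root="${1:-${ROCM_ROOT:-/opt/rocm}}"
  # Take the first match; several clang versions may ship a copy.
  lib="$(find "$rocm_root" -name 'libclang_rt.asan-x86_64.so' 2>/dev/null | head -n 1)"
  if [ -z "$lib" ]; then
    echo "libclang_rt.asan-x86_64.so not found under $rocm_root" >&2
    return 1
  fi
  # LD_LIBRARY_PATH takes directories, so strip the file name.
  LD_LIBRARY_PATH="$(dirname "$lib")${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
  export LD_LIBRARY_PATH
}

# Example:
#   add_asan_runtime_dir /opt/rocm && echo "$LD_LIBRARY_PATH"
```

If more than one copy is installed, the first match may not belong to the compiler that built the program, so checking the chosen directory by hand is still worthwhile.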

ASAN may [crash when using Open MPI][2]. If that happens, set the environment variable `OMPI_MCA_memory=^patcher`. Do not forget to unset this variable when not using ASAN (it will impact performance).

When running the program, the behavior of ASAN can be controlled with the `ASAN_OPTIONS` environment variable.
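
One way to avoid forgetting to unset the variable is to scope it to a single command. The wrapper below is a sketch, not part of the repository: the name `run_with_asan` is hypothetical, it applies the Open MPI workaround unconditionally rather than only after a crash, and the `abort_on_error=1` default for `ASAN_OPTIONS` is just an illustrative choice that is used when the caller has not set the variable.

```shell
# Run one command with the ASAN-related environment from the text.
# env limits the settings to the child process, so there is nothing
# to unset afterwards.
run_with_asan() {
  env OMPI_MCA_memory='^patcher' \
      ASAN_OPTIONS="${ASAN_OPTIONS:-abort_on_error=1}" \
      "$@"
}

# Example (hypothetical program name and rank count):
#   run_with_asan mpirun -np 2 ./my_rocshmem_app
#   ASAN_OPTIONS=detect_leaks=0 run_with_asan mpirun -np 2 ./my_rocshmem_app
```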

### References

[1]: https://rocm.docs.amd.com/projects/llvm-project/en/docs-6.4.0/conceptual/using-gpu-sanitizer.html
[2]: https://github.com/open-mpi/ompi/issues/13069

scripts/lsan-suppressions.txt

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
leak:pmix
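
The commit does not say how this file is meant to be consumed. A plausible use, assuming the standard LeakSanitizer `suppressions=` option, is to pass its path through `LSAN_OPTIONS`. The helper name `lsan_options_for` below is hypothetical.

```shell
# Build an LSAN_OPTIONS value that points at a suppressions file.
# The path is made absolute so it still resolves if the launcher
# changes the working directory.
lsan_options_for() {
  file="$1"
  if [ ! -r "$file" ]; then
    echo "cannot read suppressions file: $file" >&2
    return 1
  fi
  dir="$(cd "$(dirname "$file")" && pwd)"
  printf 'suppressions=%s/%s\n' "$dir" "$(basename "$file")"
}

# Example (hypothetical program name, run from the repository root):
#   LSAN_OPTIONS="$(lsan_options_for scripts/lsan-suppressions.txt)" \
#     mpirun -np 2 ./my_rocshmem_app
```

With the `leak:pmix` entry, leak reports whose stack frames match `pmix` would be suppressed, leaving only leaks from other code in the report.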
