Q: When calling a Python script from D, how do I properly set up the environment (for loading dynamic .so libraries)? #156

mw66 · 2021-05-24T23:23:50Z

Hi,

I encountered a strange error: my Python program behaves differently when run stand-alone versus when called from a D program via pyd. I noticed this could be caused by the dynamic libraries (.so) being loaded differently:

Python stand-alone, run log:
"""
2021-05-24 18:48:06.939476: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-05-24 18:48:06.958735: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3492480000 Hz
2021-05-24 18:48:07.603718: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-05-24 18:48:10.760915: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-05-24 18:48:10.763691: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
"""
Please note: libcublas.so.11 is loaded first, and the run succeeds.

When the same Python script is called via pyd, the run log is:
"""
2021-05-24 19:03:30.943183: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-05-24 19:03:31.098807: I tensorflow/core/platform/profile_utils/cpu_utils.cc:112] CPU Frequency: 3492480000 Hz
[New Thread 0x7ffe397fa700 (LWP 23546)]
[Thread 0x7ffe397fa700 (LWP 23546) exited]
[New Thread 0x7ffe397fa700 (LWP 23547)]
[New Thread 0x7ffe38ff9700 (LWP 23548)]
[New Thread 0x7ffe39ffb700 (LWP 23549)]
[New Thread 0x7ffcf3fff700 (LWP 23550)]
2021-05-24 19:03:35.463795: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-05-24 19:03:38.561243: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-05-24 19:03:57.926635: E tensorflow/stream_executor/cuda/cuda_blas.cc:197] failed to set new cublas math mode: CUBLAS_STATUS_INVALID_VALUE
Traceback (most recent call last):
"""
Please note: libcublas.so.11 is NOT loaded, so the first library loaded becomes libcublasLt.so.11, and then the run fails.

I tried very hard to make sure that, at the shell level, I set the same env vars in both scenarios.
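One way to verify this from inside the interpreter, rather than from the shell, is to dump the loader-related variables at the top of the Python script in both runs and compare the results. The following is only a minimal sketch: the variable list in `LOADER_VARS` and the helper names `snapshot_env` and `diff_snapshots` are my own assumptions, not anything provided by pyd or TensorFlow.

```python
import json
import os

# Variables that commonly influence shared-library lookup on Linux.
# This list is an assumption; extend it for your own setup.
LOADER_VARS = (
    "LD_LIBRARY_PATH",
    "LD_PRELOAD",
    "PATH",
    "PYTHONPATH",
    "CUDA_HOME",
    "CUDA_VISIBLE_DEVICES",
)


def snapshot_env(keys=LOADER_VARS, environ=None):
    """Return {name: value or None} for the given variables."""
    env = os.environ if environ is None else environ
    return {k: env.get(k) for k in keys}


def diff_snapshots(a, b):
    """Return {name: (a_value, b_value)} for every variable that differs."""
    names = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in names if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Print as JSON so the output of the two runs can be saved and diffed.
    print(json.dumps(snapshot_env(), indent=2, sort_keys=True))
```

Saving the JSON output from the stand-alone run and from the pyd run, then passing both dictionaries to `diff_snapshots`, would show whether the interpreter really sees the same values in both cases.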

But why does the Python program called via pyd from D skip loading a dynamic library (i.e. libcublas.so.11 in this case)?
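To narrow this down, it may help to check two things from inside the Python script in each run: which cublas libraries are already mapped into the process, and whether the dynamic loader can find libcublas.so.11 at all. The sketch below is Linux-only (it reads `/proc/self/maps`), and the helper names `loaded_libraries` and `can_dlopen` are hypothetical; only `ctypes.CDLL` is a standard-library call.

```python
import ctypes
import os


def loaded_libraries(substring, maps_text=None):
    """Return sorted paths of mapped files whose basename contains `substring`.

    Reads /proc/self/maps (Linux only) unless `maps_text` is supplied.
    """
    if maps_text is None:
        with open("/proc/self/maps") as f:
            maps_text = f.read()
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if substring in os.path.basename(path):
                paths.add(path)
    return sorted(paths)


def can_dlopen(name):
    """Try to load `name`; return (True, None) or (False, error message)."""
    try:
        ctypes.CDLL(name)
        return True, None
    except OSError as exc:
        return False, str(exc)


if __name__ == "__main__":
    print("mapped cublas libraries:", loaded_libraries("cublas"))
    print("dlopen libcublas.so.11:", can_dlopen("libcublas.so.11"))
```

If `can_dlopen` fails only in the pyd run, the error message should say why the loader could not open the library. If `loaded_libraries` already lists a cublas library before TensorFlow is imported, something else in the process mapped it first.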

Is there something (an env var) I need to set up in the D program when calling pyd?

(Another thing that looks suspicious: there is some thread activity before those libraries are loaded. This only happens in the pyd run; I'm not sure if it's related.)
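To compare the thread activity between the two runs without a debugger, the script could print how many OS-level threads exist just before TensorFlow is imported. This is only a rough sketch, Linux-only, and the helper names are hypothetical. Note that `threading.enumerate()` only sees threads known to the Python interpreter, whereas `/proc/self/task` also counts threads started by native code.

```python
import os
import threading


def native_thread_count(task_dir="/proc/self/task"):
    """Number of OS-level threads in this process (Linux only)."""
    return len(os.listdir(task_dir))


def python_thread_names():
    """Names of the threads the Python interpreter itself knows about."""
    return sorted(t.name for t in threading.enumerate())


if __name__ == "__main__":
    print("native threads:", native_thread_count())
    print("python threads:", python_thread_names())
```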
