I keep running into Segmentation fault errors.
They most likely occur during calls to the OpenMM Python API.
Here is the error message:
Stacktrace
[13736] signal 11 (1): Segmentation fault
in expression starting at none:0
_PyInterpreterState_GET at /usr/local/src/conda/python-3.12.8/Include/internal/pycore_pystate.h:133 [inlined]
get_state at /usr/local/src/conda/python-3.12.8/Objects/obmalloc.c:866 [inlined]
_PyObject_Free at /usr/local/src/conda/python-3.12.8/Objects/obmalloc.c:1850 [inlined]
PyObject_Free at /usr/local/src/conda/python-3.12.8/Objects/obmalloc.c:830
_buffer_info_free at /data/numerik/people/bzfsikor/conda/envs/conda_jl/lib/python3.12/site-packages/numpy/core/_multiarray_umath.cpython-312-x86_64-linux-gnu.so (unknown line)
array_dealloc at /data/numerik/people/bzfsikor/conda/envs/conda_jl/lib/python3.12/site-packages/numpy/core/_multiarray_umath.cpython-312-x86_64-linux-gnu.so (unknown line)
pydecref_ at /data/numerik/people/bzfsikor/software/julia_depot/packages/PyCall/1gn3u/src/PyCall.jl:118
pydecref at /data/numerik/people/bzfsikor/software/julia_depot/packages/PyCall/1gn3u/src/PyCall.jl:123
jfptr_pydecref_4550 at /data/numerik/people/bzfsikor/software/julia_depot/compiled/v1.11/PyCall/GkzkC_ddiUX.so (unknown line)
run_finalizer at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/gc.c:299
jl_gc_run_finalizers_in_list at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/gc.c:389
run_finalizers at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/gc.c:435
enable_finalizers at ./gcutils.jl:161 [inlined]
unlock at ./lock.jl:178 [inlined]
macro expansion at ./lock.jl:275 [inlined]
#282 at /home/htc/bzfsikor/.julia/juliaup/julia-1.11.3+0.x64.linux.gnu/share/julia/stdlib/v1.11/REPL/src/LineEdit.jl:2851
jfptr_YY.282_9263 at /data/numerik/people/bzfsikor/software/julia_depot/compiled/v1.11/REPL/u0gqU_dovaC.so (unknown line)
jl_apply at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/julia.h:2157 [inlined]
start_task at /cache/build/builder-demeter6-3/julialang/julia-release-1-dot-11/src/task.c:1202
Allocations: 775171170 (Pool: 775164164; Big: 7006); GC: 5349
fish: Job 1, 'env JULIA_HISTORY=./.history.jl…' terminated by signal SIGSEGV (Address boundary error)
I have no idea how to investigate this further.
axsk commented on Jan 24, 2025
And here is another one, which I run into more often. This one puzzles me especially, since it somehow involves CUDA as well.
Stacktrace
axsk commented on Jan 24, 2025
I switched to a single-threaded instance and have not observed the issue since.
However, I don't make any (explicit) use of multi-threading anywhere in my code.
axsk commented on Mar 4, 2025
Whereas the examples above occurred randomly in my training loop, I can now reproduce the problem by tab-completing a PyObject's fields:
Stacktrace
Again, this only happens with multiple threads; when starting Julia with --threads=1 it works fine.
bhawkins commented on Apr 8, 2025
Here's a simple script that reliably triggers a similar segfault for me:
Output on Linux with Julia 1.11.4 and Python 3.12.9
Output on macOS with Julia 1.11.3 and Python 3.13.2
On my Mac this crashes about 90% of the time. If I comment out the pyimport, the probability of crashing goes down somewhat, to maybe 70%. If I comment out both of the first two lines, it doesn't crash. I'm not sure how important the function body is, but it doesn't seem to crash if I have a parallel loop with just sleep(1) in it.
bhawkins commented on Apr 8, 2025
Using the pylock() suggestion here makes the crash go away. My example isn't actually calling any Python, but I guess GC can run anyway.
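The pylock() helper referred to above is not reproduced in this thread. As a rough sketch of what such a helper might look like, assuming it serializes work behind a ReentrantLock and pauses Julia's GC while the lock is held (so that finalizers such as the pydecref in the first stack trace are less likely to run on an arbitrary thread): the name PYLOCK and the whole function body are assumptions made for illustration, not the linked suggestion's actual code and not part of PyCall's API.

```julia
# Hypothetical helper (not part of PyCall): run `f` while holding a global
# lock, with Julia's GC paused so finalizers are not triggered meanwhile.
const PYLOCK = ReentrantLock()

function pylock(f::Function)
    lock(PYLOCK) do
        prev = GC.enable(false)   # GC.enable returns the previous GC state
        try
            return f()
        finally
            GC.enable(prev)       # restore whatever state was active before
        end
    end
end

# Usage: wrap the code that may touch Python objects (or, as in the example
# above, a threaded loop during which GC could otherwise run).
result = pylock() do
    sum(1:10)                     # placeholder for the real work
end
```

Keeping the GC disabled for long stretches lets memory grow, so a wrapper like this is probably best kept around short sections.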