Segfault after #3821 #3856

Open
wujingyue opened this issue Feb 8, 2025 · 9 comments
@wujingyue
Collaborator

wujingyue commented Feb 8, 2025

I merged #3821 too quickly. The CI indeed showed the same error.

To reproduce this,

$ _bn && DEBUG_SERDE=debug pytest tests/python/test_python_frontend.py -s
Caught signal 11 (Segmentation fault: invalid permissions for mapped object at address 0x2a420588)
==== backtrace (tid:1495132) ====
 0  /usr/local/ucx/lib/libucs.so.0(ucs_handle_error+0x2e4) [0x79a4e79ae614]
 1  /usr/local/ucx/lib/libucs.so.0(+0x3680c) [0x79a4e79ae80c]
 2  /usr/local/ucx/lib/libucs.so.0(+0x36a48) [0x79a4e79aea48]
 3  [0x2a420588]
=================================
Fatal Python error: Segmentation fault

Current thread 0x000079a4e9ab5300 (most recent call first):
  File "/opt/pytorch/nvfuser/nvfuser/__init__.py", line 73 in segment
  File "/opt/pytorch/nvfuser/tests/python/utils.py", line 268 in check_cpp_translation
  File "/opt/pytorch/nvfuser/tests/python/utils.py", line 477 in exec_nvfuser
  File "/opt/pytorch/nvfuser/tests/python/utils.py", line 410 in inner_fn
  File "/opt/pytorch/nvfuser/tests/python/test_python_frontend.py", line 2956 in test_issue1273
  File "/usr/local/lib/python3.12/dist-packages/torch/testing/_internal/common_utils.py", line 3099 in wrapper
  File "/usr/lib/python3.12/unittest/case.py", line 589 in _callTestMethod
  File "/usr/lib/python3.12/unittest/case.py", line 634 in run
  File "/usr/local/lib/python3.12/dist-packages/torch/testing/_internal/common_utils.py", line 3206 in _run_custom
  File "/usr/local/lib/python3.12/dist-packages/torch/testing/_internal/common_utils.py", line 3234 in run
  File "/usr/lib/python3.12/unittest/case.py", line 690 in __call__
  File "/usr/local/lib/python3.12/dist-packages/_pytest/unittest.py", line 321 in runtest
  File "/usr/local/lib/python3.12/dist-packages/_pytest/runner.py", line 172 in pytest_runtest_call
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_callers.py", line 103 in _multicall
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_hooks.py", line 513 in __call__
  File "/usr/local/lib/python3.12/dist-packages/_pytest/runner.py", line 240 in <lambda>
  File "/usr/local/lib/python3.12/dist-packages/_pytest/runner.py", line 340 in from_call
  File "/usr/local/lib/python3.12/dist-packages/_pytest/runner.py", line 239 in call_and_report
  File "/usr/local/lib/python3.12/dist-packages/_pytest/runner.py", line 134 in runtestprotocol
  File "/usr/local/lib/python3.12/dist-packages/_pytest/runner.py", line 115 in pytest_runtest_protocol
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_callers.py", line 103 in _multicall
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_hooks.py", line 513 in __call__
  File "/usr/local/lib/python3.12/dist-packages/_pytest/main.py", line 364 in pytest_runtestloop
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_callers.py", line 103 in _multicall
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_hooks.py", line 513 in __call__
  File "/usr/local/lib/python3.12/dist-packages/_pytest/main.py", line 339 in _main
  File "/usr/local/lib/python3.12/dist-packages/_pytest/main.py", line 285 in wrap_session
  File "/usr/local/lib/python3.12/dist-packages/_pytest/main.py", line 332 in pytest_cmdline_main
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_callers.py", line 103 in _multicall
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_manager.py", line 120 in _hookexec
  File "/usr/local/lib/python3.12/dist-packages/pluggy/_hooks.py", line 513 in __call__
  File "/usr/local/lib/python3.12/dist-packages/_pytest/config/__init__.py", line 174 in main
  File "/usr/local/lib/python3.12/dist-packages/_pytest/config/__init__.py", line 197 in console_main
  File "/usr/local/bin/pytest", line 8 in <module>

Extension modules: numpy.core._multiarray_umath, numpy.core._multiarray_tests, numpy.linalg._umath_linalg, numpy.fft._pocketfft_internal, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, torch._C, torch._C._dynamo.autograd_compiler, torch._C._dynamo.eval_frame, torch._C._dynamo.guards, torch._C._dynamo.utils, torch._C._fft, torch._C._linalg, torch._C._nested, torch._C._nn, torch._C._sparse, torch._C._special, jaxlib.cpu_feature_guard, psutil._psutil_linux, psutil._psutil_posix (total: 27)
[1]    1495132 segmentation fault (core dumped)  DEBUG_SERDE=debug pytest tests/python/test_python_frontend.py -s
@wujingyue changed the title from "Segfault after https://github.com/NVIDIA/Fuser/pull/3821" to "Segfault after #3821" on Feb 8, 2025
@cowanmeg
Collaborator

Will look into this!

@wujingyue
Collaborator Author

Thank you!

@cowanmeg
Collaborator

Hmm... this bug is a bit strange since the segfault occurs in ucx. Just confirming these python_frontend tests only test single device behavior, right?

@wujingyue
Collaborator Author

> these python_frontend tests only test single device behavior, right?

That's right!

@cowanmeg
Collaborator

cowanmeg commented Mar 7, 2025

Hmm.. I have tried to reproduce this multiple times, but the failing test runs fine locally.

@wujingyue
Collaborator Author

What a stubborn bug, unfortunately...

Just to think aloud:

https://gitlab-master.nvidia.com/dl/pytorch/fuser-gh-mirror/-/jobs/147631713/viewer#L2777 shows

[027709286b51:3187 :0:3187] Caught signal 11 (Segmentation fault: address not mapped to object at address 0xfffffffffffffff0)

So apparently a nullptr slipped through without being caught immediately, and the code then applied an offset of -0x10 to it (which gives the address 0xfffffffffffffff0) and ran dynamic_cast on the result.
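
The faulting address in the CI message is what a 64-bit machine produces when 0x10 is subtracted from a zero address and the result wraps around. A minimal sketch of that arithmetic, done on unsigned integers rather than on real pointers (arithmetic on an actual nullptr would be undefined behavior); the helper name `offsetBelowNull` is made up for illustration and is not part of nvFuser:

```cpp
#include <cstdint>

// Hypothetical helper (not nvFuser code): the address obtained by moving
// `bytes_below` bytes down from a null (zero) address on a 64-bit target.
// Unsigned arithmetic wraps modulo 2^64, so this is well defined.
constexpr std::uint64_t offsetBelowNull(std::uint64_t bytes_below) {
  return std::uint64_t{0} - bytes_below;
}

// 0 - 0x10 wraps to the address reported in the CI log.
static_assert(offsetBelowNull(0x10) == 0xfffffffffffffff0ULL,
              "null address minus 0x10 should match the faulting address");
```

An address of this shape (a few bytes below zero) is therefore a hint that a null pointer was offset and then dereferenced, rather than that some arbitrary wild pointer was used.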

The callstack points to this cloning. So it might help to add more nullptr checks in that function or the functions it calls.

@wujingyue
Collaborator Author

@cowanmeg
Collaborator

Thanks for the pointers! I'll try out ASan.

@cowanmeg
Collaborator

cowanmeg commented Mar 14, 2025

As expected, there is a memory bug (use of a freed heap pointer) that existed before the PR; the PR just happened to surface it.

To reproduce:

  1. Build nvfuser with address sanitizer
  2. DEBUG_SERDE=debug ASAN_OPTIONS=protect_shadow_gap=0 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libasan.so.8 pytest tests/python/test_python_frontend.py -s -k 1273

With some print statements, it looks like the error happens when we try to clone a TensorView that has already gone out of scope. Since the original data was freed through a unique pointer, there must be a bug where we saved a raw pointer and kept using it after its owner released the object.

IrBuilder::clone T1_l_float[?S4{2}rf, ?S5{1}rf, ?S6{2}rf]
TensorView::TensorView: Created via IRCloner from T1_l_float[?S4{2}rf, ?S5{1}rf, ?S6{2}rf] src: 0x514000c10240, dest: 0x514000cfe040 
  IrBuilder::clone src: T1_l_float[?S4{2}rf, ?S5{1}rf, ?S6{2}rf] to dest: T4294967295_l_float[?S4{2}rf, ?S5{1}rf, ?S6{2}rf]
  src 0x514000c10240, dest 0x514000cfe040
  Registering clone 0x514000cfe040

IrCloner::clone new node 0x514000cfe040 // 
IrCloner::handle 0x514000cfe040 // attempt to dereference pointer to call clone
=================================================================
==86448==ERROR: AddressSanitizer: heap-use-after-free on address 0x514000cfe040 at pc 0x7fefad050740 bp 0x7ffe403971a0 sp 0x7ffe40397190
READ of size 8 at 0x514000cfe040 thread T0
    #0 0x7fefad05073f in nvfuser::IrCloner::handle(nvfuser::Statement const*) /opt/pytorch/Fuser/csrc/ir/cloner.cpp:49
    #1 0x7fefad051a91 in nvfuser::IrCloner::clone(nvfuser::Statement const*) /opt/pytorch/Fuser/csrc/ir/cloner.cpp:29
    #2 0x7fefac966d3f in nvfuser::TensorView* nvfuser::IrCloner::clone<nvfuser::TensorView>(nvfuser::TensorView const*) /opt/pytorch/Fuser/csrc/ir/cloner.h:63
    #3 0x7fefac966d3f in nvfuser::DynamicTransformInitialInfo::clone(nvfuser::IrCloner&) const /opt/pytorch/Fuser/csrc/dynamic_transform.cpp:34
    #4 0x7fefad941825 in operator() /opt/pytorch/Fuser/csrc/runtime/fusion_executor_cache.cpp:696
    #5 0x7fefad941b0a in __invoke_impl<std::any, nvfuser::FusionExecutorCache::initialInfo()::<lambda(nvfuser::IrCloner&, std::any)>&, nvfuser::IrCloner&, std::any> /usr/include/c++/13/bits/invoke.h:61
    #6 0x7fefad941b0a in __invoke_r<std::any, nvfuser::FusionExecutorCache::initialInfo()::<lambda(nvfuser::IrCloner&, std::any)>&, nvfuser::IrCloner&, std::any> /usr/include/c++/13/bits/invoke.h:116
    #7 0x7fefad941b0a in _M_invoke /usr/include/c++/13/bits/std_function.h:291
    #8 0x7fefacbec3de in std::function<std::any (nvfuser::IrCloner&, std::any)>::operator()(nvfuser::IrCloner&, std::any) const /usr/include/c++/13/bits/std_function.h:591
    #9 0x7fefacbec3de in nvfuser::Fusion::copy(nvfuser::Fusion const*, nvfuser::Fusion*) /opt/pytorch/Fuser/csrc/fusion.cpp:95
    #10 0x7fefaccd7d1e in nvfuser::TranslateApplicableWelford::wouldTranslateToPersistent(std::vector<nvfuser::WelfordOp*, std::allocator<nvfuser::WelfordOp*> > const&, nvfuser::SegmentedGroup*) /opt/pytorch/Fuser/csrc/fusion_segmenter.cpp:2891
    #11 0x7fefaccda9d5 in nvfuser::TranslateApplicableWelford::TranslateApplicableWelford(nvfuser::Fusion*, nvfuser::KernelArgumentHolder const&) /opt/pytorch/Fuser/csrc/fusion_segmenter.cpp:2781
    #12 0x7fefaccdafc0 in nvfuser::TranslateApplicableWelford::run(nvfuser::Fusion*, nvfuser::KernelArgumentHolder const&) /opt/pytorch/Fuser/csrc/fusion_segmenter.cpp:2720
    #13 0x7fefaccdafc0 in nvfuser::SegmentCandidateFinder::translateWelfordInFusion(nvfuser::Fusion*, nvfuser::KernelArgumentHolder const&) /opt/pytorch/Fuser/csrc/fusion_segmenter.cpp:3052
    #14 0x7fefacce09f8 in nvfuser::SegmentedFusion::fromCompleteFusion(std::unique_ptr<nvfuser::Fusion, std::default_delete<nvfuser::Fusion> >, nvfuser::SchedulerType, nvfuser::KernelArgumentHolder const&) /opt/pytorch/Fuser/csrc/fusion_segmenter.cpp:443
    #15 0x7fefaccee3e9 in nvfuser::SegmentCandidateFinder::segment(std::unique_ptr<nvfuser::Fusion, std::default_delete<nvfuser::Fusion> >, nvfuser::KernelArgumentHolder const&, nvfuser::SchedulerRuntimeInfo&) /opt/pytorch/Fuser/csrc/fusion_segmenter.cpp:2031
    #16 0x7fefae11e6dc in nvfuser::python_frontend::SegmentationState::setupSegmentation(nvfuser::Fusion*, std::unordered_map<nvfuser::Val const*, long, std::hash<nvfuser::Val const*>, std::equal_to<nvfuser::Val const*>, std::allocator<std::pair<nvfuser::Val const* const, long> > > const&, nvfuser::KernelArgumentHolder const&) /opt/pytorch/Fuser/csrc/python_frontend/segmentation.cpp:90
    #17 0x7fefae0df4a6 in nvfuser::python_frontend::FusionDefinition::setupSegmentation(nvfuser::KernelArgumentHolder const&) /opt/pytorch/Fuser/csrc/python_frontend/fusion_definition.cpp:803
    #18 0x7fefac02d018 in operator() /opt/pytorch/Fuser/csrc/python_frontend/python_bindings.cpp:1170
    #19 0x7fefac09cc2e in call_impl<long int, nvfuser::python_frontend::initNvFuserPythonBindings(PyObject*)::<lambda(nvfuser::python_frontend::FusionDefinition&, const pybind11::iterable&)>&, 0, 1, pybind11::detail::void_type> /usr/local/lib/python3.12/dist-packages/torch/include/pybind11/cast.h:1631
    #20 0x7fefac09cc2e in call<long int, pybind11::detail::void_type, nvfuser::python_frontend::initNvFuserPythonBindings(PyObject*)::<lambda(nvfuser::python_frontend::FusionDefinition&, const pybind11::iterable&)>&> /usr/local/lib/python3.12/dist-packages/torch/include/pybind11/cast.h:1599
    #21 0x7fefac09cc2e in operator() /usr/local/lib/python3.12/dist-packages/torch/include/pybind11/pybind11.h:278
    #22 0x7fefac09cc2e in _FUN /usr/local/lib/python3.12/dist-packages/torch/include/pybind11/pybind11.h:249
    #23 0x7fefabfe09f4 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) /usr/local/lib/python3.12/dist-packages/torch/include/pybind11/pybind11.h:971
    #24 0x58208e  (/usr/bin/python3.12+0x58208e) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #25 0x549184 in _PyObject_MakeTpCall (/usr/bin/python3.12+0x549184) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #26 0x5d73c8 in _PyEval_EvalFrameDefault (/usr/bin/python3.12+0x5d73c8) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #27 0x54cd31  (/usr/bin/python3.12+0x54cd31) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #28 0x5db55a in _PyEval_EvalFrameDefault (/usr/bin/python3.12+0x5db55a) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #29 0x54cd93  (/usr/bin/python3.12+0x54cd93) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #30 0x54b3b4 in PyObject_Call (/usr/bin/python3.12+0x54b3b4) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #31 0x5db55a in _PyEval_EvalFrameDefault (/usr/bin/python3.12+0x5db55a) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #32 0x54aa99 in _PyObject_Call_Prepend (/usr/bin/python3.12+0x54aa99) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #33 0x5a3627  (/usr/bin/python3.12+0x5a3627) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #34 0x54924d in _PyObject_MakeTpCall (/usr/bin/python3.12+0x54924d) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #35 0x5d73c8 in _PyEval_EvalFrameDefault (/usr/bin/python3.12+0x5d73c8) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #36 0x54aa99 in _PyObject_Call_Prepend (/usr/bin/python3.12+0x54aa99) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #37 0x5a3627  (/usr/bin/python3.12+0x5a3627) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #38 0x54b30b in PyObject_Call (/usr/bin/python3.12+0x54b30b) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #39 0x5db55a in _PyEval_EvalFrameDefault (/usr/bin/python3.12+0x5db55a) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #40 0x54aa99 in _PyObject_Call_Prepend (/usr/bin/python3.12+0x54aa99) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #41 0x5a3627  (/usr/bin/python3.12+0x5a3627) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #42 0x54924d in _PyObject_MakeTpCall (/usr/bin/python3.12+0x54924d) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #43 0x5d73c8 in _PyEval_EvalFrameDefault (/usr/bin/python3.12+0x5d73c8) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #44 0x54aa99 in _PyObject_Call_Prepend (/usr/bin/python3.12+0x54aa99) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #45 0x5a3627  (/usr/bin/python3.12+0x5a3627) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #46 0x54924d in _PyObject_MakeTpCall (/usr/bin/python3.12+0x54924d) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #47 0x5d73c8 in _PyEval_EvalFrameDefault (/usr/bin/python3.12+0x5d73c8) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #48 0x54aa99 in _PyObject_Call_Prepend (/usr/bin/python3.12+0x54aa99) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #49 0x5a3627  (/usr/bin/python3.12+0x5a3627) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #50 0x54924d in _PyObject_MakeTpCall (/usr/bin/python3.12+0x54924d) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #51 0x5d73c8 in _PyEval_EvalFrameDefault (/usr/bin/python3.12+0x5d73c8) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #52 0x5d58ea in PyEval_EvalCode (/usr/bin/python3.12+0x5d58ea) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #53 0x608b41  (/usr/bin/python3.12+0x608b41) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #54 0x6b4e92  (/usr/bin/python3.12+0x6b4e92) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #55 0x6b4bf9 in _PyRun_SimpleFileObject (/usr/bin/python3.12+0x6b4bf9) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #56 0x6b4a2e in _PyRun_AnyFileObject (/usr/bin/python3.12+0x6b4a2e) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #57 0x6bca94 in Py_RunMain (/usr/bin/python3.12+0x6bca94) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #58 0x6bc57c in Py_BytesMain (/usr/bin/python3.12+0x6bc57c) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)
    #59 0x7ff285c121c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #60 0x7ff285c1228a in __libc_start_main_impl ../csu/libc-start.c:360
    #61 0x657ce4 in _start (/usr/bin/python3.12+0x657ce4) (BuildId: 37451b37c71cb46f8ccb27cb3cdbb7aa004b9987)

0x514000cfe040 is located 0 bytes inside of 448-byte region [0x514000cfe040,0x514000cfe200)
freed by thread T0 here:
    #0 0x7ff2860325e8 in operator delete(void*, unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:164
    #1 0x7fefad080978 in std::default_delete<nvfuser::Val>::operator()(nvfuser::Val*) const /usr/include/c++/13/bits/unique_ptr.h:99
    #2 0x7fefad080978 in std::__uniq_ptr_impl<nvfuser::Val, std::default_delete<nvfuser::Val> >::reset(nvfuser::Val*) /usr/include/c++/13/bits/unique_ptr.h:211
    #3 0x7fefad080978 in std::__uniq_ptr_impl<nvfuser::Val, std::default_delete<nvfuser::Val> >::operator=(std::__uniq_ptr_impl<nvfuser::Val, std::default_delete<nvfuser::Val> >&&) /usr/include/c++/13/bits/unique_ptr.h:191
    #4 0x7fefad080978 in std::__uniq_ptr_data<nvfuser::Val, std::default_delete<nvfuser::Val>, true, true>::operator=(std::__uniq_ptr_data<nvfuser::Val, std::default_delete<nvfuser::Val>, true, true>&&) /usr/include/c++/13/bits/unique_ptr.h:243
    #5 0x7fefad080978 in std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >::operator=(std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >&&) /usr/include/c++/13/bits/unique_ptr.h:414
    #6 0x7fefad080978 in std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >* std::__copy_move_backward<true, false, std::random_access_iterator_tag>::__copy_move_b<std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*>(std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*) /usr/include/c++/13/bits/stl_algobase.h:732
    #7 0x7fefad080978 in std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >* std::__copy_move_backward_a2<true, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*>(std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*) /usr/include/c++/13/bits/stl_algobase.h:769
    #8 0x7fefad080978 in std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >* std::__copy_move_backward_a1<true, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*>(std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*) /usr/include/c++/13/bits/stl_algobase.h:778
    #9 0x7fefad080978 in __gnu_cxx::__enable_if<std::__is_random_access_iter<std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::iterator_traits<std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*>::iterator_category>::__value, std::_Deque_iterator<std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >&, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*> >::__type std::__copy_move_backward_a1<true, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> > >(std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*, std::_Deque_iterator<std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >&, std::unique_ptr<nvfuser::Val, std::default_delete<nvfuser::Val> >*>) /usr/include/c++/13/bits/deque.tcc:1189

previously allocated by thread T0 here:
    #0 0x7ff286031548 in operator new(unsigned long) ../../../../src/libsanitizer/asan/asan_new_delete.cpp:95
    #1 0x7fefadeb46d7 in nvfuser::TensorView* nvfuser::IrBuilder::clone<nvfuser::TensorView>(nvfuser::TensorView const*, nvfuser::IrCloner*) /opt/pytorch/Fuser/csrc/ir/cloner.h:183

SUMMARY: AddressSanitizer: heap-use-after-free /opt/pytorch/Fuser/csrc/ir/cloner.cpp:49 in nvfuser::IrCloner::handle(nvfuser::Statement const*)
Shadow bytes around the buggy address:
  0x514000cfdd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x514000cfde00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x514000cfde80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x514000cfdf00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x514000cfdf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x514000cfe000: fa fa fa fa fa fa fa fa[fd]fd fd fd fd fd fd fd
  0x514000cfe080: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x514000cfe100: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x514000cfe180: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x514000cfe200: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x514000cfe280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==86448==ABORTING
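
The "freed by" stack above goes through `std::unique_ptr<nvfuser::Val>` move-assignment inside a `std::deque`, which is consistent with an element being removed from an owning deque while a raw pointer to it was still held elsewhere (the stack is truncated, so this is a reading of the report, not a confirmed root cause). Below is a minimal sketch of that general pattern. `Node`, `g_live_nodes`, and `rawPointerDanglesAfterErase` are hypothetical names for illustration, not nvFuser code, and the sketch deliberately never dereferences the stale pointer:

```cpp
#include <cassert>
#include <deque>
#include <memory>

// Hypothetical stand-in for an IR node (not an nvFuser type). A global
// counter tracks live instances so that destruction can be observed
// without touching freed memory.
int g_live_nodes = 0;

struct Node {
  int id;
  explicit Node(int i) : id(i) { ++g_live_nodes; }
  ~Node() { --g_live_nodes; }
};

// Returns true if erasing an element from the owning deque destroyed a
// Node while the caller was still holding a raw pointer to it.
bool rawPointerDanglesAfterErase() {
  std::deque<std::unique_ptr<Node>> owned;  // the container owns the nodes
  for (int i = 0; i < 4; ++i) {
    owned.push_back(std::make_unique<Node>(i));
  }

  Node* saved = owned[1].get();  // non-owning raw pointer kept on the side
  assert(saved != nullptr);
  const int live_before = g_live_nodes;

  // erase() shifts neighbouring elements with unique_ptr move-assignment;
  // the Node that owned[1] pointed to is deleted in the process.
  owned.erase(owned.begin() + 1);

  // `saved` still holds the old address. Dereferencing it here would be
  // the kind of heap-use-after-free that ASan reports, so it is only
  // named, never read through.
  (void)saved;
  return g_live_nodes == live_before - 1;
}
```

Calling `rawPointerDanglesAfterErase()` from a `main` and building with `-fsanitize=address` would stay clean, because nothing reads through `saved`; adding a read such as `saved->id` after the `erase` is what would turn this into a report like the one above.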
