libroadrunner==2.2.0 results in segmentation faults on almost all my models/workflows #146

Open
matthiaskoenig opened this issue May 10, 2022 · 0 comments
Labels: bug
I am not sure what changed since around mid-December, but the latest libroadrunner==2.2.0 and libroadrunner-experimental==2.2.0 result in segmentation faults and dying Linux kernels in most of my workflows/simulations.

For example, I get crashes such as:

*** SIGSEGV received at time=1644920070 on cpu 10 ***
PC: @     0x7fcad4f2851e  (unknown)  std::default_delete<>::operator()()
    @     0x7fcb388d93c0  1075403472  (unknown)
    @     0x7fcad4f26bde         64  std::unique_ptr<>::~unique_ptr()
    @     0x7fcad4f6c636         32  rrllvm::Jit::~Jit()
    @     0x7fcad4f80250         32  rrllvm::MCJit::~MCJit()
    @     0x7fcad4f8026c         32  rrllvm::MCJit::~MCJit()
    @     0x7fcad4efcb16         32  std::default_delete<>::operator()()
    @     0x7fcad4efc2be         64  std::unique_ptr<>::~unique_ptr()
    @     0x7fcad4f45f90        848  rrllvm::ModelResources::~ModelResources()
    @     0x7fcad4edc728         48  std::_Sp_counted_ptr<>::_M_dispose()
    @     0x7fcad4dd8c2d        128  std::_Sp_counted_base<>::_M_release()
    @     0x7fcad4dcf3fb         32  std::__shared_count<>::~__shared_count()
    @     0x7fcad4ed605c         32  std::__shared_ptr<>::~__shared_ptr()
    @     0x7fcad4ed6078         32  std::shared_ptr<>::~shared_ptr()
    @     0x7fcad4ec7f23        432  rrllvm::LLVMExecutableModel::~LLVMExecutableModel()
    @     0x7fcad4ec7f86         32  rrllvm::LLVMExecutableModel::~LLVMExecutableModel()
    @     0x7fcad4e4bc9f         32  std::default_delete<>::operator()()
    @     0x7fcad4e4810a         64  std::unique_ptr<>::~unique_ptr()
    @     0x7fcad4e46453        448  rr::RoadRunnerImpl::~RoadRunnerImpl()
    @     0x7fcad4e15dce         48  rr::RoadRunner::~RoadRunner()
    @     0x7fcad4e15dfa         32  rr::RoadRunner::~RoadRunner()
    @     0x7fcad4d7c962        112  _wrap_delete_RoadRunner
    @     0x7fcad4d498c0        144  SwigPyObject_dealloc
    @           0x532b95  (unknown)  (unknown)
    @           0x8feca0  (unknown)  (unknown)
[2022-02-15 11:14:30,802 E 2063541 2063541] logging.cc:317: *** SIGSEGV received at time=1644920070 on cpu 10 ***
[2022-02-15 11:14:30,802 E 2063541 2063541] logging.cc:317: PC: @     0x7fcad4f2851e  (unknown)  std::default_delete<>::operator()()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcb388d93c0  1075403472  (unknown)
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f26bde         64  std::unique_ptr<>::~unique_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f6c636         32  rrllvm::Jit::~Jit()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f80250         32  rrllvm::MCJit::~MCJit()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f8026c         32  rrllvm::MCJit::~MCJit()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4efcb16         32  std::default_delete<>::operator()()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4efc2be         64  std::unique_ptr<>::~unique_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4f45f90        848  rrllvm::ModelResources::~ModelResources()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4edc728         48  std::_Sp_counted_ptr<>::_M_dispose()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4dd8c2d        128  std::_Sp_counted_base<>::_M_release()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4dcf3fb         32  std::__shared_count<>::~__shared_count()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4ed605c         32  std::__shared_ptr<>::~__shared_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4ed6078         32  std::shared_ptr<>::~shared_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4ec7f23        432  rrllvm::LLVMExecutableModel::~LLVMExecutableModel()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4ec7f86         32  rrllvm::LLVMExecutableModel::~LLVMExecutableModel()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e4bc9f         32  std::default_delete<>::operator()()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e4810a         64  std::unique_ptr<>::~unique_ptr()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e46453        448  rr::RoadRunnerImpl::~RoadRunnerImpl()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e15dce         48  rr::RoadRunner::~RoadRunner()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4e15dfa         32  rr::RoadRunner::~RoadRunner()
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4d7c962        112  _wrap_delete_RoadRunner
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @     0x7fcad4d498c0        144  SwigPyObject_dealloc
[2022-02-15 11:14:30,803 E 2063541 2063541] logging.cc:317:     @           0x532b95  (unknown)  (unknown)
[2022-02-15 11:14:30,804 E 2063541 2063541] logging.cc:317:     @           0x8feca0  (unknown)  (unknown)
Fatal Python error: Segmentation fault

I am pretty sure this is related to sys-bio/roadrunner#925 (merged at the beginning of January), i.e. the internal parallelization.
Please, please provide a roadrunner without any internal parallelization, i.e. one that runs in a single Python thread on a single core! Internal parallelization will create issues with any multiprocessing on clusters.
The current libroadrunner==2.2.0 is not working for me at all. This is a big issue, because it currently breaks the scripts of all my students.
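
To check which scripts are affected without taking down the calling process, a crash can be isolated in a child interpreter. The following is only a minimal sketch using the Python standard library on Linux; the helper name `run_isolated` is hypothetical, and the roadrunner snippet in the comment uses a placeholder model path, not one of my actual models.

```python
import signal
import subprocess
import sys
from typing import Tuple


def run_isolated(code: str, timeout: float = 60.0) -> Tuple[bool, str]:
    """Run a Python snippet in a child process.

    Returns (crashed_with_sigsegv, stderr). Hypothetical helper, POSIX only:
    a child killed by a signal reports a negative return code.
    """
    proc = subprocess.run(
        # -X faulthandler prints the Python-level traceback on SIGSEGV
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    crashed = proc.returncode == -signal.SIGSEGV
    return crashed, proc.stderr


# Example snippet to test (assumption: "model.xml" is a placeholder SBML file):
#   import roadrunner
#   r = roadrunner.RoadRunner("model.xml")
#   r.simulate(0, 10, 100)
#   del r  # the trace above shows the crash inside the RoadRunner destructor
```

Since the stack trace points at `rr::RoadRunner::~RoadRunner()`, the `del r` at the end of the snippet is the step to watch; the `stderr` returned by the helper then contains the faulthandler traceback for the failing line.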

See also sys-bio/roadrunner#963.
