
Deadlock when using user namespace (probably musl) #12514

Closed
@plietar


Describe the bug

When running Nix with a chroot store, i.e. using Linux user namespaces, I occasionally get a deadlock just as the build is about to start.

I was able to capture stack traces by attaching gdb to the stuck process.
There are two nix processes, one that I had originally run and one fork of it.

The parent process is stuck waiting for the child:

#0  0x00000000015ff945 in __cp_end ()
#1  0x00000000015fcd39 in __syscall_cp_c ()
#2  0x00000000015f2a32 in waitpid ()
#3  0x0000000000895423 in nix::Pid::wait() ()
#4  0x0000000000863bbd in nix::userNamespacesSupported()::{lambda()#1}::operator()() const [clone .isra.0] ()
#5  0x0000000000863dcd in nix::userNamespacesSupported() ()
#6  0x0000000000863ea8 in nix::mountAndPidNamespacesSupported() ()
#7  0x0000000000f5c9eb in nix::LocalDerivationGoal::tryLocalBuild(nix::LocalDerivationGoal::tryLocalBuild()::_ZN3nix19LocalDerivationGoal13tryLocalBuildEv.Frame*) [clone .actor] ()
#8  0x0000000000e14f08 in nix::Goal::work() ()
#9  0x0000000000e24254 in nix::Worker::run(std::set<std::shared_ptr<nix::Goal>, nix::CompareGoalPtrs, std::allocator<std::shared_ptr<nix::Goal> > > const&) ()
#10 0x0000000000e0f52b in nix::Store::buildPathsWithResults(std::vector<nix::DerivedPath, std::allocator<nix::DerivedPath> > const&, nix::BuildMode, std::shared_ptr<nix::Store>) ()
#11 0x00000000014ea379 in nix::Installable::build2(nix::ref<nix::Store>, nix::ref<nix::Store>, nix::Realise, std::vector<nix::ref<nix::Installable>, std::allocator<nix::ref<nix::Installable> > > const&, nix::BuildMode) ()
#12 0x00000000014ec260 in nix::Installable::build(nix::ref<nix::Store>, nix::ref<nix::Store>, nix::Realise, std::vector<nix::ref<nix::Installable>, std::allocator<nix::ref<nix::Installable> > > const&, nix::BuildMode) ()
#13 0x0000000000584337 in CmdBuild::run(nix::ref<nix::Store>, std::vector<nix::ref<nix::Installable>, std::allocator<nix::ref<nix::Installable> > >&&) ()
#14 0x00000000014e80b2 in nix::InstallablesCommand::run(nix::ref<nix::Store>, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&&) ()
#15 0x00000000014dbcc2 in nix::RawInstallablesCommand::run(nix::ref<nix::Store>) ()
#16 0x00000000014bdff7 in nix::StoreCommand::run() ()
#17 0x000000000062fb67 in nix::mainWrapped(int, char**) ()
#18 0x000000000147b5bc in nix::handleExceptions(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void ()>) ()
#19 0x00000000004cf5d7 in main ()

And the child is stuck acquiring locks for memory allocation:

#0  0x00000000015fc86b in __lock ()
#1  0x00000000015e7472 in __libc_malloc_impl ()
#2  0x000000000152526c in operator new(unsigned long) ()
#3  0x0000000000864f34 in nix::makeSimpleLogger(bool) ()
#4  0x0000000000895d2a in std::_Function_handler<void (), nix::startProcess(std::function<void ()>, nix::ProcessOptions const&)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
#5  0x0000000000893d2e in nix::childEntry(void*) ()
#6  0x00000000015ff912 in __clone ()
#7  0x0000000028744f18 in ?? ()
#8  0x0086e00000000000 in ?? ()
#9  0x0000000000000000 in ?? ()

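My guess is that this is the classic fork-in-a-multithreaded-process deadlock: thread A acquires a lock (here, the one inside malloc), and thread B forks while thread A still holds it. The child inherits a copy of the lock in its locked state. Thread A later releases the lock in the parent, but thread A does not exist in the child, so the child's copy is never released and the child blocks forever as soon as it tries to allocate.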
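To illustrate the suspected mechanism, here is a minimal sketch. It is not Nix's code and not musl's allocator: `fake_malloc_lock` and `child_sees_held_lock` are hypothetical names, a plain pthread mutex stands in for the allocator's internal lock, and `fork()` stands in for the `clone()` call seen in the child's trace. The child uses `pthread_mutex_trylock` rather than a blocking lock so that the demonstration reports the problem instead of hanging.

```cpp
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <atomic>
#include <thread>

// Hypothetical stand-in for the allocator's internal lock.
static pthread_mutex_t fake_malloc_lock = PTHREAD_MUTEX_INITIALIZER;

// Forks while another thread holds the lock.
// Returns true if the child found its copy of the lock already held.
bool child_sees_held_lock() {
    std::atomic<bool> locked{false};
    std::atomic<bool> release{false};

    // "Thread A": takes the lock and keeps it across the fork.
    std::thread a([&] {
        pthread_mutex_lock(&fake_malloc_lock);
        locked = true;
        while (!release) usleep(1000);
        pthread_mutex_unlock(&fake_malloc_lock);
    });

    // Wait until thread A really holds the lock.
    while (!locked) usleep(1000);

    // "Thread B" (this thread): fork while the lock is held.
    pid_t pid = fork();
    if (pid < 0) {
        // fork failed: let thread A finish and report "not observed".
        release = true;
        a.join();
        return false;
    }
    if (pid == 0) {
        // Child: only the forking thread exists here, so nobody can ever
        // unlock this copy of the mutex. A blocking lock would hang forever;
        // trylock is used purely so the demo terminates. (Strictly, only
        // async-signal-safe calls are guaranteed to work at this point.)
        int rc = pthread_mutex_trylock(&fake_malloc_lock);
        _exit(rc == 0 ? 0 : 1);
    }

    // Parent: thread A releases the lock, which has no effect on the
    // child's separate copy of the memory.
    release = true;
    a.join();

    int status = 0;
    waitpid(pid, &status, 0);
    return WIFEXITED(status) && WEXITSTATUS(status) == 1;
}
```

Calling `child_sees_held_lock()` from a small `main` should report that the child saw the lock as held. If the child called `pthread_mutex_lock` instead, it would block the same way the child in the trace above blocks in `__lock`, and the parent would then block in `waitpid`.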

Metadata

$ nix --version
nix (Nix) 2.24.10

I'm running the static build of the nix command, since my host does not have a Nix store.

The version I used to capture those stack traces is a little outdated, but the relevant code (libutil/unix/processes.cc and libutil/linux/namespaces.cc) has barely changed since that version.

I've had a look through the Git log and open issues and did not find any mention of this.



Labels: bug, derivation-build