Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In debug mode an assertion in class loader fails when destoying the class loader #39

Closed
Rauldg opened this issue Oct 30, 2018 · 10 comments

Comments

@Rauldg
Copy link
Contributor

Rauldg commented Oct 30, 2018

Discovered trying to fix this other issue

The same minimal example, but compiled in debug mode creates this behavior:

At https://github.com/ros/class_loader/blob/0.3.7/src/class_loader_core.cpp#L497

It triggers:

install/include/boost/thread/pthread/recursive_mutex.hpp:113: void boost::recursive_mutex::lock(): Assertion `!pthread_mutex_lock(&m)' failed.
@arneboe
Copy link
Contributor

arneboe commented Oct 30, 2018

I am unable to reproduce the bug.
What boost version are you using? Can you upload a complete working example that I can use to test? Maybe there is something different between your minimal example implementation and mine.

@arneboe
Copy link
Contributor

arneboe commented Oct 30, 2018

ok, now am able to reproduce it :) Mistake on my side

@arneboe
Copy link
Contributor

arneboe commented Oct 30, 2018

This happens because pthread_mutex_lock returns EINVAL during tear down of the application.
According to the man page there are two reasons why that might happen:

  • The value specified by mutex does not refer to an initialized mutex object.
  • The mutex was created with the protocol attribute having the value PTHREAD_PRIO_PROTECT and the calling thread's priority is higher than the mutex's current priority ceiling.

I don't think that the second reason is very likely. It seems like the mutex is accessed after it has been destroyed. I have no idea how/why though

@arneboe
Copy link
Contributor

arneboe commented Oct 30, 2018

Ok, I poked at it some more:

all boost::recursive_locks are destroyed by the libc exit_handler. After that happens the class_loader tries to lock them. I do not understand why libc calls the recursive_lock destructor, though. It shouldn't....

@arneboe
Copy link
Contributor

arneboe commented Oct 30, 2018

Found it:

This is in class_loader_core.cpp

boost::recursive_mutex & getLoadedLibraryVectorMutex()
/*****************************************************************************/
{
  static boost::recursive_mutex m;
  return m;
}

The mutex is static local while the class_loader is static. The class_loader is created before the mutex. Thus the mutex is destroyed before the class_loader. The class_loader tries to access the mutex during destruction and obviously fails. This is something that should be fixed upstream (might very well be, we are on a really old version of the class_loader).

@Rauldg
Copy link
Contributor Author

Rauldg commented Oct 31, 2018

I don't think they changed this in their newer versions.

See:
https://github.com/ros/class_loader/blob/melodic-devel/src/class_loader_core.cpp#L48

Actually, there is some work going on trying to improve this but is an open issue:
ros/class_loader#33

@arneboe
Copy link
Contributor

arneboe commented Nov 1, 2018

Ok this is more complex than I though.

This is not an upstream bug. The upstream class_loader works fine as long as it is not statically allocated.

What happens is:

(1) the class_loader is statically allocated (because we use base::Singleton)
(2) upon first usage of the class_loader its internal mutexes are statically allocated.
(3) after the main() ends the static memory is freed in the inverse order of creation, thus the mutexes are deleted first.
(4) now the class_loader is deleted, however inside the destructor it tries to access the mutexes and crashes because they have already been deleted.

My solution would be to manually delete the class_loader pointer at the end of main. I can do it in a hacky way for now but ultimately this needs an extension to base::Singleton

@arneboe
Copy link
Contributor

arneboe commented Nov 1, 2018

The bug has been fixed in branch https://github.com/envire/envire-envire_core/tree/fix-classLoader

I had to add a feature to base-logging thus for now this fork of base-logging is required:
https://github.com/envire/base-logging/tree/feature-add-destroy

@arneboe arneboe closed this as completed Nov 1, 2018
@saarnold
Copy link
Member

saarnold commented Nov 1, 2018

Thanks for investigating this issue! 👍

@arneboe
Copy link
Contributor

arneboe commented Nov 1, 2018

We should document somewhere that one needs to manually destroy the ClassLoader before the program ends, I guess?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants