Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup of https://github.com/ros2/rclcpp/pull/2683 #2714

Open
wants to merge 3 commits into
base: rolling
Choose a base branch
from

Conversation

jmachowinski
Copy link
Contributor

We did a quick fix of #2664 in order to get it merged fast in jazzy.

This is the proper fix for the issue reported in #2664

@jmachowinski
Copy link
Contributor Author

@mjcarroll @wjwwood @fujitatomoya @alsora
While doing the cleanup I found some other issues in the code. We should definitely backport the double adding of the notifier the jazzy....

@jmachowinski
Copy link
Contributor Author

jmachowinski commented Dec 18, 2024

There is still one issue going on, were I don't know if its a bug in the rmw or not.
The reason why I don't know if it is a bug or not, is that it is not clearly specified what
the behavior is if I add an entity twice to a waitset.

If you add a guard condition twice to a rcl_waitsets, the behavior is odd.
If the guard condition is triggered, only the first entry in the rcl_waitset is set, the second is nullptr.

The problem here, is that this breaks code like this :

  rcl_guard_condition_t * cond = &guard_condition_->get_rcl_guard_condition();

  size_t idx1;
  rcl_wait_set_add_guard_condition(&wait_set, cond, &idx1);

  size_t idx2;
  rcl_wait_set_add_guard_condition(&wait_set, cond, &idx2);

  guard_condition_->trigger();

  rcl_wait(&wait_set, 1);
  
  if(wait_set.guard_conditions[idx1] != nullptr)
  {
    // do something
  }

 // this entry will be a nullptr, even though it should not.
  if(wait_set.guard_conditions[idx2] != nullptr)
  {
    // do something
  }

@clalancette
Copy link
Contributor

If you add a guard condition twice to a rcl_waitsets, the behavior is odd.

Which RMW are you testing with? While the behavior should in theory be consistent across them, they all implement it fairly differently.

@jmachowinski
Copy link
Contributor Author

@clalancette I used the default, so this should have been fastDDS.
The big question here is, what is the wanted behavior ?
If we all agree, that the second guard condition, should not be null as well, I can come up with a test and patch....

Janosch Machowinski added 3 commits January 9, 2025 18:33
Signed-off-by: Janosch Machowinski <[email protected]>
…ditions

A waitable should make sure, that all entities it adds to the waitset
are alive during the wait. The ExecutorNotifyWaitable did not do this,
leading to crashes, if a callback group was dropped while waiting.
This commit changes this behavior, by holding shared pointers to the
used guard condition.
Also while at it, fixed a possible race, were a trigger could get lost.
Optimized the is_ready call by using the indices returned by rcl.

Signed-off-by: Janosch Machowinski <[email protected]>
Comment on lines +97 to 98
const auto & guard_condition = guard_holder.strong_reference;
if (!guard_condition) {continue;}
Copy link
Member

@wjwwood wjwwood Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem right to me, are we not holding the strong_reference unconditionally? And if not should we not check if the weak one locks?

And it seems like we only use the weak_reference for comparison, then why not just have the strong reference, use it for comparison, and not have the GuardHolder at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, if we hold the shared_ptr, the lock of the weak_ptr will always work, and we don't need to do all of this, I will adjust the code.

@wjwwood
Copy link
Member

wjwwood commented Jan 10, 2025

The big question here is, what is the wanted behavior ?
If we all agree, that the second guard condition, should not be null as well, I can come up with a test and patch....

We discussed this in the ROS 2 PMC meeting as well as in the latest client library working group meeting, but for everyone else's benefit the conclusion was that adding the same entity (guard condition, timer, sub, etc...) to a wait set should be an error and we shouldn't require the rmw layer to allow it and leave both non-null. I believe the agreement was to make the check in the functions that add the entities to the wait set, but we may also try to avoid this in the calling code in rclcpp.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants