-
Notifications
You must be signed in to change notification settings - Fork 116
fix #442: race conditions when updating PlanningScene state #716
Conversation
I don't use the Rviz plugin and don't run into these race conditions in my normal preferred workflow so don't fully understand the details at play here, but I'm glad you are tackling this. Hopefully @v4hn can help. Perhaps @jbohren has some insight since he's improved MoveIt! race conditions in the past. I'll continue to do high-level reviews. |
Proof-of-concept implementation on top of #713. |
unlocking needs to be performed in reverse order of locking otherwise deadlocks are risked
@v4hn I cleaned up the commit history. Please review now. |
TODOs: |
- unlock mutexes in reverse order of locking - reformatting - PlanningSceneMonitor::syncSceneUpdates(ros::Time): sync all scene updates upto given time - syncSceneUpdates() after execution of trajectories & before fetching current robot state - sync scene update before starting planning - validate trajectory before execution
Python API works without sleeps like a charm. |
Can you rename the subject of the PR to something more descriptive than work-in-progress for future reference? and so to ensure its ready for merging ;-) |
Also, with 8 commits in this PR its my experience that reviewers request they be merged into fewer (preferably 1). |
ros::CallbackQueue callback_queue_; | ||
boost::scoped_ptr<ros::AsyncSpinner> spinner_; | ||
ros::Time last_robot_motion_time_; /// Last time the robot has moved | ||
bool enforce_next_state_update_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its out of style to suddenly start aligning variable names, but I guess its not a big deal
@@ -139,7 +139,9 @@ void MotionPlanningFrame::computeExecuteButtonClicked() | |||
if (move_group_ && current_plan_) | |||
{ | |||
ui_->stop_button->setEnabled(true); // enable stopping | |||
bool success = move_group_->execute(*current_plan_); | |||
bool success = | |||
move_group_->validatePlan(*current_plan_) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of chaining these, i think it would be much nicer to have a console output explaining why validatePlan failed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the console output in validatePlan().
…ates upto given time
…current robot state
fixes raise conditions in updating of PlanningScene state - unlock mutexes in reverse order of locking - PlanningSceneMonitor::syncSceneUpdates(ros::Time): sync all scene updates upto given time - syncSceneUpdates() after execution of trajectories & before fetching current robot state - sync scene update before starting planning - validate trajectory before execution - final fixup #724
// start our own spinner listening on our own callback_queue to become independent of any global callback queue | ||
root_nh_.setCallbackQueue(&callback_queue_); | ||
spinner_.reset(new ros::AsyncSpinner(1 /* threads */, &callback_queue_)); | ||
spinner_->start(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alainsanguinetti Could you have a look at this commit w.r.t. your pull-request #701?
Starting a spinner in the monitor obviously takes control out of the user's hands again.
Even so, afaics the PlanningSceneMonitor always used the global callback queue, right?
I don't see a way to provide the nodehandles the psm uses...
I suppose at least we should provide the user with a way to override the control flow from the outside.
tf/tf2 implement a similar scheme and they use an optional parameter spin_thread
in the constructor, but this breaks ABI.
Is it enough to add the optional parameter in kinetic and leave indigo with the background thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is probably a nice fix. Maybe we should allow the user to specify how many threads he wants to use.
The user does not really see more than this since it is encapsulated in the move_group node, so it is probably more a matter of how many more threads does he allow on his computer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Wed, Aug 10, 2016 at 08:59:31PM -0700, Alain Sanguinetti wrote:
I think this is probably a nice fix.
I see two problems with this:
-
It should definitely be optional(enabled by default) because some people do not want to have unmanaged threads running on their systems.
See your very own request Added support for nodes handles with specific callbacks queues #701 for a recent example. -
For unclear reasons it has been explicitly disallowed to start ASyncSpinner's for different threads. See roscpp: Multiple AsyncSpinner behavior is confusing and precludes introspection ros/ros_comm#277
The commits in question are definitely faulty, but we are unsure about the details at the moment.
As a result this approach will break any code that instantiates a PlanningSceneMonitor in one thread and an(other)AsyncSpinner
in a different thread.
One example for this is RViz.An ugly workaround to this, that had been used in other places in MoveIt, is to require (without checking) a running
AsyncSpinner
in the background
when certain functions are called. See https://github.com/ros-planning/moveit/blob/kinetic-devel/moveit_ros/planning_interface/move_group_interface/include/moveit/move_group_interface/move_group.h#L619
If the bug in ros_comm would get fixed (also in indigo), we could remove that workaround.
Maybe we should allow the user to specify how many threads he wants to use.
I would prefer to have it fully optional. If, as an alternative, the user could specify a NodeHandle
for which he guarantees
that it will receive asyncronous updates this would include your use-case.
This wip branch is intended as joint collaboration branch to fix the race conditions in scene updates reported in #442. All developers are invited to contribute via additional PRs. I'm trying to centralize the discussion on this issue here.
Related PRs:
As pointed out in a comment of #671 the fundamental underlying reason might be that scene update messages pile up in the callback queue (due to sequential processing by a single spinner thread). Hence, not only pending joint state updates may be missed, but also other scene updates.
Hence I suggest to modify
planning_scene_monitor::LockedPlanningSceneRO
such that it waits until all scene updates withtimestamp < ros::Time::now()
are processed and only then returns the current planning scene. Only this will ensure, that all pending scene updates are considered at a given time.Proposed concrete changes:
Before starting along that route, I'm waiting for positive feedback from some developers. And of course, I hope for active contributions. Please announce before you start to tackle a sub task.