Partitions, topics, and ROS namespacing #132
@Karsten1987 may have more to say on this, but we're already implementing this in ros2/ros2#327. We've iterated on the design twice: once with the community, and once after figuring out that partitions might be a more useful way to handle the namespaces for topics. There was some discussion about it here: https://discourse.ros.org/t/ros2-and-dds-messaging/1556/5 I wanted to reply there but have not had the time; unless something comes up there, I don't foresee any serious changes that need to be made. The only place where we may need to adjust what we're doing is to support publishing and subscribing to non-ROS DDS topics, but I remain unsure how commonplace this will be, and so I'm not sure how much demand there will be to figure this case out. If people are interested in it, that may prompt us to make changes to support it. However, as the design article now states, I think that the way partitions are being used by our conventions right now, when paired with an option to skip the ROS specific prefix, will allow many DDS topics to be subscribed to and published to from ROS. If anything needs to change or adapt due to this, it will probably be the way we generate and use IDL files. Other than that case, I think this part of the system is relatively stable design wise. |
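For readers who have not gone through the design article, here is a rough sketch of the kind of mapping being discussed: the ROS namespace goes into the DDS partition and the base name becomes the DDS topic. The `rt` prefix, the function name `map_ros_name`, and the `skip_ros_prefix` flag are assumptions made for illustration only; they are not taken from the ROS 2 code base, and the real implementation may differ.

```python
from typing import Tuple

# Assumed ROS-specific prefix for topics; a hypothetical constant for this sketch.
ROS_TOPIC_PREFIX = "rt"


def map_ros_name(ros_name: str, skip_ros_prefix: bool = False) -> Tuple[str, str]:
    """Split a fully qualified ROS name into (DDS partition, DDS topic).

    The namespace tokens form the partition, the last token is the topic.
    With skip_ros_prefix=True the ROS-specific prefix is left out, which is
    the case that would let a node reach a "native" DDS topic.
    """
    if not ros_name.startswith("/"):
        raise ValueError("expected a fully qualified name starting with '/'")
    tokens = [t for t in ros_name.split("/") if t]
    if not tokens:
        raise ValueError("name has no base name")
    namespace, base_name = tokens[:-1], tokens[-1]
    parts = ([] if skip_ros_prefix else [ROS_TOPIC_PREFIX]) + namespace
    return "/".join(parts), base_name


if __name__ == "__main__":
    print(map_ros_name("/foo/bar/baz"))
    print(map_ros_name("/foo/bar/baz", skip_ros_prefix=True))
```

Under these assumptions, a plain DDS application would only need to join the matching partition and topic (and use a compatible type) to exchange data with a ROS 2 node.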
As for the partitions, we are in the final stages of getting the namespacing related pull requests reviewed and merged. During development, we didn't encounter any technical challenges using partitions on both supported DDS implementations, i.e. fastrtps and rti connext, so I am confident that partitions were the right decision for namespaces, and we are sticking to that decision. I am happy to hear you are looking forward to using ROS2. I hope you have already had a chance to test out the existing beta or master version of it. Any critical feedback is helpful. |
@spiderkeys Please be aware of the use case mentioned in the design doc about being able (or actually currently not being able) to communicate with "native" DDS topics (see http://design.ros2.org/articles/topic_and_service_names.html#communicating-with-non-ros-topics). If you plan to switch from RTI DDS to ROS 2 in a single step (all or nothing) that shouldn't be a problem. But if you ever want to mix native DDS participants and ROS 2 nodes there are currently two blockers:
I have tried to raise the concern about creating such a "ROS island" within the DDS world in the past (when the ROS specific prefixes were introduced). Maybe you can provide feedback from a user point of view on whether e.g. mixing a "native" DDS publisher with a ROS subscriber is a use case you would be interested in? |
@wjwwood @Karsten1987 @dirk-thomas Thanks for the info. Hearing a lot of this makes me wish I had involved myself in the DDS middleware work early on, rather than being passive and waiting for it to get implemented. I believe that integrating with native/plain DDS is going to be important in the continued success/adoption of ROS2 and I'll explain my personal context in this matter. When I initially saw that ROS2 was going to be using DDS, I was thrilled because, to me, it meant that all of the work I had been doing was going to be able to integrate with ROS2. This was about 2+ years ago when I was working on a distributed robotics framework at NASA Langley for a project called the Autonomy Incubator (now the Langley Autonomy and Robotics Center). We were using RTI Connext for that project as well, and some of the software in our system was built on top of ROS1. ROS1 did a pretty good job of handling singular entities/robots, simulations, etc, but of course did not really scale well or have all of the great reliability/discovery features that DDS brought to the table. We created a manual DDS/ROS bridge library to overcome that issue in our system, which generally worked well but was not automated, thus a pain to use. With ROS2 being built upon DDS, my expectation was that (eventually) we would be able to easily connect future ROS2 applications to our existing system and interfaces. Fast forwarding to now, I am currently working at a robotics company called OpenROV that is working on an underwater drone called Trident. We are building our software on top of RTI DDS again, as it proved to be the most mature, stable, and high performance vendor implementation. We are doing some pretty cool things with it, especially around low latency video streaming between our vehicles and mobile devices. 
With ROS2 in the back of my mind as we started developing our software, I kept checking in now and then to see what the state of things was, though I must admit that I haven't tried out the alpha or beta builds enough to have much of an opinion on whether they are production ready (we simply took some of the public verbiage of it not being ready yet at face value). Since we are targeting consumer mobile platforms (Android, iOS, etc), having framework support for those platforms was crucial. RTI provided everything we needed here, including Java bindings for Android, so we were able to get MVP implementations up and running within a month. FastRTPS seemed to be poised to provide Android and iOS support as well, so we did some prototyping with it, but ran into several issues around video streaming and large messages (I still have an unanswered issue on their repo: eProsima/Fast-DDS#83), which, when coupled with the pain of getting all of that running using native bindings with the NDK, strengthened our decision to run with RTI. I have to give a shoutout to @esteve here, who has been doing some amazing work around getting ROS2 bindings running for the Android platform. We considered going this route and building our apps with ROS2 Android, but a few things left some doubts in our minds: 1.) FastRTPS did not seem to be functioning correctly for our needs, 2.) @esteve had put together support for RTI bindings, but there are still some licensing concerns here, as he had to make modifications to the RTI source to make it work, and 3.) we have toyed with utilizing some RTI only features for achieving high performance video and keeping our system upgradeable, so we were wary about potentially not having full control over the DDS layer at the outset.
Now, we find ourselves with a working system for a product that we will be shipping in the near future. We were operating on the following assumption, which we reached by reading the ROS2 design goals but which now seems to lack substance, since we had not closely followed the ROS2 implementation and discussions: Assumption: ROS2 would be able to integrate with existing DDS applications. After all, interoperability of data flows is what DDS is all about. Concern: With your above explanations of the state of the DDS mapping architecture in ROS2, it seems like it will be fine for us to add ROS2 types and QoS to our applications in order to allow them to connect to ROS2 applications, but it sounds like the reverse will not be true, since your idlgen rules and QoS subsets are going to be rather strict and limited in how they might allow ROS2 apps to interact with native DDS apps. In my opinion, allowing ROS2 to mix with existing DDS systems, rather than creating the "ROS-DDS island", is the right way to go for a few reasons:
One more concern I have is how ROS2 plans to handle "evolveability" of systems. RTI has implemented most, if not all, of the XTYPES specification, which provides extensible types and dynamic data, and I'm not sure where other vendors are at on this these days or what parts of this ROS2 plans to support. We were planning to use extensibility in our system as a way of not breaking backwards compatibility as we rev applications distributed across several platforms, vehicles, and consumer devices. I haven't been able to find much on how ROS2 is planning on dealing with evolving/extending types to achieve the same thing. All of this said, I do understand how and why you have reached the architecture and structure that you have. DDS is very complex and you will never gain widespread adoption if you scare off users by requiring them to learn and understand all of it. I do hope though that you will leave enough flexibility inside of the ROS2 island to let your programs play with ours. I don't think the QoS/partitioning strategy is going to be a huge deal, but I do think that the message generation scheme is going to cause some difficulties. Maybe you can have an idlgen process which is separate from the rosmsg -> idl pipeline, and expose some kind of non-ROS Publisher/Subscriber abstraction that can utilize those standalone types (and QoS). Perhaps a wrapper API with an interface similar to FastRTPS's, which lets you use the simple pub/sub abstraction or the underlying base entities? I would be happy to discuss this further and hear your thoughts about the use cases I've described. |
There wasn't a whole lot of design discussion around the type system as it sort of evolved from whatever we thought was possible at the time. It's not like this stuff is set in stone. We're willing to make changes to improve different use cases still, but it will be work in some cases.
I agree that this is an important use case, and that's why I spent so much time talking about it in the referenced design document, but it will really take someone continuously trying to integrate ROS 2 into a ROS 2 / pure DDS hybrid system to ensure that continues to work. That of course starts with figuring out the existing roadblocks the first time it is tried. Many users will not care about this feature, but I don't imagine anyone will actively not want this feature unless it prevents some other functionality or it causes the tools and interfaces to be annoying. The latter is sort of the case we're in now, where we have reasons for these extra conventions and if we drop those conventions then we have to find another way to address the original reasons for having the conventions.
That's a fair summary. However, I think the current state will actually cover many use cases, for example:
In both of those cases, other systems are conforming to our conventions, which might not always be possible or desired, but at least it is technically possible. This might be annoying to the system integrators or the vendors, but at least there is a path forward. It seems to me that it will always be the case that the system with more conventions for message types and topic name patterns (in this case ROS 2) will require the other system to yield to those conventions, or else those conventions need to be circumventable. If the "other" system has its own rules and conventions on top of DDS, then you might run into issues because it's not clear which is easier to adapt to the other.
We haven't decided yet. I don't think "use x-types" is a complete answer (to me x-types is a tool, but it doesn't help you figure out which types are compatible and/or how to handle changes to data structures automatically), and I don't think "make all fields optional" can be done in a performant way for users that care about that. So we have some work to do there yet. I have a plan in mind, but I haven't had the time to sit down and write it all out as a proposal. @dirk-thomas also might have more to say on this topic.
That's the intention, but we have to balance "ROS is easy to use" with "exposing all DDS QoS options and settings". This came up at the last ROSCon as well. I think it's possible, and potentially acceptable to me, that we just end up exposing all the DDS QoS settings more or less unchanged in the ROS 2 API. However, we've already started to find a few places where the DDS pattern isn't exactly to our liking. In those cases, one option is to just use it anyways to keep our code slim and take on the attitude of "well it works well enough for DDS users".
My guess is that the right strategy is to expose QoS settings as there is demand for them and people who think we're not moving fast enough in that direction can use the DDS API directly in the meantime. We've always planned to have a way to "reach" under our ROS objects to get the underlying, vendor-specific DDS objects so you can do whatever you want to them. "Reaching under" to the DDS objects is not ideal, since that breaks our vendor abstraction and makes code less generic, but I feel that this will only occur in rare cases, on the edges of systems. I know all of this is just me trying to justify that a ROS 2 island in a DDS ocean is not a bad thing, but I kind of feel that way. That being said, I want to make it as flexible as possible and as seamless as possible.
I agree that our translation from
That's basically what we have in @dirk-thomas said:
Actually I plan to add that in the pr I'm working on now. I was waiting for @Karsten1987's pr (ros2/ros2#327) to get it working before I looked at how to expose the option and disable it in special cases. |
@spiderkeys said:
@wjwwood said:
I think that I agree with @wjwwood's sentiments here. DDS is about data flows, but it provides no more support for interoperability than the concept of classes and APIs does in C++. It's up to the users of DDS to make their data types and topic names compatible if they want to be interoperable. ROS2 applies conventions on top of DDS to make it significantly more likely that DDS data flows using ROS2 are going to be interoperable. As I would need to do if I turned up on @spiderkeys's doorstep and asked to interoperate my DDS system with theirs, someone asking to interoperate another DDS system with ROS2 will need to, in some way, comply with the conventions of ROS2. How compliance is achieved is the question here, not whether or not ROS2 should be an island in the DDS ocean: Every system is an island in the DDS ocean based on its own topic and data type conventions. You could argue that ROS2 is a more isolated island due to abstracting many parts of DDS (hiding IDL, an abstracted API for QoS settings, etc.), but the result is (or should be) the same: a set of conventions that must be complied with to interoperate with DDS systems that happen to be ROS2-based. Whether compliance should be achieved by meeting in the middle, or whether the pure DDS user should need to go all the way to ROS2's conventions, is something that should be answered based on the impact on the complexity of using ROS2. |
Agreed. I will begin putting some effort into the integration of the two to help suss these roadblocks out.
Yes, I agree that it isn't a complete solution. Right now, our current plan is to try and adopt a policy of strictly "add or extend, but continue supporting all" when it comes to evolving message types, since we need to be able to support a number of client applications on different platforms, potentially at different versions than what is running on our vehicles. It isn't perfect, though, and I can easily see where cruft will build up, and we may sometimes be forced into situations where we end up with multiple readers/writers for divergent types. That, of course, ends up being sub-optimal from a performance and complexity viewpoint, but the only ways I can see to avoid it right now are to leverage extended types or dynamic data. Dynamic data has the worst performance, of course, and with extensible types you run into the issue you mentioned where you have to start dealing with optional fields. So far, optional fields (and leveraging the fact that floating point types have NaN values) have seemed like the lesser of the evils. I look forward to learning about your thoughts on navigating these challenges.
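To make the "add or extend" idea concrete, here is a minimal sketch of a type that gained a field in a later revision, with NaN standing in for "not provided". The type `DepthReportV2`, its field names, and the use of a plain dict as a stand-in for a deserialized sample are all hypothetical; this is not X-Types and not anything from our actual system.

```python
import math
from dataclasses import dataclass, fields


@dataclass
class DepthReportV2:
    """Hypothetical sample type; v1 only carried depth_m."""

    depth_m: float
    # Added in v2. NaN is used as the "field not provided" sentinel.
    water_temp_c: float = math.nan


def decode_v2(payload: dict) -> DepthReportV2:
    """Build a v2 sample from a dict that stands in for a deserialized message.

    Missing fields fall back to their defaults (older writers), and unknown
    fields are ignored (newer writers), so old and new peers keep working.
    """
    known = {f.name for f in fields(DepthReportV2)}
    return DepthReportV2(**{k: v for k, v in payload.items() if k in known})


def has_value(x: float) -> bool:
    """True when a float field carries real data rather than the NaN sentinel."""
    return not math.isnan(x)


if __name__ == "__main__":
    from_old_writer = decode_v2({"depth_m": 12.5})
    print(from_old_writer, has_value(from_old_writer.water_temp_c))
```

The cost mentioned above shows up in the reader: every consumer of `water_temp_c` has to check `has_value` before using it.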
I would advocate for this approach. I'm curious to hear which of the DDS policies wouldn't be desirable within ROS2, or at least don't seem to be worth the effort of adding them to the RMW layer. There are very few that I have not found use cases for within the context of robotics.
In my experience with DDS, the practice of implementing any desired concept/feature on top of pub/sub and the DDS design patterns is generally feasible and has been the path I've taken on a number of occasions. In a number of instances, RTI has been fairly helpful in coming up with clever ways to implement certain concepts using DDS constructs. I would be surprised if most of these problems couldn't be solved in some way on top of the features DDS provides (some of these solutions are coupled to using QoS effectively, another reason for why I would make sure all standard QoS policies are supported).
Yes, this would probably be my approach at the moment, if I were developing ROS2 applications that interfaced with existing systems. As long as the implementation provides a PSM-compliant DDS API, you can avoid vendor-specific portability issues. One thing I worry about here, though, is that while FastRTPS provides wire interoperability with DDS, it does not implement the DDS C++ PSM. Because of this, there isn't a portable way to programmatically work at the DDS layer. If RMW exposed a more complete subset of DDS and the standard APIs, then this wouldn't be an issue and there would be more flexibility on the ROS2 application side to interface with the outside world. (Edit: Although I have to be honest here, only RTI and PrismTech have gone through the effort of being compliant with DDS-PSM-CXX, as far as I can tell) Referencing ros2/rmw#51, @jacquelinekay writes:
This was back in 2015, so things may have changed, but is that last bit about keeping ROS2 open to other middlewares still a valid driver for keeping the QoS/DDS implementation small inside of RMW? If not, and ROS2 is fully committed to DDS now, perhaps it makes sense to continue fleshing out full support/abstraction. @gbiggs, I agree with your statements, but my main point in highlighting DDS's goal of data flow interoperability is that it is built upon the premise that if both sides know the types and know the contract (QoS), then they can connect without knowing anything else about each other. For a DDS-only developer with the flexibility of adding support for new types/QoS to their system, all is well, because ROS2 has made the information available and implements a relatively simple subset of QoS. Any DDS implementation gives that developer access to all of the APIs and tools they require to make the connections to ROS2 applications, so they are able to do so. On the other side of the fence, ROS2 only provides a portable way to talk to other ROS2 applications, and there currently is not a good way of talking to the existing applications, as the core functionality for working with non-rosmsg-generated types, topics, or partitions (though there are designs to cover the last two) is not implemented/exposed, and some QoS policies are not implemented, based on my current understanding.
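As a rough illustration of the "know the types and know the contract" premise, the sketch below models two endpoints matching when their topic name, type name, and requested-versus-offered QoS line up. The class and function names are made up for this example, only reliability and durability are modeled, and partition matching is left out; a real DDS implementation checks more policies than this.

```python
from dataclasses import dataclass
from enum import IntEnum


class Reliability(IntEnum):
    BEST_EFFORT = 0
    RELIABLE = 1


class Durability(IntEnum):
    VOLATILE = 0
    TRANSIENT_LOCAL = 1


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical description of a DDS writer or reader."""

    topic: str
    type_name: str
    reliability: Reliability
    durability: Durability


def endpoints_match(writer: Endpoint, reader: Endpoint) -> bool:
    """Requested-vs-offered check: the writer must offer at least what the reader requests."""
    return (
        writer.topic == reader.topic
        and writer.type_name == reader.type_name
        and writer.reliability >= reader.reliability
        and writer.durability >= reader.durability
    )


if __name__ == "__main__":
    writer = Endpoint("video", "Frame", Reliability.BEST_EFFORT, Durability.VOLATILE)
    reader = Endpoint("video", "Frame", Reliability.RELIABLE, Durability.VOLATILE)
    # A reliable reader does not match a best-effort writer.
    print(endpoints_match(writer, reader))
```

Nothing in this check depends on which framework produced either endpoint, which is the point being made about DDS-only developers being able to adapt to ROS2.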
Of course, this is the crux of the issue, as all of the conversations above demonstrate. There is nothing inherently stopping ROS2 from being able to use non-ROS types, topics, and QoS; it's simply a matter of managing ROS2 complexity and putting in the time to do work in RMW to support more common interfaces. It is completely understandable that, until there is strong enough demand or enough support provided in doing so, the burden falls on the DDS user to play by ROS2 rules. That said, I would like to do what I can to start helping provide support for bridging that gap, if there is a clear path forward for what kinds of decisions need to be made and if there is an idea for what form this "pure DDS API" should take, if that is even the best route to go. Clearly, one step here is to start cooking up some ROS2 programs that attempt to integrate with our existing applications, to start identifying pain points in the process, which I will begin doing over the coming months. |
OK, I can see what you're saying now, and I agree with it.
This is something I would like to help out with, but I don't have a decent complex DDS system available here to drive the work for me. |
We discussed this again in our weekly meeting. If I can speak for the group (@ros2/team), we all continue to believe that this is a useful use case (interfacing a pure DDS system with ROS 2) and to that end that we should make sure it remains as easy as possible and as performant as possible to do this kind of integration. That means testing it and adding features or adjusting the implementation to make it easier over time. So what we're going to propose during our next planning meeting is to add a demo to beta 3 (roughly July-September) that demonstrates this kind of integration. It doesn't have to be a perfect solution, but by having a demo that gets compiled and tested with our other code we can know when a change is going to break that and it can serve as a place to try and test out improvements to make the use case work better. We haven't laid out the specific goals for this demo yet, so if anyone has any ideas for what would be compelling please offer them here. Personally, what I imagined for at least one part of the demo would be to write a camera driver (maybe like the OpenCV based ones we already have) using only Fast-RTPS or Connext (or both) and create a custom
It's not so much that they are undesirable, but that they might not be useful very frequently or even if they would be useful there's a reasonable workaround that doesn't require the special feature. I think ROS 1 got a long way with just reliable, queue size, and single message latching. Obviously I think we should have more features than that, but some of the QoS settings in DDS appear to be very niche. These are the QoS we currently expose more or less directly:
We've noticed some inconsistencies with how Durability is matched, which makes it not work like ROS 1 and in our opinions makes it less useful. So this is where we need to decide whether to just expose the DDS feature and let others get used to it or if we should build up our own concept, using DDS's feature to make it work well. On my list of "probably should be exposed in ROS 2 API, but it hasn't been requested or we haven't had time":
On my list of "could be useful but not clear there is need/demand for it (yet)" QoS settings:
There are yet more things we are controlling directly so that we can implement certain features in ROS 2 (may be exposed in part or through a different pattern but not directly):
I think the "add/expose them as needed to the ROS 2 API" approach is best here, because each one we expose increases the coupling to DDS and increases the surface area of our interface, which we have to document, teach, and test.
A goal since the beginning has been to keep DDS specific symbols (C/C++) out of the ROS 2 API. We are, however, still committed to "ROS 2" requiring that DDSI-RTPS is used on the wire and SPDP / SEDP being used for discovery over multicast-UDP. At least that's still the plan right now. If others want to put something else under the hood then it would be something different, like "ROS 2 with ZMQ and Friends" or "ROS 2 with OPC-UA" and those would not be "compatible" with just plain "ROS 2", but may have value on their own. We already have some other groups interested in replacing the DDS subsystem with other systems, whether it be a domain specific technology like OPC-UA (see this) / SOME/IP / AutoSAR / etc... or a "local" Part of the point of the I'm not saying I personally want to replace DDS with something else anytime soon, but I think adding that layer of insulation, conceptually as well as technically, is a good thing to do. Because in principle, something like
I don't see it as our mission to provide this kind of "actually portable" DDS api. But if that's a necessary byproduct of this work, then that's fine. |
I agree with this assessment, and would posit that these QoS are much less likely to be used in either ROS2 or general DDS applications. Of the six, I believe I have only used Presentation (for an experiment involving efficient H264 transmission, since abandoned), and Entity Factory (long enough ago that I don't remember why). As far as durability/durability service goes, I think that volatile and transient-local should satisfy most use cases. Otherwise, ROS2 would have to implement its own "Persistence Service", which sounds like much more effort than it's worth at this stage, lacking any clear demand.
Agreed. I'm getting a better idea for the design goals now, and would say that your approach of ensuring the availability/implementation of a set of communication concepts, insulated from the underlying middleware, is a mindset that is in the best interest of the ROS2 in the long run, even if DDS is technically adequate now.
True, I wouldn't expect it to be your goal either, though I have a feeling that it will naturally happen as a result of trying to achieve the integration goals touched on here.
How about a camera driver that encapsulates any generic, UVC-compatible camera? For our system, I've developed a "camera server" that provides some services around the cameras we use in our system, including:
I could help put this together using either FastRTPS or Connext, though the last time I tried using FastRTPS, I was unable to successfully create an asynchronous writer putting out data that was regularly larger than the 64k UDP max framesize. Maybe this is fixed now, though, or if it isn't, maybe this demo could help to push on any remaining roadblocks in that area. |
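For context on the 64k limit, here is a small sketch of why large samples such as video frames need fragmentation: a single UDP datagram over IPv4 carries at most 65507 bytes of payload, so anything bigger has to be split and reassembled. This is a generic illustration with hypothetical function names, not how Fast-RTPS or Connext implement it; the per-fragment header a real protocol would need is not counted here.

```python
from typing import List, Tuple

# 65535 (max IPv4 packet) - 20 (IPv4 header) - 8 (UDP header)
MAX_UDP_PAYLOAD = 65507

# (fragment index, total fragment count, data)
Fragment = Tuple[int, int, bytes]


def fragment(sample: bytes, max_fragment: int = MAX_UDP_PAYLOAD) -> List[Fragment]:
    """Split a sample into chunks no larger than max_fragment bytes."""
    if max_fragment <= 0:
        raise ValueError("max_fragment must be positive")
    # Ceiling division; an empty sample still yields one (empty) fragment.
    total = max(1, -(-len(sample) // max_fragment))
    return [
        (i, total, sample[i * max_fragment:(i + 1) * max_fragment])
        for i in range(total)
    ]


def reassemble(fragments: List[Fragment]) -> bytes:
    """Rebuild the sample from fragments that may arrive out of order."""
    if not fragments:
        raise ValueError("no fragments given")
    ordered = sorted(fragments, key=lambda f: f[0])
    total = ordered[0][1]
    if [f[0] for f in ordered] != list(range(total)):
        raise ValueError("missing or duplicate fragment")
    return b"".join(f[2] for f in ordered)


if __name__ == "__main__":
    frame = bytes(200_000)  # stand-in for one encoded video frame
    pieces = fragment(frame)
    print(len(pieces), reassemble(pieces) == frame)
```

A demo that regularly publishes samples above this size would exercise exactly the asynchronous-writer path described above.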
I've been reading up on the topic and service mapping article, having seen that the partition PR landed back in March: https://github.com/ros2/design/blob/gh-pages/articles/140_topic_and_service_name_mapping.md
I wanted to feel out how "locked in" this design decision is, as we are building a product on top of RTI DDS with the hopes of running full blown ROS 2.0 once it gets out of beta. To that end, we are trying to make sure that we make decisions around partitioning and topic names that are forward thinking and in line with what you are doing. The concepts settled upon seem very sound to me and I can't find any red flags or deal breakers.
Any expectations of there being major changes to this part of the architecture beyond this point?