TCP connection lost: trying to reconnect #146

Open
gnssTim opened this issue Dec 18, 2024 · 11 comments

Comments

@gnssTim

gnssTim commented Dec 18, 2024

If a TCP device is physically disconnected after the connection has been established, the driver remains in a running state indefinitely and does not try to reconnect.

The function “AsyncManager::runWatchdog” is called cyclically and the “running_” Boolean stays true.
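
One common way to catch a link that dies silently while a "running" flag stays true is to track when data last arrived and treat a long silence as a disconnect. The following is a minimal sketch of that idea, not the driver's implementation: the class `StalenessWatchdog` and its methods are hypothetical names, and the timeout value is an assumption that would have to be chosen from the configured output rate.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper, not part of septentrio_gnss_driver.
// It only tracks time points; the caller decides how to reconnect.
class StalenessWatchdog
{
public:
    using Clock = std::chrono::steady_clock;

    StalenessWatchdog(Clock::duration timeout, Clock::time_point now) :
        timeout_(timeout), lastData_(now)
    {
        assert(timeout > Clock::duration::zero());
    }

    // Call from the read handler whenever bytes arrive.
    void notifyData(Clock::time_point now) { lastData_ = now; }

    // Call cyclically from the watchdog. True means no data has arrived
    // for longer than the timeout, even if no socket error was reported.
    bool isStale(Clock::time_point now) const
    {
        return (now - lastData_) > timeout_;
    }

private:
    Clock::duration timeout_;
    Clock::time_point lastData_;
};
```

A cyclic watchdog could call `isStale(StalenessWatchdog::Clock::now())` and trigger a reconnect when it returns true. Passing the time point in explicitly keeps the logic easy to unit test.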

@thomasemter
Contributor

Hi @gnssTim,

Which driver version are you using, and are you connected with real Ethernet or via USB?

@gnssTim
Author

gnssTim commented Dec 18, 2024

Hi Thomas,

I use the ROS 2 driver, version tag V1.4.1, over a real Ethernet network, launched with "ros2 septentrio_gnss_driver rover_node.launch.py" under Linux.

The GNSS device is "AsteRx SBi3 Pro+".

@thomasemter
Contributor

Hi Tim,

ok thanks. I will look into it. Actually, the driver should detect a disconnect and try to reconnect.

What happens when you reconnect physically?

@gnssTim
Author

gnssTim commented Dec 18, 2024

When I reconnect physically, nothing happens: no new data is published to DDS.
In Wireshark, I don't see any new data on the port.
The send-data status on the GNSS receiver side is no longer green.

@gnssTim
Author

gnssTim commented Dec 19, 2024

As additional information, I enabled DEBUG logging. The last message I see is "AsyncManager sync read error: Connection reset by peer".

More log:

[septentrio_gnss_driver_node-1] 1734612188.761326482: [septentrio_gnss_driver] [INFO] Connecting to tcp://192.168.122.113:2002...
[septentrio_gnss_driver_node-1] 1734612188.769316173: [septentrio_gnss_driver] [INFO] Connected to 192.168.122.113:2002.
[septentrio_gnss_driver_node-1] 1734612188.775450746: [septentrio_gnss_driver] [DEBUG] Configure Rx.
[septentrio_gnss_driver_node-1] 1734612188.776880567: [septentrio_gnss_driver] [DEBUG] Called configureRx() method
[septentrio_gnss_driver_node-1] 1734612188.777795410: [septentrio_gnss_driver] [DEBUG] AsyncManager sent the following 13 bytes to theSSSSSSSSSS
[septentrio_gnss_driver_node-1] 1734612287.646805859: [septentrio_gnss_driver] [DEBUG] AsyncManager sync read error: Connection reset by peer

@thomasemter
Contributor

thomasemter commented Dec 19, 2024

> As additional information, I enabled DEBUG logging. The last message I see is "AsyncManager sync read error: Connection reset by peer".
>
> More log:
>
> [septentrio_gnss_driver_node-1] 1734612188.761326482: [septentrio_gnss_driver] [INFO] Connecting to tcp://192.168.122.113:2002...
> [septentrio_gnss_driver_node-1] 1734612188.769316173: [septentrio_gnss_driver] [INFO] Connected to 192.168.122.113:2002.
> [septentrio_gnss_driver_node-1] 1734612188.775450746: [septentrio_gnss_driver] [DEBUG] Configure Rx.
> [septentrio_gnss_driver_node-1] 1734612188.776880567: [septentrio_gnss_driver] [DEBUG] Called configureRx() method
> [septentrio_gnss_driver_node-1] 1734612188.777795410: [septentrio_gnss_driver] [DEBUG] AsyncManager sent the following 13 bytes to theSSSSSSSSSS
> [septentrio_gnss_driver_node-1] 1734612287.646805859: [septentrio_gnss_driver] [DEBUG] AsyncManager sync read error: Connection reset by peer

This looks like it was disconnected before setting up the RX completely. This case is not handled explicitly. You should just restart the driver if this happens.

@thomasemter
Contributor

thomasemter commented Dec 19, 2024

> When I reconnect physically, nothing happens: no new data is published to DDS. In Wireshark, I don't see any new data on the port. The send-data status on the GNSS receiver side is no longer green.

I will look into it. But it might take some time.

@gnssTim
Author

gnssTim commented Dec 20, 2024

Hello Thomas,

OK, thanks for the explanation about the Rx configuration.
I have switched the GNSS from TCP (send only) to TCP2Way, and now the configuration succeeds.
Of course, this doesn't solve the problem. I tested with “configure_rx: false” to be sure, and the behavior is the same.

I'm able to detect the disconnection event in “async_manager.hpp” at line 630

node_->log(log_level::DEBUG, "AsyncManager string read fault, wrong number of bytes read: " + std::to_string(numBytes));

by adding this check:
if (ec.value() == boost::system::errc::connection_reset)
{
    // do reconnection
}

I suppose the reconnection needs to be done in class TcpIO?
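
A connection reset is only one of several error codes that can signal a lost peer, so the check above could be widened into a small classifier. The sketch below is an illustration under assumptions: `isConnectionLost` is a hypothetical helper, and it uses `std::error_code` from the standard library so that it compiles without Boost. The same comparisons can be written against `boost::system::error_code` and `boost::system::errc`. Which codes should actually trigger a reconnect in the driver is a judgment call not settled in this thread.

```cpp
#include <system_error>

// Hypothetical helper, not part of septentrio_gnss_driver.
// Returns true for error codes that typically mean the TCP peer is gone
// and the connection has to be re-established.
inline bool isConnectionLost(const std::error_code& ec)
{
    // Comparing against std::errc is category-aware, unlike comparing
    // the raw integer returned by ec.value().
    return ec == std::errc::connection_reset ||
           ec == std::errc::connection_aborted ||
           ec == std::errc::broken_pipe ||
           ec == std::errc::not_connected ||
           ec == std::errc::timed_out;
}
```

In a read handler, a true result would be the point at which to close the socket and schedule a reconnect. Transient conditions such as `operation_would_block` are deliberately not treated as a lost connection.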

@thomasemter
Contributor

thomasemter commented Dec 20, 2024

I am sorry but I was not able to reproduce your problem. I can unplug and replug the ethernet cable and the data stream is re-established immediately.

I can confirm your findings insofar as the disconnect is not recognized. This should be OK, since a disconnect on the physical connection (layer 1) is not immediately propagated to TCP (layer 4).
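
Because a layer 1 disconnect is not reported to TCP on its own, a standard way to bound the detection time is TCP keepalive. The following is a minimal sketch for Linux, offered as an illustration only: `enableTcpKeepalive` is a hypothetical helper, the timing values are assumptions, and the thread does not say whether the driver sets these options.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Hypothetical helper, not part of septentrio_gnss_driver.
// Enables keepalive probes on a connected TCP socket (Linux-specific
// TCP_KEEP* options). Returns true if every option was applied.
inline bool enableTcpKeepalive(int fd, int idleSec, int intervalSec, int probeCount)
{
    const int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) == 0 &&
           // Seconds of idle time before the first probe is sent.
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idleSec, sizeof(idleSec)) == 0 &&
           // Seconds between successive probes.
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intervalSec, sizeof(intervalSec)) == 0 &&
           // Unanswered probes before the connection is declared dead.
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probeCount, sizeof(probeCount)) == 0;
}
```

With values of 10, 5 and 3, an unreachable peer would be reported after roughly 10 + 5 × 3 = 25 seconds, at which point a pending read fails with a timeout error. With Boost.Asio, the descriptor to pass in is available from the socket's `native_handle()`.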

Your statement about TCP2Way made me wonder if you have some other setup issues. Could you please share your config.yaml?

@gnssTim
Author

gnssTim commented Dec 20, 2024

Thank you for doing the test on your side; it was very helpful. I tested on other hardware and it works, so it seems that the problem comes from the virtualization.

With virtualization, the connection is closed when the cable is disconnected (“RST” reset flag captured with Wireshark).

I haven't found a solution for the virtualization problem for the moment.

Thanks a lot Thomas.

@thomasemter
Contributor

Thanks for reporting back. Good to know it works on the other hardware.

I also did some more tests and after several minutes, the connection loss is detected. The reconnection attempts after that are not successful. I will investigate further. I think this will take some time but I hope this will also fix your problems with virtualization.
