SDK is not trying to reconnect if a publish was not successful #608

Open
jakob-sturm opened this issue Feb 4, 2025 · 5 comments
Labels
guidance Question that needs advice or information.

Comments

@jakob-sturm

Discussed in #607

Originally posted by jakob-sturm January 31, 2025
Hi,

I have the following setup:
MQTT: 3.1.1
KeepAlive: 20 minutes
QoS: 1

When my connection changes under the hood (for example, the modem switches carriers but is offline for only a few seconds) and I then publish a message, the message doesn't go through. That's fine, but it takes up to 20 minutes or longer (I guess because of the KeepAlive interval) until the SDK recognises that the connection was interrupted. Once it does, it is immediately able to resume the connection and deliver the message. Why does it take so long, and can I configure the SDK to detect interrupted connections earlier and reconnect faster?

@bretambrose
Contributor

A change like that can only reliably be detected via a timeout. The two timeouts available to you are the MQTT keep alive (which you've set to 20 minutes) and the TCP keep alive (off by default), which is more complicated to set: see https://github.com/awslabs/aws-crt-python/blob/main/awscrt/io.py#L216-L224, followed by https://github.com/awslabs/aws-crt-python/blob/main/awscrt/mqtt5.py#L1319, followed by https://github.com/aws/aws-iot-device-sdk-python-v2/blob/main/awsiot/mqtt5_client_builder.py#L10.
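
As a rough sketch of the TCP keep alive route, the helper below fills in the keep-alive fields that `awscrt.io.SocketOptions` documents (`keep_alive`, `keep_alive_timeout_secs`, `keep_alive_interval_secs`, `keep_alive_max_probes`). The helper names `enable_tcp_keep_alive` and `worst_case_detection_secs` and the chosen values are illustrative assumptions, not part of the SDK, and handing the options object on to the client builder is left out here. The detection estimate assumes the common OS behaviour of waiting for the idle time and then sending probes at the given interval before giving up on the peer.

```python
def enable_tcp_keep_alive(socket_options, idle_secs=60, interval_secs=10, max_probes=3):
    """Turn on TCP keep alive on an awscrt.io.SocketOptions-like object.

    Only the attribute names come from awscrt.io.SocketOptions; this helper
    and its default values are hypothetical.
    """
    socket_options.keep_alive = True
    # Idle time before the first keep-alive probe is sent.
    socket_options.keep_alive_timeout_secs = idle_secs
    # Gap between probes when the previous one was not acknowledged.
    socket_options.keep_alive_interval_secs = interval_secs
    # Number of unanswered probes before the connection counts as lost.
    socket_options.keep_alive_max_probes = max_probes
    return socket_options


def worst_case_detection_secs(idle_secs, interval_secs, max_probes):
    """Approximate time until a silent link is declared dead (an estimate)."""
    return idle_secs + interval_secs * max_probes
```

With the SDK installed this would be called as `enable_tcp_keep_alive(awscrt.io.SocketOptions())`. With the example values, a dead link would be noticed after roughly 90 seconds of silence instead of up to 20 minutes, at the cost of one small probe per idle period.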

@bretambrose added the guidance and closing-soon labels on Feb 4, 2025
@jakob-sturm
Author

Thanks for your response. In my case I don't want to set additional keep alives, because they consume a lot of data if sent frequently.

As a workaround, I would set a timeout (say 30 s) on the future returned by a publish. When the timeout is reached, I would actively disconnect and then try to connect the same connection again. Do you think that would work?
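
A minimal sketch of that workaround, assuming the publish call hands back a `concurrent.futures.Future` that completes once the broker acknowledges the message. `publish_or_reconnect`, `start_publish` and `reconnect` are hypothetical names: `start_publish` stands for whatever issues the publish and returns its future, and `reconnect` for the disconnect-then-connect sequence on the same connection object.

```python
from concurrent.futures import Future, TimeoutError as FutureTimeoutError
from typing import Callable


def publish_or_reconnect(
    start_publish: Callable[[], Future],
    reconnect: Callable[[], None],
    timeout_secs: float = 30.0,
) -> bool:
    """Wait for a publish to be acknowledged; force a reconnect on timeout.

    Returns True if the publish future completed within timeout_secs and
    False if the timeout expired and reconnect() was called instead. Any
    exception stored in the future is re-raised to the caller.
    """
    future = start_publish()
    try:
        future.result(timeout=timeout_secs)
        return True
    except FutureTimeoutError:
        # No acknowledgement in time: treat the link as dead and cycle it.
        reconnect()
        return False
```

A `False` result is best read as "not yet confirmed" rather than "lost", since what happens to the pending message after the reconnect depends on whether the session is resumed.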

@github-actions bot removed the closing-soon label on Feb 5, 2025
@jakob-sturm
Author

To help others, here is what I found out about how the SDK behaves. Disconnecting and reconnecting the existing connection object works for resuming a persistent session. However, when I create a new connection object (even though cleanSession is set to False and the clientId is the same), it does not resume the old connection and does not automatically resubscribe to the existing topics. In my opinion this contradicts the documentation of persistent sessions and is an issue in the SDK. Do you agree, @bretambrose?

@bretambrose
Contributor

No, I do not agree. Persistent sessions are a server-side construct. The client's role is to communicate the developer's desires to the server and, on CONNACK receipt, to report back whether or not a persistent session was joined. The SDK does that.

MQTT clients are under no obligation to re-subscribe to previously-subscribed-to topic filters should you fail to resume a session, and I would argue that automatic re-subscription, while seemingly a good idea, is actually not one.
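
One way to act on this in application code, sketched under assumptions: keep your own list of topic filters and re-subscribe only when the CONNACK reports that no session was joined. `resubscribe_if_needed` and the `subscribe` callable are hypothetical names, and `session_present` stands for the flag the client reports from the CONNACK.

```python
from typing import Callable, Iterable, List, Tuple


def resubscribe_if_needed(
    session_present: bool,
    subscriptions: Iterable[Tuple[str, int]],
    subscribe: Callable[[str, int], None],
) -> List[str]:
    """Re-issue subscriptions only when the broker did not resume a session.

    subscriptions holds (topic_filter, qos) pairs tracked by the application.
    Returns the topic filters that were subscribed again.
    """
    if session_present:
        # The broker still holds the session, including its subscriptions.
        return []
    restored = []
    for topic_filter, qos in subscriptions:
        subscribe(topic_filter, qos)
        restored.append(topic_filter)
    return restored
```

This keeps the decision with the application, which knows which subscriptions it still wants, instead of having the client guess.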

@jakob-sturm
Author

I understand, but can you explain why the server (AWS IoT Core) does not let me resume the connection when I create a new connection object, while it does let me resume it when I use the old connection object? Can you reproduce that?
