While testing communication with our sensors, we occasionally saw E2E CRC checks fail. Once the problem occurs, it persists until communication is reset. The messages are notifications, segmented using TP and protected with E2E.
My understanding of the problem is the following:
If a single TP segment gets lost, the vsomeip TP reassembler cannot finish this message. So far so good.
However, the next message is then received, segment by segment. The old message is still there, waiting to be completed, so for the first few segments we might get a duplicate segment error. As soon as the segment missing from the old message is received, the message is regarded as complete and returned. Then the E2E check is processed. Because the message was reassembled from segments of two consecutive messages, the CRC check fails and we have garbage data.
From then on, all messages are reassembled from mixed segments, without a duplicate segment error in the log. Hence the CRC fails for every message and the data is garbage.
e.g.:
message consists of 6 segments (0...5)
receive segments 0,2,3,4,5 and lose segment 1 of the first message
receive segment 0 of second message -> duplicate segment error
receive segment 1 of second message -> message is complete and returned --> CRC error
segments 2,3,4,5 of second message -> added to a new TP message
receive segments 0 and 1 of third message -> added to previous message, complete and returned --> CRC error
...
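To make the sequence above easier to follow, here is a small simulation. It is only a simplified model of the behaviour I described, not vsomeip code: the class `NaiveReassembler`, the helper `simulate` and the fixed segment count are made up for illustration, and a message counts as complete once every segment index has been seen.

```python
SEGMENTS_PER_MESSAGE = 6  # as in the example: segments 0...5


class NaiveReassembler:
    """Hypothetical model: keeps an incomplete message forever (no timeout)."""

    def __init__(self, segments_per_message):
        self.segments_per_message = segments_per_message
        self.pending = {}    # segment index -> number of the message it came from
        self.duplicates = 0  # count of "duplicate segment" errors

    def feed(self, message_no, index):
        """Store one segment; return the reassembled message once complete."""
        if index in self.pending:
            self.duplicates += 1  # slot already filled by an older message
            return None
        self.pending[index] = message_no
        if len(self.pending) < self.segments_per_message:
            return None
        completed = [self.pending[i] for i in range(self.segments_per_message)]
        self.pending = {}
        return completed


def simulate(reassembler, messages, lost):
    """Send `messages` messages in order, dropping the (message, index) pairs in `lost`."""
    delivered = []
    for message_no in range(messages):
        for index in range(reassembler.segments_per_message):
            if (message_no, index) in lost:
                continue
            result = reassembler.feed(message_no, index)
            if result is not None:
                delivered.append(result)
    return delivered


naive = NaiveReassembler(SEGMENTS_PER_MESSAGE)
delivered = simulate(naive, messages=5, lost={(0, 1)})

# A delivered message is "mixed" if its segments come from more than one message.
mixed = [m for m in delivered if len(set(m)) > 1]
print(f"delivered={len(delivered)} mixed={len(mixed)} duplicates={naive.duplicates}")
```

In this model a single lost segment is enough: every message delivered afterwards is built from two consecutive messages, and only one duplicate segment error is ever counted.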
Reproduction Steps
It's hard to reproduce reliably. One TP segment has to be dropped from the communication somehow.
Expected behaviour
In my opinion, a missing segment should not invalidate all subsequent traffic.
The problem could be resolved by various actions:
Lower the message reassembly timeout to less than the message period, so that incomplete messages are deleted before the next message arrives. The timeout is currently hardcoded to 5 seconds and hence not very helpful.
Force the start of a new TP message on a segment with offset zero, discarding any remaining incomplete message. Other segments cannot start a new TP message.
I think the first solution is the better one, as it has fewer implications for the order in which the segments arrive.
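For comparison, a rough sketch of the second option (restart on offset zero). Again this is a hypothetical model and not a patch for vsomeip: `RestartOnFirstSegment` and `simulate` are invented names, segments are identified by index instead of byte offset, and in-order arrival is assumed.

```python
SEGMENTS_PER_MESSAGE = 6  # segments 0...5, as in the example above


class RestartOnFirstSegment:
    """Hypothetical model: the first segment always starts a fresh reassembly."""

    def __init__(self, segments_per_message):
        self.segments_per_message = segments_per_message
        self.pending = None  # None means no reassembly is in progress

    def feed(self, message_no, index):
        """Store one segment; return the reassembled message once complete."""
        if index == 0:
            # Offset zero: discard whatever was incomplete and start over.
            self.pending = {0: message_no}
        elif self.pending is None or index in self.pending:
            # Other segments cannot start a message; duplicates are dropped.
            return None
        else:
            self.pending[index] = message_no
        if len(self.pending) < self.segments_per_message:
            return None
        completed = [self.pending[i] for i in range(self.segments_per_message)]
        self.pending = None
        return completed


def simulate(reassembler, messages, lost):
    """Send `messages` messages in order, dropping the (message, index) pairs in `lost`."""
    delivered = []
    for message_no in range(messages):
        for index in range(reassembler.segments_per_message):
            if (message_no, index) in lost:
                continue
            result = reassembler.feed(message_no, index)
            if result is not None:
                delivered.append(result)
    return delivered


# Same loss as in the example: segment 1 of the first message.
after_lost_middle = simulate(
    RestartOnFirstSegment(SEGMENTS_PER_MESSAGE), messages=5, lost={(0, 1)}
)
# Losing a first segment instead: the rest of that message is ignored.
after_lost_first = simulate(
    RestartOnFirstSegment(SEGMENTS_PER_MESSAGE), messages=5, lost={(1, 0)}
)
print(after_lost_middle)
print(after_lost_first)
```

In both runs only the message that actually lost a segment is dropped, and every delivered message consists of segments from a single message. The price is the ordering dependency mentioned above: if segments can be reordered so that offset zero arrives late, this rule throws away segments that belonged to a valid message.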
Logs and Screenshots
No response
After rereading the SOME/IP TP spec, I changed my mind: the spec is quite particular about when message reassembly should be interrupted and a new message should start. In other words, vsomeip does not follow the spec in this regard.
vSomeip Version
v3.4.10
Boost Version
any
Environment
All