[BUG]: Missing TP segments corrupt all following TP messages #737

Open
siggie0815 opened this issue Jul 10, 2024 · 2 comments

@siggie0815

vSomeip Version

v3.4.10

Boost Version

any

Environment

All

Describe the bug

While testing communication with our sensors, we occasionally saw failing E2E CRC checks. Once the problem occurs, it persists until communication is reset. The affected messages are notifications, segmented using TP and protected with E2E.

My understanding of the problem is the following:
If a single TP segment gets lost, the vsomeip TP reassembler cannot complete that message. So far so good.

However, the next message is then received, segment by segment, while the old message is still waiting to be completed. For the first few segments we may therefore get a duplicate segment error. As soon as the segment that was missing from the old message arrives, the message is regarded as complete and returned. Then the E2E check runs. Because the message has been reassembled from segments of two consecutive messages, the CRC check fails and the data is garbage.

From then on, every message is reassembled from mixed segments, without any duplicate segment error in the log. Hence the CRC fails for all following messages and the data is garbage.

e.g.:

  • a message consists of 6 segments (0...5)
  • receive segments 0, 2, 3, 4, 5 and lose segment 1 of the first message
  • receive segment 0 of the second message -> duplicate segment error
  • receive segment 1 of the second message -> message is complete and returned --> CRC error
  • segments 2, 3, 4, 5 of the second message are added to a new TP message
  • receive segments 0 and 1 of the third message -> added to the previous message, which is complete and returned --> CRC error
  • ...
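
The sequence above can be replayed with a small simulation. This is only a sketch, assuming a simplified reassembler that keys pending segments by index and drops a segment whose slot is already filled. It is not vsomeip's actual code, and `Segment`, `NaiveReassembler` and `replay_lost_segment` are hypothetical names made up for illustration.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// One TP segment. `message` is a tracing tag only (hypothetical, not a real
// TP header field); `index` is the segment's position within its message.
struct Segment {
    int message;
    std::size_t index;
};

// Deliberately naive reassembler: pending segments are keyed by index only.
class NaiveReassembler {
public:
    explicit NaiveReassembler(std::size_t segments_per_message)
        : expected_(segments_per_message) {}

    // Returns the source tag of every slot once all slots are filled,
    // otherwise an empty vector.
    std::vector<int> feed(const Segment& s) {
        if (!pending_.emplace(s.index, s.message).second) {
            ++duplicates_;  // slot already filled: duplicate segment, dropped
            return {};
        }
        if (pending_.size() < expected_) return {};
        std::vector<int> sources;
        for (const auto& slot : pending_) sources.push_back(slot.second);
        pending_.clear();
        return sources;
    }

    std::size_t duplicates() const { return duplicates_; }

private:
    std::size_t expected_;
    std::size_t duplicates_ = 0;
    std::map<std::size_t, int> pending_;  // segment index -> source message
};

// Sends `messages` messages of 6 segments each to `r` (which is assumed to be
// configured for 6 segments), losing segment 1 of message 1.
inline std::vector<std::vector<int>> replay_lost_segment(NaiveReassembler& r,
                                                         int messages) {
    std::vector<std::vector<int>> completed;
    for (int m = 1; m <= messages; ++m) {
        for (std::size_t i = 0; i < 6; ++i) {
            if (m == 1 && i == 1) continue;  // the lost segment
            std::vector<int> done = r.feed(Segment{m, i});
            if (!done.empty()) completed.push_back(done);
        }
    }
    return completed;
}
```

Replaying four messages through a `NaiveReassembler(6)` yields three "complete" messages, each built from segments of two different messages, while `duplicates()` stays at 1. That matches the single duplicate segment error followed by silent corruption.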

Reproduction Steps

It's hard to reproduce. A single TP segment has to be dropped from the communication somehow.

Expected behaviour

In my opinion a missing segment should not invalidate all upcoming traffic.

The problem could be resolved by various actions:

  1. Lower the message reassembly timeout to less than the message period, so that incomplete messages are deleted before the next message arrives. The timeout is currently hardcoded to 5 seconds and hence not very helpful.
  2. Force a segment with offset zero to start a new TP message, discarding any remaining incomplete message. Segments with other offsets cannot start a new TP message.

I think the first solution is the better one, as it places fewer constraints on the order in which segments arrive.
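
To illustrate the first option, here is a rough sketch of a reassembler that discards a partial message once it is older than a configurable timeout. It assumes the same simplified index-keyed model as my description above and is not taken from vsomeip. `TimeoutReassembler`, `Millis` and `output_after_loss` are hypothetical names, and the time is passed in explicitly so the behaviour can be checked without a real clock.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <vector>

using Millis = std::chrono::milliseconds;

// Index-keyed reassembler that gives up on a partial message after `timeout`.
class TimeoutReassembler {
public:
    TimeoutReassembler(std::size_t segments_per_message, Millis timeout)
        : expected_(segments_per_message), timeout_(timeout) {}

    // `message` is a tracing tag only. Returns the source tag of every slot
    // once the message is complete, otherwise an empty vector.
    std::vector<int> feed(std::size_t index, int message, Millis now) {
        if (!pending_.empty() && now - started_ > timeout_) {
            pending_.clear();  // stale partial message: discard it
        }
        if (pending_.empty()) started_ = now;  // first segment of a new message
        if (!pending_.emplace(index, message).second) return {};  // duplicate
        if (pending_.size() < expected_) return {};
        std::vector<int> sources;
        for (const auto& slot : pending_) sources.push_back(slot.second);
        pending_.clear();
        return sources;
    }

private:
    std::size_t expected_;
    Millis timeout_;
    Millis started_{0};
    std::map<std::size_t, int> pending_;  // segment index -> source message
};

// Message 1 arrives at t = 0 without segment 1; message 2 arrives complete at
// t = period. Returns the message completed while message 2 is being fed.
inline std::vector<int> output_after_loss(Millis timeout, Millis period) {
    TimeoutReassembler r(6, timeout);
    for (std::size_t i = 0; i < 6; ++i) {
        if (i != 1) r.feed(i, 1, Millis(0));
    }
    std::vector<int> out;
    for (std::size_t i = 0; i < 6; ++i) {
        std::vector<int> done = r.feed(i, 2, period);
        if (!done.empty()) out = done;
    }
    return out;
}
```

With a 100 ms message period, a 50 ms timeout delivers message 2 intact, whereas the current 5 s value still produces a message mixed from messages 1 and 2.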

Logs and Screenshots

No response

@siggie0815 siggie0815 added the bug label Jul 10, 2024
@siggie0815
Author

After reviewing the SOME/IP TP spec, I changed my mind: the spec is quite particular about when message reassembly should be interrupted and a new message should start. In other words, vsomeip does not comply with the spec in this regard.

https://www.autosar.org/fileadmin/standards/R20-11/CP/AUTOSAR_SWS_SOMEIPTransportProtocol.pdf

  • In section 7.3.1 it says that a message with offset 0 shall start a new disassembly session.
  • In section 7.3.3 it makes clear that the segments have to be received in order.
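
As a rough sketch of what these two rules could look like in a reassembler: a segment with offset 0 always starts a fresh session, and any gap or out-of-order segment aborts the current session until the next offset 0. This reflects my reading summarised in the two points above, not the wording of the spec or the eventual vsomeip fix. `InOrderReassembler` and `replay_in_order` are hypothetical names, and segment indices stand in for byte offsets.

```cpp
#include <cstddef>
#include <vector>

// Reassembler that only accepts segments in order, starting at index 0.
class InOrderReassembler {
public:
    explicit InOrderReassembler(std::size_t segments_per_message)
        : expected_(segments_per_message) {}

    // `message` is a tracing tag only. Returns the source tag of every
    // segment once the message is complete, otherwise an empty vector.
    std::vector<int> feed(std::size_t index, int message) {
        if (index == 0) {
            sources_.clear();  // offset 0: drop any partial message, restart
            active_ = true;
        } else if (!active_ || index != sources_.size()) {
            sources_.clear();  // gap or out-of-order segment: abort session
            active_ = false;
            return {};
        }
        sources_.push_back(message);
        if (sources_.size() < expected_) return {};
        std::vector<int> done;
        done.swap(sources_);   // hand out the message, leave sources_ empty
        active_ = false;
        return done;
    }

private:
    std::size_t expected_;
    bool active_ = false;
    std::vector<int> sources_;  // source tags of the segments accepted so far
};

// Sends `messages` messages of 6 segments each, losing segment 1 of message 1.
inline std::vector<std::vector<int>> replay_in_order(int messages) {
    InOrderReassembler r(6);
    std::vector<std::vector<int>> completed;
    for (int m = 1; m <= messages; ++m) {
        for (std::size_t i = 0; i < 6; ++i) {
            if (m == 1 && i == 1) continue;  // the lost segment
            std::vector<int> done = r.feed(i, m);
            if (!done.empty()) completed.push_back(done);
        }
    }
    return completed;
}
```

In this model only the message that lost a segment is dropped. Messages 2 and 3 are delivered intact, so the loss no longer propagates.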

I will try to fix the problem and file a pull request.

@duartenfonseca
Collaborator

Hi @siggie0815, could you test with PR #783 and see whether it fixes the issue? Thanks!
