Skip to content

Conversation

@Nemo157
Copy link
Contributor

@Nemo157 Nemo157 commented Oct 29, 2025

In some situations the ipc channels from actors to the background task can get overloaded. Currently we don't have a complete plan for how to handle this. But the major source of messages is telemetry, which is also one we can treat less reliably, so by dividing this from the data messages we can deprioritize them.

Relates to #149

@Nemo157

This comment was marked as outdated.

@github-actions
Copy link

github-actions bot commented Oct 29, 2025

@codecov
Copy link

codecov bot commented Oct 29, 2025

Codecov Report

❌ Patch coverage is 5.40541% with 35 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
veecle-ipc/src/connector.rs 0.00% 30 Missing ⚠️
veecle-ipc/src/actors/output.rs 0.00% 5 Missing ⚠️

📢 Thoughts on this report? Let us know!

@Nemo157 Nemo157 force-pushed the wim/push-nxwvpqvwwzvz branch from ee68ea8 to c226a0b Compare November 3, 2025 11:21
In some situations the ipc channels from actors to the background task
can get overloaded. Currently we don't have a complete plan for how to
handle this. But the major source of messages is telemetry, which is
also one we can treat less reliably, so by dividing this from the data
messages we can deprioritize them.

Signed-off-by: Wim Looman <[email protected]>
@Nemo157 Nemo157 force-pushed the wim/push-nxwvpqvwwzvz branch from c226a0b to 6bd83d0 Compare November 10, 2025 09:41
@Nemo157 Nemo157 marked this pull request as ready for review November 10, 2025 12:41
@Nemo157 Nemo157 requested review from alexveecle and vE5li November 10, 2025 12:41
@claude
Copy link

claude bot commented Nov 10, 2025

Change Summary

This PR decouples IPC data channels by separating storable messages, telemetry messages, and control requests into distinct channels. The implementation introduces OutputTx and OutputRx structs that manage three separate MPSC channels with independent buffering. A prioritized recv() method uses tokio::select! with biased polling to drain control requests first, then storable messages, then telemetry. This allows telemetry (which can be chatty) to be dropped when overloaded without affecting critical storable or control messages.

Issues Found

🔴 Critical

Duplicate dependency in Cargo.toml
The veecle-telemetry dependency appears twice in dev-dependencies (lines 28 and 31 of Cargo.toml), both with identical feature sets. The second entry at line 31 should be removed as it's redundant.

🟡 Important

Comment inaccuracy regarding buffer size
Line 59 in connector.rs:59 states "Telemetry can be quite chatty, so give it a large buffer" but then creates a channel with capacity 128, which is identical to the storable channel. Either the buffer size should be increased to match the comment, or the comment should be corrected to reflect that all data channels have equal capacity.

Missing rationale for channel priorities
The prioritization order (control > storable > telemetry) in OutputRx::recv() is critical to system behavior but lacks explanation. The comment at line 37-38 mentions potential starvation of low-priority channels but doesn't explain why this specific ordering was chosen or what the operational implications are.

Copy link
Contributor

@vE5li vE5li left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense 👍

Copy link
Contributor

@alexveecle alexveecle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this looks good to me; however... does this need a changelog entry or some addition to the docs? Is this part of a broader initiative? I see no issue associated?

Also, wasn't there issues with telemetry becoming lost if some message was lost? Does this mean that telemetry might be more unreliable? (I think this is completely fine- but perhaps we should warn somewhere that telemetry is "best effort" or something.)

@Nemo157
Copy link
Contributor Author

Nemo157 commented Nov 11, 2025

Yes, if telemetry messages are lost that can cause issues with rendering them (currently the ui itself panics, it could be more resilient for some kinds of missing messages, but there is context-dependence that mean it still may not make sense afterwards).

I do think we need more improvements, this is just a way to try to avoid some situations that can cause panics currently.

Added an issue link.

@Nemo157 Nemo157 added this pull request to the merge queue Nov 11, 2025
Merged via the queue into main with commit 667a8fb Nov 11, 2025
42 of 44 checks passed
@Nemo157 Nemo157 deleted the wim/push-nxwvpqvwwzvz branch November 11, 2025 09:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants