fix: hang on rapid start/stop threads#323
Conversation
The screensharing window could hang the main event loop when it was closed and then opened again. This could happen because the close could happen while the redraw_thread was in request_redraw, which needs the main event to be free in order to continue and at the same time a new open would wait from the main event loop for the previous redraw thread to join. This change removes all the joins from the codebase as they don't really needed for protecting us from zombie threads.
✅ Deploy Preview for hoppdocs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
📝 WalkthroughWalkthroughThis PR refactors thread lifecycle management across the entire core application: removing explicit ChangesThread Lifecycle Refactoring
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/window/screensharing_window.rs (1)
804-840:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftRebuild participant state when reusing the screensharing window.
participantsis only used to recomputesharer_name;participants_managerkeeps the previous session's cursors, paths, and drawing modes. Reopening the same window for a new share can therefore render stale overlays until separate connect/disconnect events happen, which reconnect flows do not guarantee.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/window/screensharing_window.rs` around lines 804 - 840, The participants slice is currently only used to set sharer_name, leaving self.participants_manager from the prior session intact and causing stale cursors/paths/drawing modes; inside update_window_with_new_sharer you should reset/rebuild participant state in participants_manager when reusing the window: call the manager's reset/clear method (or implement one) and then repopulate it from the provided participants (creating fresh participant entries, cursors, path lists, and drawing-mode defaults), while keeping the existing resets to self.state (sharer_name, current_path_id, remote_control_allowed, etc.), and ensuring any references used by the redraw thread are updated before spawn_redraw_thread is called.
🧹 Nitpick comments (1)
core/src/input/mouse_windows.rs (1)
164-164: 💤 Low valuePre-existing race on
EVENT_CALLBACKis slightly widened by detached threads.With the hook thread now detached, if
MouseObserveris dropped and immediately recreated, there's a small window where the old thread'sEVENT_CALLBACK = None(line 231) could clear the new thread's callback. The existing code comment (lines 29-33) already acknowledges this risk. Since the hang fix is the higher priority and the race window is narrow, this is acceptable for now but worth noting for future hardening with proper synchronization.Also applies to: 230-232
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/input/mouse_windows.rs` at line 164, Detached hook thread can outlive a recreated MouseObserver and let the old thread clear the global EVENT_CALLBACK; to fix, stop detaching the thread spawned in the spawn closure and instead store its JoinHandle on the MouseObserver struct (keep reference created by std::thread::spawn) and join or signal and join that handle in MouseObserver::drop so the old thread cannot run after EVENT_CALLBACK has been replaced; alternatively, replace the global EVENT_CALLBACK with a synchronized shared handle (e.g., Arc<Mutex<Option<...>>> or an atomic token) and ensure the spawned closure checks that token before clearing the callback so the race is eliminated.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@core/src/lib.rs`:
- Around line 515-519: The close_camera_window helper should hide the native
window before detaching its redraw worker: when you take the CameraWindow from
self.camera_window in close_camera_window, call its hide API (e.g.,
CameraWindow::hide or a method that hides its inner Window/sets visibility)
while you still own cam, then call cam.stop_redraw_thread(); if CameraWindow
lacks a hide method, add one that hides the inner Window/sets visibility so the
native window is not visible for an extra frame before the redraw thread exits.
In `@core/src/window/screensharing_window.rs`:
- Around line 2005-2010: The Stop message can be starved by queued ForceRedraws
on the same FIFO channel; change the shutdown path to use a dedicated,
low-latency signal instead of relying on redraw_tx. Add a new shutdown sender
(e.g. shutdown_tx) that the redraw worker also selects on, have
stop_redraw_thread send the shutdown signal over shutdown_tx (or close it) and
then take redraw_thread; update the worker loop to listen for shutdown_tx
alongside redraw_rx so Stop cannot be blocked by ForceRedraw messages. Ensure
you update the struct to hold shutdown_tx/shutdown_rx and modify the redraw loop
and stop_redraw_thread accordingly (references: stop_redraw_thread, redraw_tx,
RedrawCommand::Stop, redraw_thread, ForceRedraw).
In `@core/tests/src/hang_repro.rs`:
- Line 15: The loop's drain branch never executes because MUTE_UNMUTE_CYCLES is
5 while the condition checks if i % 10 == 0, so adjust either the constant or
the condition so the drain runs: update MUTE_UNMUTE_CYCLES to 10 or change the
check in the hot loop (the if i % 10 == 0 that is intended to drain video_rx) to
use a value derived from MUTE_UNMUTE_CYCLES (e.g., if i % (MUTE_UNMUTE_CYCLES *
2) == 0) so video_rx is actually drained during the loop in hang_repro.rs.
- Around line 102-116: The variable use_av1 is only defined for macOS and
Windows but later used unconditionally by max_bitrate and video_codec; make
use_av1 defined for all targets (e.g., let use_av1 = cfg!(target_os =
"windows"); or evaluate cfg!(target_os = "macos")/cfg!(target_os = "windows") as
needed) so that max_bitrate (H264_BITRATE_DEFAULT) and video_codec
(VideoCodec::AV1 / VideoCodec::H264) compile on non-macOS/non-Windows platforms;
update the declaration of use_av1 near where max_bitrate and video_codec are
computed so the identifier is always in scope.
---
Outside diff comments:
In `@core/src/window/screensharing_window.rs`:
- Around line 804-840: The participants slice is currently only used to set
sharer_name, leaving self.participants_manager from the prior session intact and
causing stale cursors/paths/drawing modes; inside update_window_with_new_sharer
you should reset/rebuild participant state in participants_manager when reusing
the window: call the manager's reset/clear method (or implement one) and then
repopulate it from the provided participants (creating fresh participant
entries, cursors, path lists, and drawing-mode defaults), while keeping the
existing resets to self.state (sharer_name, current_path_id,
remote_control_allowed, etc.), and ensuring any references used by the redraw
thread are updated before spawn_redraw_thread is called.
---
Nitpick comments:
In `@core/src/input/mouse_windows.rs`:
- Line 164: Detached hook thread can outlive a recreated MouseObserver and let
the old thread clear the global EVENT_CALLBACK; to fix, stop detaching the
thread spawned in the spawn closure and instead store its JoinHandle on the
MouseObserver struct (keep reference created by std::thread::spawn) and join or
signal and join that handle in MouseObserver::drop so the old thread cannot run
after EVENT_CALLBACK has been replaced; alternatively, replace the global
EVENT_CALLBACK with a synchronized shared handle (e.g., Arc<Mutex<Option<...>>>
or an atomic token) and ensure the spawned closure checks that token before
clearing the callback so the race is eliminated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 78c78195-fbd0-452c-8cdc-9c4ea3212ef4
📒 Files selected for processing (15)
core/socket_lib/src/lib.rscore/src/audio/capturer.rscore/src/camera/stream_nokhwa.rscore/src/capture/capturer.rscore/src/capture/stream.rscore/src/graphics/graphics_context.rscore/src/input/mouse_macos.rscore/src/input/mouse_windows.rscore/src/lib.rscore/src/window/camera_window.rscore/src/window/drawing_window.rscore/src/window/screensharing_window.rscore/tests/src/hang_repro.rscore/tests/src/main.rstauri/src-tauri/tauri.conf.json
Summary by CodeRabbit
Bug Fixes
Chores