flasher: make unknown exceptions retriable#746
Conversation
✅ Deploy Preview for jumpstarter-docs ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
WalkthroughCentralized exception classification was added to the flasher client via Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller as Caller
participant Client as FlasherClient
participant Cat as _categorize_exception
participant Find as _find_exception_in_chain
participant Retry as Retry Loop
Caller->>Client: flash(request)
Client->>Client: perform operation
alt operation raises exception
Client->>Cat: _categorize_exception(exc)
Cat->>Find: inspect causes / ExceptionGroups
Find-->>Cat: matching exception found? / None
alt Non-retryable identified
Cat-->>Client: FlashNonRetryableError
Client->>Caller: raise FlashError (non-retryable)
else Retryable or wrapped as retryable
Cat-->>Client: FlashRetryableError
Client->>Retry: attempt < max?
alt retry allowed
Retry->>Retry: sleep 2**attempt
Retry->>Client: retry operation
else retries exhausted
Client->>Caller: raise FlashError (with categorized_error)
end
end
else success
Client->>Caller: return success
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
820113f to
6bced4a
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py (2)
209-243: Categorization behavior matches priority rules; consider tightening type hintsThe categorization logic (non‑retryable > retryable > unknown wrapped as retryable with preserved
__cause__) is sound and aligns with the PR’s intent. As a minor refinement, you could make_find_exception_in_chaingeneric so_categorize_exception’s declared return type (FlashRetryableError | FlashNonRetryableError) aligns with what the helper returns, e.g. using aTypeVarbound toBaseExceptionandtype[TExc]instead of baretype.
245-281: Exception-chain traversal covers ExceptionGroups and causes; same optional typing tweak appliesThe recursive walk over
.exceptionsand__cause__handles both nested ExceptionGroups and standard cause chains well and is backed by tests. If you adopt the genericTypeVarapproach mentioned for_categorize_exception, this helper’s signature can return a precisetarget_typeinstance instead ofException | None, improving static checking without changing behavior.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.vscode/settings.json(1 hunks)packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py(2 hunks)packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client_test.py(3 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: mangelajo
Repo: jumpstarter-dev/jumpstarter PR: 610
File: packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py:488-491
Timestamp: 2025-09-15T08:18:48.571Z
Learning: In the jumpstarter project, code review suggestions should stay focused on the specific scope of the PR. Suggestions about general improvements like timeout handling or error handling that are unrelated to the core changes being made should be avoided, even if they apply to modified code lines.
📚 Learning: 2025-11-05T13:45:58.271Z
Learnt from: mangelajo
Repo: jumpstarter-dev/jumpstarter PR: 735
File: packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py:15-15
Timestamp: 2025-11-05T13:45:58.271Z
Learning: In packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py, pexpect is intentionally used as a transitive dependency through the jumpstarter-driver-pyserial package. The flashers package does not declare pexpect as a direct dependency because the pyserial driver package is intended to control the pexpect version.
Applied to files:
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client_test.pypackages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py
📚 Learning: 2025-11-05T13:31:39.304Z
Learnt from: mangelajo
Repo: jumpstarter-dev/jumpstarter PR: 735
File: packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py:100-101
Timestamp: 2025-11-05T13:31:39.304Z
Learning: In packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py, the `flash()` method and its CLI command intentionally have different defaults for `fls_version`: the method defaults to `""` (empty string) while the CLI defaults to `"0.1.5"`. This is by design to provide CLI users with a convenient default while keeping the programmatic API ready for when FLS is included in the flasher images.
Applied to files:
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py
🧬 Code graph analysis (1)
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client_test.py (1)
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py (4)
FlashNonRetryableError(37-38)FlashRetryableError(33-34)_categorize_exception(209-243)_find_exception_in_chain(245-281)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: Redirect rules - jumpstarter-docs
- GitHub Check: Header rules - jumpstarter-docs
- GitHub Check: Pages changed - jumpstarter-docs
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (macos-15, 3.12)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: e2e
🔇 Additional comments (5)
.vscode/settings.json (1)
15-18: Editor-only settings change looks goodThe added
makefile.configureOnOpenandcursorpyright.analysis.extraPathsentries are valid JSON and scoped to tooling only; no runtime impact.packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py (1)
180-201: Retry loop now correctly treats categorized errors and unknowns as retriableThe new retry loop cleanly separates non‑retryable (fail fast) from retryable errors and applies exponential backoff, while unknown exceptions are funneled through
_categorize_exceptionso they participate in the retry logic as intended.packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client_test.py (3)
4-22: MockFlasherClient logger shape matches production usageExtending
MockLoggerto accept*args, **kwargsand addingexceptionkeeps tests decoupled from logging details while supporting the newlogger.exception(...)call sites.
61-125: Categorization unit tests tightly cover primary behaviorsThe tests for direct non‑retryable/retryable errors, unknown exception wrapping (including cause preservation), and non‑retryable precedence over retryable in a cause chain give good confidence that
_categorize_exceptionbehaves as designed.
127-201: Chain and ExceptionGroup tests validate robustness of traversal logicThe
_find_exception_in_chaintests over simple cause chains, mock ExceptionGroups, nested groups, and aliasing (IOError/OSError) are thorough and ensure the new traversal logic is resilient to realistic exception structures.
de05a2e to
3b07bb8
Compare
|
tested manually and seems to be working well |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client_test.py (1)
63-200: Excellent test coverage for exception categorization logic.The test suite thoroughly exercises all paths through
_categorize_exceptionand_find_exception_in_chain, including edge cases like nested ExceptionGroups and cause chain traversal.Consider extracting the
MockExceptionGroupclass to module level to avoid duplication across test functions (Lines 158-161 and 176-179):# Near top of file, after imports class MockExceptionGroup(Exception): def __init__(self, message, exceptions): super().__init__(message) self.exceptions = exceptionsThen use it directly in both test functions without redefining it.
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py (1)
210-248: Consider checking CancelledError in the entire exception chain.The method correctly categorizes exceptions with proper priority handling. However, the
CancelledErrorcheck at Line 238 only examines the top-level exception. If aCancelledErroris wrapped (e.g., as a__cause__), it will be missed and treated as retryable instead of non-retryable.For consistency with how
FlashRetryableErrorandFlashNonRetryableErrorare detected, consider searching the cause chain forCancelledError:# Second pass: look for retryable errors retryable = self._find_exception_in_chain(exception, FlashRetryableError) if retryable is not None: return retryable - # CancelledError is a special case that should be treated as non-retryable - if isinstance(exception, CancelledError): - return FlashNonRetryableError("Operation cancelled") + # CancelledError is a special case that should be treated as non-retryable + cancelled = self._find_exception_in_chain(exception, CancelledError) + if cancelled is not None: + return FlashNonRetryableError("Operation cancelled")Additionally, update the docstring to document the
CancelledErrorspecial case:Priority: 1. FlashNonRetryableError - highest priority, fail immediately 2. FlashRetryableError - retry with backoff - 3. Unknown exceptions - log full stack trace and treat as retryable + 3. CancelledError - treat as non-retryable + 4. Unknown exceptions - log full stack trace and treat as retryable
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py(4 hunks)packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client_test.py(3 hunks)
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: mangelajo
Repo: jumpstarter-dev/jumpstarter PR: 610
File: packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py:488-491
Timestamp: 2025-09-15T08:18:48.571Z
Learning: In the jumpstarter project, code review suggestions should stay focused on the specific scope of the PR. Suggestions about general improvements like timeout handling or error handling that are unrelated to the core changes being made should be avoided, even if they apply to modified code lines.
📚 Learning: 2025-11-05T13:45:58.271Z
Learnt from: mangelajo
Repo: jumpstarter-dev/jumpstarter PR: 735
File: packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py:15-15
Timestamp: 2025-11-05T13:45:58.271Z
Learning: In packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py, pexpect is intentionally used as a transitive dependency through the jumpstarter-driver-pyserial package. The flashers package does not declare pexpect as a direct dependency because the pyserial driver package is intended to control the pexpect version.
Applied to files:
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client_test.pypackages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py
📚 Learning: 2025-11-05T13:31:39.304Z
Learnt from: mangelajo
Repo: jumpstarter-dev/jumpstarter PR: 735
File: packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py:100-101
Timestamp: 2025-11-05T13:31:39.304Z
Learning: In packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py, the `flash()` method and its CLI command intentionally have different defaults for `fls_version`: the method defaults to `""` (empty string) while the CLI defaults to `"0.1.5"`. This is by design to provide CLI users with a convenient default while keeping the programmatic API ready for when FLS is included in the flasher images.
Applied to files:
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py
🧬 Code graph analysis (1)
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client_test.py (1)
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py (4)
FlashNonRetryableError(38-39)FlashRetryableError(34-35)_categorize_exception(210-248)_find_exception_in_chain(250-286)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: Redirect rules - jumpstarter-docs
- GitHub Check: Header rules - jumpstarter-docs
- GitHub Check: Pages changed - jumpstarter-docs
- GitHub Check: pytest-matrix (macos-15, 3.13)
- GitHub Check: pytest-matrix (macos-15, 3.11)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.13)
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.12)
- GitHub Check: e2e
- GitHub Check: pytest-matrix (ubuntu-24.04, 3.11)
🔇 Additional comments (6)
packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client_test.py (2)
1-2: LGTM!The import of
CancelledErroris necessary for testing the special handling of cancelled operations.
16-25: LGTM!Updating the MockLogger to accept
*argsand**kwargscorrectly aligns with real logger interfaces and supports the newexception()method used in the categorization logic.packages/jumpstarter-driver-flashers/jumpstarter_driver_flashers/client.py (4)
8-8: LGTM!The
CancelledErrorimport is correctly placed and necessary for treating cancelled operations as non-retryable.
181-202: LGTM!The updated retry logic correctly:
- Categorizes all exceptions through the centralized
_categorize_exceptionmethod- Fails immediately for non-retryable errors with proper logging
- Applies exponential backoff (2^attempt seconds) for retryable errors
- Preserves exception context with
from efor debuggingThe approach of treating unknown exceptions as retryable (by wrapping) is sound and aligns with the PR objectives.
250-286: LGTM!The
_find_exception_in_chainmethod correctly implements recursive searching through:
- ExceptionGroup members
- The current exception
- The cause chain (
__cause__)- Nested ExceptionGroups within the cause chain
The implementation handles all edge cases for exception chain traversal.
1166-1166: LGTM!The version bump to
0.1.9for the FLS default is appropriate. The TODO comment provides useful context about the temporary nature of this default.
| return result | ||
| current = getattr(current, '__cause__', None) | ||
| return None | ||
| def _find_exception_in_chain(self, exception: Exception, target_type: type) -> Exception | None: |
There was a problem hiding this comment.
We have the leaf_exceptions helper for flattening exception groups.
There was a problem hiding this comment.
oh wait I missed this... ok I will look into reusing the other implementation, hopefully they are compatible enough.
|
Successfully created backport PR for |
Fixes-Issue: jumpstarter-driver-flasher some flashing errors don't show correctly and won't retry #745
Summary by CodeRabbit
Bug Fixes
Tests
Chores