Skip to content

Conversation

@MasterPtato
Copy link
Contributor

No description provided.

@railway-app
Copy link

railway-app bot commented Jan 16, 2026

🚅 Deployed to the rivet-pr-3948 environment in rivet-frontend

Service Status Web Updated (UTC)
frontend-inspector 😴 Sleeping (View Logs) Web Jan 17, 2026 at 12:04 am
website 😴 Sleeping (View Logs) Web Jan 17, 2026 at 12:03 am
frontend-cloud 😴 Sleeping (View Logs) Web Jan 17, 2026 at 12:02 am

@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3948 January 16, 2026 23:54 Destroyed
Copy link
Contributor Author

MasterPtato commented Jan 16, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link

claude bot commented Jan 16, 2026

PR Review: fix(gas, pb, api): make .removed events noop if dont match, add check for dead wfs to actor/runner errors, make api error types unknown

Summary

This PR makes three main improvements:

  1. Makes .removed() events no-ops when they do not match existing events (instead of erroring)
  2. Adds detection for "dead" workflows (incomplete with no wake condition) to return internal errors
  3. Simplifies API error types to use unknown instead of generating complex discriminated union types

Code Quality & Best Practices

Strengths

  1. Proper Error Handling Pattern: The new is_dead() check in db/mod.rs follows Rust best practices - clean, composable, and well-documented.

  2. Consistent Structured Logging: The code follows CLAUDE.md guidelines for structured logging with lowercase messages and proper context.

  3. Good Comment Quality: Comments are written as complete sentences as specified in CLAUDE.md guidelines.

  4. Well-Organized Functions: The resolve_actor_error and fetch_runner_pool_errors functions have clear single responsibilities.

Potential Issues

1. Silent Failure in compare_removed (gasoline/src/history/cursor.rs:1134-1139):

The code silently ignores mismatched removed events without advancing the cursor. While this is the intended behavior per the PR title, it could mask issues where workflow code has diverged from history.

Recommendation: Consider adding a metric or counter for ignored removed events to help detect potential workflow drift in production.

2. Missing Context in Dead Workflow Detection (pegboard/src/ops/actor/util.rs:49-50):

The InternalError does not preserve information about why the workflow is dead, making debugging difficult.

Recommendation: Add logging context before returning the error:

let error = if wf.is_dead() {
    tracing::warn!(?actor_id, ?workflow_id, "actor workflow is dead (no wake condition)");
    Some(rivet_types::actor::ActorError::InternalError)
}

Performance Considerations

  1. Efficient Caching: Good use of short TTL (500ms) for frequently changing error states in runner_config/get_error.rs. The fetch_all_json batches requests efficiently.

  2. Optimal Filtering: Using HashSet to deduplicate runners before fetching is good. The conversion chain in actor/util.rs:91-102 could be slightly simplified with HashSet::from_iter but the current approach is fine.

Security Concerns

No security issues identified. The changes:

  • Do not introduce injection vulnerabilities
  • Handle errors safely without exposing sensitive information
  • Use proper deserialization with type checking

Test Coverage

Missing Test Coverage: The PR does not include tests for the new behavior.

Recommended test additions:

  1. Test for .removed() with mismatched events (verify no-op behavior)
  2. Test for is_dead() workflow detection
  3. Test for fetch_runner_pool_errors with dead workflows

Additional Observations

  1. API Breaking Change: The OpenAPI changes show a large deletion count (2970 lines). The error types are now marked as unknown which simplifies the API but loses type safety on the client side. This is a trade-off between API complexity and type safety.

  2. Consistent Error Handling: The new InternalError variant is used consistently across both ActorError and RunnerPoolError types.

  3. Good Separation of Concerns: The resolve_actor_error function cleanly separates error resolution logic from actor building logic.

Verdict

Approve with minor recommendations

The code is well-structured and follows project conventions. The main improvements needed are:

  1. Add observability (metrics/logs) for ignored removed events
  2. Add logging context when dead workflows are detected
  3. Add test coverage for new behavior

The changes are functionally sound and address real-world edge cases in workflow execution.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Jan 16, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3948

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3948

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3948

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3948

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3948

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3948

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3948

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@3948

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3948

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3948

commit: ccf33ec

@MasterPtato MasterPtato changed the base branch from main to graphite-base/3948 January 16, 2026 23:57
@MasterPtato MasterPtato force-pushed the 01-16-fix_gas_pb_api_make_.removed_events_noop_if_dont_match_add_check_for_dead_wfs_to_actor_runner_errors_make_api_error_types_unknown branch from 3087c76 to a0b314c Compare January 16, 2026 23:57
@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3948 January 16, 2026 23:57 Destroyed
@MasterPtato MasterPtato changed the base branch from graphite-base/3948 to 01-15-feat_add_actor_and_kv_metrics January 16, 2026 23:57
@graphite-app graphite-app bot changed the base branch from 01-15-feat_add_actor_and_kv_metrics to graphite-base/3948 January 16, 2026 23:58
@graphite-app graphite-app bot force-pushed the 01-16-fix_gas_pb_api_make_.removed_events_noop_if_dont_match_add_check_for_dead_wfs_to_actor_runner_errors_make_api_error_types_unknown branch from a0b314c to 9520b6b Compare January 16, 2026 23:59
@graphite-app graphite-app bot force-pushed the graphite-base/3948 branch from 67c4a86 to 93d5dd9 Compare January 16, 2026 23:59
@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3948 January 16, 2026 23:59 Destroyed
@graphite-app graphite-app bot changed the base branch from graphite-base/3948 to main January 17, 2026 00:00
… for dead wfs to actor/runner errors, make api error types unknown
@graphite-app graphite-app bot force-pushed the 01-16-fix_gas_pb_api_make_.removed_events_noop_if_dont_match_add_check_for_dead_wfs_to_actor_runner_errors_make_api_error_types_unknown branch from 9520b6b to ccf33ec Compare January 17, 2026 00:00
@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3948 January 17, 2026 00:00 Destroyed
@graphite-app
Copy link
Contributor

graphite-app bot commented Jan 17, 2026

Merge activity

  • Jan 17, 12:07 AM UTC: MasterPtato added this pull request to the Graphite merge queue.
  • Jan 17, 12:08 AM UTC: CI is running for this pull request on a draft pull request (#3951) due to your merge queue CI optimization settings.
  • Jan 17, 12:09 AM UTC: Merged by the Graphite merge queue via draft PR: #3951.

graphite-app bot pushed a commit that referenced this pull request Jan 17, 2026
… for dead wfs to actor/runner errors, make api error types unknown (#3948)
@graphite-app graphite-app bot closed this Jan 17, 2026
@graphite-app graphite-app bot deleted the 01-16-fix_gas_pb_api_make_.removed_events_noop_if_dont_match_add_check_for_dead_wfs_to_actor_runner_errors_make_api_error_types_unknown branch January 17, 2026 00:09
@MasterPtato MasterPtato restored the 01-16-fix_gas_pb_api_make_.removed_events_noop_if_dont_match_add_check_for_dead_wfs_to_actor_runner_errors_make_api_error_types_unknown branch January 17, 2026 00:14
@MasterPtato MasterPtato deleted the 01-16-fix_gas_pb_api_make_.removed_events_noop_if_dont_match_add_check_for_dead_wfs_to_actor_runner_errors_make_api_error_types_unknown branch January 17, 2026 00:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants