Skip to content

Conversation

@MasterPtato
Copy link
Contributor

No description provided.

@railway-app
Copy link

railway-app bot commented Jan 16, 2026

🚅 Deployed to the rivet-pr-3928 environment in rivet-frontend

Service Status Web Updated (UTC)
website ✅ Success (View Logs) Web Jan 16, 2026 at 11:56 pm
frontend-inspector 😴 Sleeping (View Logs) Web Jan 16, 2026 at 11:14 am
frontend-cloud 😴 Sleeping (View Logs) Web Jan 16, 2026 at 10:54 am

@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3928 January 16, 2026 02:31 Destroyed
Copy link
Contributor Author

MasterPtato commented Jan 16, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Jan 16, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@3928

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@3928

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@3928

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@3928

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@3928

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@3928

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@3928

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@3928

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@3928

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@3928

commit: cbdc9b0

@claude
Copy link

claude bot commented Jan 16, 2026

PR Review: Add Actor and KV Metrics

This PR adds comprehensive metrics tracking for Rivet Actors and KV operations. The changes consolidate KV key management, add namespace-scoped metrics, and update Grafana dashboards.

🎯 Summary

Scope: Adding metrics infrastructure for actor lifecycle and KV operations
Files Changed: 59 files (excluding generated Grafana dashboards: ~15 code files)
Key Changes:

  • Consolidated actor KV key management from separate package into pegboard
  • Added namespace-scoped metrics for KV operations (read/write/storage)
  • Added actor lifecycle metrics (awake time, total actors, etc.)
  • Updated metric serialization to use little-endian consistently

✅ Strengths

  1. Good consolidation: Moving actor-kv code into pegboard reduces fragmentation
  2. Consistent serialization: Using little-endian (LE) bytes consistently across metrics
  3. Proper metric granularity: Metrics are scoped by namespace and actor name
  4. Billable chunk tracking: KV operations properly track billable chunks (4 KiB)

🐛 Critical Issues

1. Compilation Error in setup.rs ⚠️

Location: engine/packages/pegboard/src/workflows/actor/setup.rs

Lines 119, 127, and 236 reference undefined variable name:

// Line 119, 127
keys::ns::metric::inc(
    tx,
    input.namespace_id,
    keys::ns::metric::Metric::TotalActors(name.clone()),  // ❌ 'name' is not defined
    1,
);

Fix: Should use input.name.clone() instead of name.clone()

Impact: This will prevent the code from compiling.


2. Duplicate Metric Increment ⚠️

Location: engine/packages/pegboard/src/workflows/actor/setup.rs:116-129

The exact same metric increment appears twice:

// Update metrics
keys::ns::metric::inc(
    tx,
    input.namespace_id,
    keys::ns::metric::Metric::TotalActors(name.clone()),
    1,
);

// Update metrics  (duplicate\!)
keys::ns::metric::inc(
    tx,
    input.namespace_id,
    keys::ns::metric::Metric::TotalActors(name.clone()),
    1,
);

Impact: This will incorrectly double-count actor creation metrics.

Fix: Remove one of the duplicate blocks.


3. Broken Operation Implementation ⚠️

Location: engine/packages/pegboard/src/ops/actor/get_for_runner.rs

Multiple issues in this file:

a) Undefined variable actor_id (lines 30-32):

let name_key = keys::actor::NameKey::new(actor_id);  // ❌ should be input.actor_id
let runner_id_key = keys::actor::RunnerIdKey::new(actor_id);
let connectable_key = keys::actor::ConnectableKey::new(actor_id);

b) Return type mismatch (line 44):

return Ok(None);  // ❌ returns None but unpacks as tuple on line 25

The function signature indicates it returns Option<Output>, but line 25 tries to unpack the result as a tuple:

let (name, runner_id, is_connectable) = ctx.udb()?.run(...).await?;

c) Undefined type Actor (lines 44, 86):

Ok(Some(Actor { name, runner_id, is_connectable }))  // ❌ Should be 'Output'

d) Undefined variable actors_with_wf_ids (line 62):

actors_with_wf_ids.iter()  // ❌ This variable doesn't exist

e) Undefined variables in error log (line 78):

tracing::error\!(?actor_id, ?workflow_id, ?err, ...)  // ❌ workflow_id not defined

Impact: This entire operation is broken and will not compile.

Recommendation: This file appears to be incomplete or incorrectly merged. Review the original get_runner.rs implementation (which was deleted) and properly port the logic.


🔍 Code Quality Issues

4. Inconsistent Deserialization Type

Location: engine/packages/pegboard/src/keys/ns/metric.rs:85

fn deserialize(&self, raw: &[u8]) -> Result<Self::Value> {
    Ok(usize::from_le_bytes(raw.try_into()?))  // ❌ Returns usize but type is i64
}

The return type is i64 but deserialization uses usize. This should be:

Ok(i64::from_le_bytes(raw.try_into()?))

Impact: This could cause type conversion issues or silent truncation on 32-bit systems.


5. Potential Overflow in .try_into()

Locations: Multiple files use .try_into().unwrap_or_default() for metrics

total_size_chunked.try_into().unwrap_or_default()

Context: total_size_chunked is usize, converting to i64

Issue: On 64-bit systems, if total_size_chunked exceeds i64::MAX, this silently returns 0 instead of the actual value.

Recommendation: Either:

  • Use proper error handling: `.try_into().context("metric value overflow")?"
  • Or cast with saturation: .min(i64::MAX as usize) as i64

Impact: Metrics could silently report 0 for very large operations.


6. Missing Variable in backfill_udb_keys_and_metrics

Location: engine/packages/pegboard/src/workflows/actor/setup.rs:232-238

keys::ns::metric::inc(
    tx,
    input.namespace_id,  // ❌ 'namespace_id' not in input struct
    keys::ns::metric::Metric::TotalActors(name.clone()),  // ❌ 'name' undefined
    1,
);

The BackfillUdbKeysAndMetricsInput struct only has actor_id. Should use state.namespace_id and state.name.


📝 Documentation & Style

7. Comment Quality

Location: engine/packages/pegboard/src/keys/ns/metric.rs:6-26

The metric enum has inconsistent comment formatting:

/// Seconds (actor name)
ActorAwake(String),
/// Count (actor name)
TotalActors(String),
// Count (actor name)  // ❌ Missing '///'
Requests(String, String),

Recommendation: Use consistent doc comments (///) for all variants.


8. Magic Number

Location: engine/packages/pegboard/src/actor_kv/mod.rs:22

const VALUE_CHUNK_SIZE: usize = 10_000; // 10 KB, not KiB, see ...

Good: The comment explaining KB vs KiB is helpful.

Suggestion: Consider defining this constant in a central location if it's used across multiple packages.


🔒 Security & Performance

9. No Security Issues Identified

The KV operations properly:

  • Validate input sizes
  • Scope access by actor_id
  • Use transactions correctly

10. Performance Consideration

Location: engine/packages/pegboard/src/actor_kv/mod.rs:295-296

// TODO: Costly clone
let key = keys::actor_kv::KeyWrapper(keys.get(i).context("index should exist")?.clone());

Observation: The TODO is acknowledged. Consider if the clone can be avoided by restructuring the data flow.


✅ Test Coverage

Files Added:

  • engine/packages/pegboard/tests/kv_list_edge_cases.rs
  • engine/packages/pegboard/tests/kv_operations.rs

Status: Files exist but appear empty (0 additions/deletions).

Recommendation: Ensure these test files contain actual test cases for:

  • KV operations (get, put, delete, list)
  • Metric increments
  • Edge cases in list operations
  • Chunk handling for large values

📊 Overall Assessment

Category Rating Notes
Functionality 🔴 Critical Multiple compilation errors prevent merge
Code Quality 🟡 Moderate Some type inconsistencies and duplicates
Security 🟢 Good No issues identified
Performance 🟢 Good Appropriate use of chunking and transactions
Testing 🔴 Critical Test files appear empty
Documentation 🟡 Moderate Some inconsistent comments

🚀 Required Actions Before Merge

  1. Fix compilation errors in setup.rs:

    • Replace name.clone() with input.name.clone() (3 occurrences)
    • Remove duplicate metric increment (lines 123-129)
  2. Fix get_for_runner.rs:

    • Use input.actor_id instead of actor_id
    • Fix return type inconsistency
    • Replace Actor with Output
    • Remove or fix undefined variable references
    • OR: Consider if this file should be reverted/rewritten
  3. Fix type inconsistency in metric.rs:85:

    • Use i64::from_le_bytes instead of usize::from_le_bytes
  4. Fix backfill_udb_keys_and_metrics:

    • Use state.namespace_id and state.name.clone()
  5. Add test coverage:

    • Implement tests in the empty test files
    • Cover KV operations and metric tracking
  6. Consider improving overflow handling:

    • Replace .unwrap_or_default() with proper error handling or saturating cast

💡 Suggestions for Follow-up

  1. Metric aggregation: Consider adding a background job to aggregate metrics periodically
  2. Storage metrics: The KvStorageUsed metric is defined but may need periodic recalculation
  3. Actor awake time tracking: Ensure there's a mechanism to track and update ActorAwake hours
  4. Documentation: Add examples of how to query these metrics from Grafana

Conclusion: This PR cannot be merged in its current state due to compilation errors. The core concept and architecture are solid, but implementation issues need to be resolved first.

@MasterPtato MasterPtato force-pushed the 01-15-fix_day_1_patch_add_dashboards branch from ee27a16 to 192c800 Compare January 16, 2026 23:55
@MasterPtato MasterPtato force-pushed the 01-15-feat_add_actor_and_kv_metrics branch from 88935a1 to cbdc9b0 Compare January 16, 2026 23:55
@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3928 January 16, 2026 23:55 Destroyed
@MasterPtato MasterPtato force-pushed the 01-15-feat_add_actor_and_kv_metrics branch from cbdc9b0 to 67c4a86 Compare January 16, 2026 23:56
@railway-app railway-app bot temporarily deployed to rivet-frontend / rivet-pr-3928 January 16, 2026 23:56 Destroyed
@MasterPtato MasterPtato marked this pull request as ready for review January 16, 2026 23:56
@graphite-app
Copy link
Contributor

graphite-app bot commented Jan 16, 2026

Merge activity

  • Jan 16, 11:57 PM UTC: MasterPtato added this pull request to the Graphite merge queue.
  • Jan 16, 11:57 PM UTC: CI is running for this pull request on a draft pull request (#3950) due to your merge queue CI optimization settings.
  • Jan 16, 11:58 PM UTC: Merged by the Graphite merge queue via draft PR: #3950.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants