Add MD5 Verification for Solana Validator Snapshot Downloads #4630

Open
YukiCoco opened this issue Jan 25, 2025 · 1 comment

@YukiCoco

Problem

Currently, the validator crashes when processing corrupted snapshots. Could we add MD5 checksum verification when fetching snapshots from other nodes?

ress=62i time_io_ms=5560i time_io_weighted_ms=471770i discards_completed=0i discards_merged=0i sectors_discarded=0i time_discarding=0i flushes_completed=1i time_flushing=8i num_disks=2i
[2025-01-25T09:05:46.297537480Z INFO  solana_accounts_db::shared_buffer_reader] reading entire decompressed file took: 125446008 us, bytes: 427031920640, read_us: 120118310, waiting_for_buffer_us: 4938928, largest fetch: 11403264, error: Err(Custom { kind: Other, error: "Data corruption detected" })
thread 'solUnpkSnpsht01' panicked at runtime/src/snapshot_utils.rs:1587:14:
called `Result::unwrap()` on an `Err` value: Io(Custom { kind: Other, error: "Data corruption detected" })
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread 'solUnpkSnpsht02' panicked at runtime/src/snapshot_utils.rs:[2025-01-25T09:05:46.335025189Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="solUnpkSnpsht01" one=1i message="panicked at runtime/src/snapshot_utils.rs:1587:14:
    called `Result::unwrap()` on an `Err` value: Io(Custom { kind: Other, error: \"Data corruption detected\" })" location="runtime/src/snapshot_utils.rs:1587:14" version="2.1.7 (src:3940aa0d; feat:1793238286, client:JitoLabs)"
1587:14:
called `Result::unwrap()` on an `Err` value: Io(Kind(TimedOut))
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread 'solUnpkSnpsht00' panicked at runtime/src/snapshot_utils.rs[2025-01-25T09:05:46.335359389Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="solUnpkSnpsht02" one=1i message="panicked at runtime/src/snapshot_utils.rs:1587:14:
    called `Result::unwrap()` on an `Err` value: Io(Kind(TimedOut))" location="runtime/src/snapshot_utils.rs:1587:14" version="2.1.7 (src:3940aa0d; feat:1793238286, client:JitoLabs)"
:1587:14:
called `Result::unwrap()` on an `Err` value: Io(Kind(TimedOut))
[2025-01-25T09:05:46.339242450Z INFO  solana_runtime::snapshot_utils::snapshot_storage_rebuilder] rebuilt storages for 429554/429754 slots with 0 collisions
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
thread 'solUnpkSnpsht03' panicked at [2025-01-25T09:05:46.341068062Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="solUnpkSnpsht00" one=1i message="panicked at runtime/src/snapshot_utils.rs:1587:14:
    called `Result::unwrap()` on an `Err` value: Io(Kind(TimedOut))" location="runtime/src/snapshot_utils.rs:1587:14" version="2.1.7 (src:3940aa0d; feat:1793238286, client:JitoLabs)"
runtime/src/snapshot_utils.rs:1587:14:
called `Result::unwrap()` on an `Err` value: Io(Custom { kind: TimedOut, error: TarError { desc: "failed to unpack `accounts/315815381.14527051` into `/mnt/accounts/run/315815381.14527051`", io: Kind(TimedOut) } })
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
[2025-01-25T09:05:46.344531055Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="solUnpkSnpsht03" one=1i message="panicked at runtime/src/snapshot_utils.rs:1587:14:
    called `Result::unwrap()` on an `Err` value: Io(Custom { kind: TimedOut, error: TarError { desc: \"failed to unpack `accounts/315815381.14527051` into `/mnt/accounts/run/315815381.14527051`\", io: Kind(TimedOut) } })" location="runtime/src/snapshot_utils.rs:1587:14" version="2.1.7 (src:3940aa0d; feat:1793238286, client:JitoLabs)"

Proposed Solution
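
As a rough illustration of the check requested above, and not the validator's actual code, the sketch below hashes a downloaded archive in chunks and compares the result with an expected digest. The function names `md5_of_file` and `verify_snapshot` are hypothetical, and so is the assumption that the serving node publishes an MD5 digest alongside the snapshot; nothing in this issue confirms such a field exists. Python is used only to keep the sketch short.

```python
import hashlib
from pathlib import Path

# Read 1 MiB at a time so a multi-hundred-GB archive is never held in memory.
CHUNK_SIZE = 1024 * 1024


def md5_of_file(path: Path) -> str:
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        # iter() with a sentinel stops once read() returns empty bytes (EOF).
        for chunk in iter(lambda: f.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_snapshot(path: Path, expected_md5: str) -> bool:
    """Compare the archive's digest with the expected one.

    `expected_md5` is assumed (hypothetically) to come from the node that
    served the snapshot; whitespace and letter case are normalized first.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

A caller could run `verify_snapshot` right after the download finishes and discard the archive on a mismatch instead of starting to unpack it. MD5 would catch accidental corruption in transit, but it is not collision-resistant, so it would not protect against a deliberately tampered snapshot.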

@yourarj

yourarj commented Jan 30, 2025

@YukiCoco Can you elaborate more on this? What would the ideal implementation look like?
