RisingWave Recovery Get Stuck if a hummock file is absent #19865

Open
pjpringlenom opened this issue Dec 19, 2024 · 12 comments
Labels: type/bug (Something isn't working)

Describe the bug

The issue appeared yesterday when a barrier failed with a storage error (HTTP 404 from the object store).
The system appears to be running but is not processing any new data.
The logs are full of repeated 404 errors, and there is no apparent way to stop them.
In Grafana, the barrier completion time and the failed recovery count keep increasing.
After restarting the meta and compute nodes, the cluster stays stuck with SELECT rw_recovery_status() returning 'STARTING'.
I'm looking for a way to skip or fix this. I don't mind recreating state from the upstream topics.
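
To tell "stuck" apart from "slow", the status check can be polled with a timeout. Below is a minimal Python sketch, not an official tool: wait_until_not_starting and its fetch_status argument are hypothetical names, and fetch_status is assumed to be any callable that runs SELECT rw_recovery_status() through a Postgres-compatible client and returns the result as a string.

```python
import time

def wait_until_not_starting(fetch_status, timeout_s=60.0, interval_s=1.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it returns something other than 'STARTING'.

    fetch_status is a caller-supplied callable (hypothetical); it is assumed
    to run SELECT rw_recovery_status() and return the status as a string.
    Returns the first non-'STARTING' status, or None if the timeout expires.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status != "STARTING":
            return status
        if clock() >= deadline:
            # Still 'STARTING' after the timeout: treat recovery as stuck.
            return None
        sleep(interval_s)
```

A None result after a generous timeout corresponds to the behaviour reported here: the status never leaves 'STARTING'.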

Error message/log

From rw_event_logs

COLLECT_BARRIER_FAIL	{"collectBarrierFail": {"curEpoch": "7685046509436928", "error": "get error from control stream, in worker node 67: gRPC request to stream service failed: Internal error: failed to collect barrier: Actor 4176 exited unexpectedly: Executor error: Storage error: Hummock error: Foyer error: ObjectStore failed with IO error: NotFound (permanent) at read, context: { uri: http://minio:9000/prw/hummock_12/247/17026641.data, response: Parts { status: 404, version: HTTP/1.1, headers: {\"accept-ranges\": \"bytes\", \"content-length\": \"395\", \"content-type\": \"application/xml\", \"server\": \"MinIO Enterprise\", \"strict-transport-security\": \"max-age=31536000; includeSubDomains\", \"vary\": \"Origin\", \"vary\": \"Accept-Encoding\", \"x-amz-id-2\": \"e9bf55a4efee91119d4142302fbe76632d6cb930ee4c7ae449eb8a7ceae7fe8c\", \"x-amz-request-id\": \"18122E4A720090DF\", \"x-content-type-options\": \"nosniff\", \"x-ratelimit-limit\": \"67422\", \"x-ratelimit-remaining\": \"67419\", \"x-xss-protection\": \"1; mode=block\", \"date\": \"Wed, 18 Dec 2024 05:28:23 GMT\"} }, service: s3, path: hummock_12/247/17026641.data, range: 521721-524571 } => S3Error { code: \"NoSuchKey\", message: \"The specified key does not exist.\", resource: \"/prw/hummock_12/247/17026641.data\", request_id: \"18122E4A720090DF\" }", "prevEpoch": "7685046443900928"}}
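
The object reported as missing, and the byte range being read, can be pulled straight out of the error text above. Below is a minimal Python sketch, assuming the "path: ..., range: start-end" layout seen in this log; missing_objects is a hypothetical helper and not part of RisingWave.

```python
import re

# Matches the "path: <object>, range: <start>-<end>" fragment that appears in
# the ObjectStore NotFound errors above. The layout is taken from this log
# only; other versions may format the message differently.
NOT_FOUND_RE = re.compile(
    r"path: (?P<path>\S+?), range: (?P<start>\d+)-(?P<end>\d+)"
)

def missing_objects(text):
    """Return the sorted, de-duplicated (path, start, end) tuples in text."""
    found = set()
    for m in NOT_FOUND_RE.finditer(text):
        found.add((m.group("path"), int(m.group("start")), int(m.group("end"))))
    return sorted(found)

# Fragment copied from the event log entry above.
sample = (
    "service: s3, path: hummock_12/247/17026641.data, "
    'range: 521721-524571 } => S3Error { code: "NoSuchKey" }'
)
print(missing_objects(sample))
```

Run over the full meta log, this shows whether every 404 points at the same object (here hummock_12/247/17026641.data) or whether several files are missing.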


Meta

2024-12-19T06:31:58.539349982Z  INFO risingwave_meta::manager::sink_coordination::manager: sink manager worker start cleaning up
2024-12-19T06:31:58.539353746Z  INFO risingwave_meta::manager::sink_coordination::manager: sink manager worker finished cleaning up
2024-12-19T06:31:58.539358634Z  INFO failure_recovery{error=get error from control stream, in worker node 67: gRPC request to stream service failed: Internal error: failed to collect barrier: Actor 4176 exited unexpectedly: Executor error: Storage error: Hummock error: Foyer error: ObjectStore failed with IO error: NotFound (permanent) at read, context: { uri: http://minio:9000/prw/hummock_12/247/17026641.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"accept-ranges": "bytes", "content-length": "395", "content-type": "application/xml", "server": "MinIO Enterprise", "strict-transport-security": "max-age=31536000; includeSubDomains", "vary": "Origin", "vary": "Accept-Encoding", "x-amz-id-2": "e9bf55a4efee91119d4142302fbe76632d6cb930ee4c7ae449eb8a7ceae7fe8c", "x-amz-request-id": "18122E4A720090DF", "x-content-type-options": "nosniff", "x-ratelimit-limit": "67422", "x-ratelimit-remaining": "67419", "x-xss-protection": "1; mode=block", "date": "Wed, 18 Dec 2024 05:28:23 GMT"} }, service: s3, path: hummock_12/247/17026641.data, range: 521721-524571 } => S3Error { code: "NoSuchKey", message: "The specified key does not exist.", resource: "/prw/hummock_12/247/17026641.data", request_id: "18122E4A720090DF" } prev_epoch=7685046378430464}:recovery_attempt: risingwave_meta::manager::sink_coordination::manager: successfully stop coordinator: None
2024-12-19T06:32:02.076996491Z  WARN failure_recovery{error=get error from control stream, in worker node 67: gRPC request to stream service failed: Internal error: failed to collect barrier: Actor 4176 exited unexpectedly: Executor error: Storage error: Hummock error: Foyer error: ObjectStore failed with IO error: NotFound (permanent) at read, context: { uri: http://minio:9000/prw/hummock_12/247/17026641.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"accept-ranges": "bytes", "content-length": "395", "content-type": "application/xml", "server": "MinIO Enterprise", "strict-transport-security": "max-age=31536000; includeSubDomains", "vary": "Origin", "vary": "Accept-Encoding", "x-amz-id-2": "e9bf55a4efee91119d4142302fbe76632d6cb930ee4c7ae449eb8a7ceae7fe8c", "x-amz-request-id": "18122E4A720090DF", "x-content-type-options": "nosniff", "x-ratelimit-limit": "67422", "x-ratelimit-remaining": "67419", "x-xss-protection": "1; mode=block", "date": "Wed, 18 Dec 2024 05:28:23 GMT"} }, service: s3, path: hummock_12/247/17026641.data, range: 521721-524571 } => S3Error { code: "NoSuchKey", message: "The specified key does not exist.", resource: "/prw/hummock_12/247/17026641.data", request_id: "18122E4A720090DF" } prev_epoch=7685046378430464}:recovery_attempt: risingwave_meta::barrier::rpc: get error from response stream node=WorkerNode { id: 67, r#type: ComputeNode, host: Some(HostAddress { host: "eup024488", port: 5688 }), state: Running, property: Some(Property { is_streaming: true, is_serving: true, is_unschedulable: false, internal_rpc_host_addr: "" }), transactional_id: Some(2), resource: Some(Resource { rw_version: "2.0.1", total_memory_bytes: 33169780736, total_cpu_cores: 4 }), started_at: Some(1734164421), parallelism: 4, node_label: "" } err=gRPC request to stream service failed: Internal error: failed to collect barrier: Actor 4202 exited unexpectedly: Executor error: Storage error: Hummock error: Foyer error: ObjectStore failed with IO error: 
NotFound (permanent) at read, context: { uri: http://minio:9000/prw/hummock_12/247/17026641.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"accept-ranges": "bytes", "content-length": "395", "content-type": "application/xml", "server": "MinIO Enterprise", "strict-transport-security": "max-age=31536000; includeSubDomains", "vary": "Origin", "vary": "Accept-Encoding", "x-amz-id-2": "cb2190b708c970224f28019aa2dc6116cb65cd8a1c68ef7531f8b47d20806c3c", "x-amz-request-id": "1812805754FDE3F3", "x-content-type-options": "nosniff", "x-ratelimit-limit": "59672", "x-ratelimit-remaining": "59671", "x-xss-protection": "1; mode=block", "date": "Thu, 19 Dec 2024 06:31:59 GMT"} }, service: s3, path: hummock_12/247/17026641.data, range: 83634-83806 } => S3Error { code: "NoSuchKey", message: "The specified key does not exist.", resource: "/prw/hummock_12/247/17026641.data", request_id: "1812805754FDE3F3" }
2024-12-19T06:32:04.326782835Z  INFO failure_recovery{error=get error from control stream, in worker node 67: gRPC request to stream service failed: Internal error: failed to collect barrier: Actor 4176 exited unexpectedly: Executor error: Storage error: Hummock error: Foyer error: ObjectStore failed with IO error: NotFound (permanent) at read, context: { uri: http://minio:9000/prw/hummock_12/247/17026641.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"accept-ranges": "bytes", "content-length": "395", "content-type": "application/xml", "server": "MinIO Enterprise", "strict-transport-security": "max-age=31536000; includeSubDomains", "vary": "Origin", "vary": "Accept-Encoding", "x-amz-id-2": "e9bf55a4efee91119d4142302fbe76632d6cb930ee4c7ae449eb8a7ceae7fe8c", "x-amz-request-id": "18122E4A720090DF", "x-content-type-options": "nosniff", "x-ratelimit-limit": "67422", "x-ratelimit-remaining": "67419", "x-xss-protection": "1; mode=block", "date": "Wed, 18 Dec 2024 05:28:23 GMT"} }, service: s3, path: hummock_12/247/17026641.data, range: 521721-524571 } => S3Error { code: "NoSuchKey", message: "The specified key does not exist.", resource: "/prw/hummock_12/247/17026641.data", request_id: "18122E4A720090DF" } prev_epoch=7685046378430464}:recovery_attempt: risingwave_meta::barrier::recovery: recovering mview progress
2024-12-19T06:32:04.328330858Z  INFO failure_recovery{error=get error from control stream, in worker node 67: gRPC request to stream service failed: Internal error: failed to collect barrier: Actor 4176 exited unexpectedly: Executor error: Storage error: Hummock error: Foyer error: ObjectStore failed with IO error: NotFound (permanent) at read, context: { uri: http://minio:9000/prw/hummock_12/247/17026641.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"accept-ranges": "bytes", "content-length": "395", "content-type": "application/xml", "server": "MinIO Enterprise", "strict-transport-security": "max-age=31536000; includeSubDomains", "vary": "Origin", "vary": "Accept-Encoding", "x-amz-id-2": "e9bf55a4efee91119d4142302fbe76632d6cb930ee4c7ae449eb8a7ceae7fe8c", "x-amz-request-id": "18122E4A720090DF", "x-content-type-options": "nosniff", "x-ratelimit-limit": "67422", "x-ratelimit-remaining": "67419", "x-xss-protection": "1; mode=block", "date": "Wed, 18 Dec 2024 05:28:23 GMT"} }, service: s3, path: hummock_12/247/17026641.data, range: 521721-524571 } => S3Error { code: "NoSuchKey", message: "The specified key does not exist.", resource: "/prw/hummock_12/247/17026641.data", request_id: "18122E4A720090DF" } prev_epoch=7685046378430464}:recovery_attempt: risingwave_meta::barrier::recovery: recovered mview progress
2024-12-19T06:32:04.390728625Z  WARN failure_recovery{error=get error from control stream, in worker node 67: gRPC request to stream service failed: Internal error: failed to collect barrier: Actor 4176 exited unexpectedly: Executor error: Storage error: Hummock error: Foyer error: ObjectStore failed with IO error: NotFound (permanent) at read, context: { uri: http://minio:9000/prw/hummock_12/247/17026641.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"accept-ranges": "bytes", "content-length": "395", "content-type": "application/xml", "server": "MinIO Enterprise", "strict-transport-security": "max-age=31536000; includeSubDomains", "vary": "Origin", "vary": "Accept-Encoding", "x-amz-id-2": "e9bf55a4efee91119d4142302fbe76632d6cb930ee4c7ae449eb8a7ceae7fe8c", "x-amz-request-id": "18122E4A720090DF", "x-content-type-options": "nosniff", "x-ratelimit-limit": "67422", "x-ratelimit-remaining": "67419", "x-xss-protection": "1; mode=block", "date": "Wed, 18 Dec 2024 05:28:23 GMT"} }, service: s3, path: hummock_12/247/17026641.data, range: 521721-524571 } => S3Error { code: "NoSuchKey", message: "The specified key does not exist.", resource: "/prw/hummock_12/247/17026641.data", request_id: "18122E4A720090DF" } prev_epoch=7685046378430464}:recovery_attempt: risingwave_meta::manager::notification: Failed to notify local subscriber error=channel closed
2024-12-19T06:32:04.408186758Z  INFO failure_recovery{error=get error from control stream, in worker node 67: gRPC request to stream service failed: Internal error: failed to collect barrier: Actor 4176 exited unexpectedly: Executor error: Storage error: Hummock error: Foyer error: ObjectStore failed with IO error: NotFound (permanent) at read, context: { uri: http://minio:9000/prw/hummock_12/247/17026641.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"accept-ranges": "bytes", "content-length": "395", "content-type": "application/xml", "server": "MinIO Enterprise", "strict-transport-security": "max-age=31536000; includeSubDomains", "vary": "Origin", "vary": "Accept-Encoding", "x-amz-id-2": "e9bf55a4efee91119d4142302fbe76632d6cb930ee4c7ae449eb8a7ceae7fe8c", "x-amz-request-id": "18122E4A720090DF", "x-content-type-options": "nosniff", "x-ratelimit-limit": "67422", "x-ratelimit-remaining": "67419", "x-xss-protection": "1; mode=block", "date": "Wed, 18 Dec 2024 05:28:23 GMT"} }, service: s3, path: hummock_12/247/17026641.data, range: 521721-524571 } => S3Error { code: "NoSuchKey", message: "The specified key does not exist.", resource: "/prw/hummock_12/247/17026641.data", request_id: "18122E4A720090DF" } prev_epoch=7685046378430464}:recovery_attempt: risingwave_meta::barrier::recovery: control stream reset elapsed=15.255692ms
2024-12-19T06:32:04.408224099Z  INFO risingwave_meta::manager::sink_coordination::manager: sink manager worker start cleaning up
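
Most of each meta line above is the failure_recovery{...} span, which repeats the same error; the actual message only starts after ":recovery_attempt: ". Below is a minimal Python sketch for reading such logs, assuming every line follows the "timestamp LEVEL [span]target: message" layout shown here; condense is a hypothetical helper.

```python
import re

# Text that closes the failure_recovery{...} span in the lines above.
MARKER = "}:recovery_attempt: "
# Timestamp, log level, then the rest of the line.
HEAD_RE = re.compile(r"^(\S+)\s+(INFO|WARN|ERROR|DEBUG|TRACE)\s+(.*)$", re.S)

def condense(line):
    """Drop the repeated failure_recovery{...} span, keep target and message."""
    m = HEAD_RE.match(line)
    if m is None:
        return line  # not a recognised log line; leave it untouched
    ts, level, rest = m.groups()
    _, sep, tail = rest.partition(MARKER)
    if sep:
        rest = tail  # keep only what follows the span
    return f"{ts} {level} {rest}"

# Shortened stand-in for one of the lines above (span body abbreviated).
example = (
    "2024-12-19T06:32:04.326782835Z  INFO failure_recovery{error=get error "
    "from control stream prev_epoch=7685046378430464}:recovery_attempt: "
    "risingwave_meta::barrier::recovery: recovering mview progress"
)
print(condense(example))
```

With the span removed, the recovery loop is easier to follow: sink manager cleanup, mview progress recovery, control stream reset, then the same failure again.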


Compute

2024-12-19T08:25:46.120957957Z ERROR actor{otel.name="Actor 5665" actor_id=5665}:executor{otel.name="StreamScan 16210000276D"}: risingwave_storage::monitor::monitored_store: Failed in get error=Hummock error: Foyer error: channel closed
2024-12-19T08:25:46.121025608Z ERROR actor{otel.name="Actor 5665" actor_id=5665}:executor{otel.name="StreamScan 16210000276D"}: risingwave_storage::monitor::monitored_store: Failed in get error=Hummock error: Foyer error: channel closed
2024-12-19T08:25:46.12109702Z ERROR actor{otel.name="Actor 5665" actor_id=5665}:executor{otel.name="StreamScan 16210000276D"}: risingwave_storage::monitor::monitored_store: Failed in get error=Hummock error: Foyer error: channel closed
2024-12-19T08:25:46.12116477Z ERROR actor{otel.name="Actor 5665" actor_id=5665}:executor{otel.name="StreamScan 16210000276D"}: risingwave_storage::monitor::monitored_store: Failed in get error=Hummock error: Foyer error: channel closed
2024-12-19T08:25:46.121178359Z ERROR risingwave_stream::task::stream_manager: actor exit with error actor_id=5662 error=Executor error: exchange channel from local upstream actor 5498 closed unexpectedly
2024-12-19T08:25:46.121195668Z ERROR risingwave_stream::task::stream_manager: actor exit with error actor_id=5782 error=Executor error: exchange channel from local upstream actor 5472 closed unexpectedly
2024-12-19T08:25:46.121231812Z ERROR actor{otel.name="Actor 5665" actor_id=5665}:executor{otel.name="StreamScan 16210000276D"}: risingwave_storage::monitor::monitored_store: Failed in get error=Hummock error: Foyer error: channel closed
2024-12-19T08:25:46.1212982Z ERROR actor{otel.name="Actor 5665" actor_id=5665}:executor{otel.name="StreamScan 16210000276D"}: risingwave_storage::monitor::monitored_store: Failed in get error=Hummock error: Foyer error: channel closed
2024-12-19T08:25:46.118280427Z ERROR risingwave_stream::task::stream_manager: actor exit with error actor_id=5471 error=Executor error: Storage error: Hummock error: Foyer error: channel closed

Backtrace:
   0: <thiserror_ext::backtrace::MaybeBacktrace as thiserror_ext::backtrace::WithBacktrace>::capture
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/thiserror-ext-0.1.2/src/backtrace.rs:30:18
   1: thiserror_ext::ptr::ErrorBox<T,B>::new
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/thiserror-ext-0.1.2/src/ptr.rs:40:33
   2: <risingwave_storage::hummock::error::HummockError as core::convert::From<E>>::from
             at ./risingwave/src/storage/src/hummock/error.rs:22:45
   3: <T as core::convert::Into<U>>::into
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/core/src/convert/mod.rs:759:9
   4: risingwave_storage::hummock::error::HummockError::foyer_error
             at ./risingwave/src/storage/src/hummock/error.rs:162:46
   5: core::ops::function::FnOnce::call_once
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/core/src/ops/function.rs:250:5
   6: core::result::Result<T,E>::map_err
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/core/src/result.rs:854:27
   7: risingwave_storage::hummock::sstable_store::SstableStore::sstable::{{closure}}
             at ./risingwave/src/storage/src/hummock/sstable_store.rs:600:22
   8: risingwave_storage::hummock::get_from_sstable_info::{{closure}}
             at ./risingwave/src/storage/src/hummock/mod.rs:78:72
   9: risingwave_storage::hummock::store::version::HummockVersionReader::get::{{closure}}
             at ./risingwave/src/storage/src/hummock/store/version.rs:671:22
  10: risingwave_storage::hummock::store::local_hummock_storage::LocalHummockStorage::get_inner::{{closure}}
             at ./risingwave/src/storage/src/hummock/store/local_hummock_storage.rs:134:14
  11: <risingwave_storage::hummock::store::local_hummock_storage::LocalHummockStorage as risingwave_storage::store::LocalStateStore>::get::{{closure}}
             at ./risingwave/src/storage/src/hummock/store/local_hummock_storage.rs:308:69
  12: <await_tree::future::Instrumented<F,_> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/await-tree-0.2.1/src/future.rs:119:15
  13: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9
  14: risingwave_storage::monitor::monitored_store::MonitoredStateStore<S>::monitored_get::{{closure}}
             at ./risingwave/src/storage/src/monitor/monitored_store.rs:139:14
  15: risingwave_stream::common::table::state_table::StateTableInner<S,SD,_,_>::get_encoded_row::{{closure}}
             at ./risingwave/src/stream/src/common/table/state_table.rs:859:14
  16: risingwave_stream::common::table::state_table::StateTableInner<S,SD,_,_>::get_row::{{closure}}
             at ./risingwave/src/stream/src/common/table/state_table.rs:815:67
  17: <F as futures_core::future::TryFuture>::try_poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/future.rs:82:9
  18: <futures_util::future::try_future::into_future::IntoFuture<Fut> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/try_future/into_future.rs:34:9
  19: <futures_util::stream::futures_ordered::OrderWrapper<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/futures_ordered.rs:56:9
  20: <futures_util::stream::futures_unordered::FuturesUnordered<Fut> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/futures_unordered/mod.rs:518:17
  21: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
  22: <futures_util::stream::futures_ordered::FuturesOrdered<Fut> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/futures_ordered.rs:195:49
  23: <S as futures_core::stream::TryStream>::try_poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:196:9
  24: <futures_util::stream::try_stream::try_collect::TryCollect<St,C> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/try_stream/try_collect.rs:46:26
  25: <futures_util::future::try_join_all::TryJoinAll<F> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/try_join_all.rs:189:44
  26: risingwave_stream::executor::backfill::utils::get_progress_per_vnode::{{closure}}
             at ./risingwave/src/stream/src/executor/backfill/utils.rs:491:48
  27: risingwave_stream::executor::backfill::arrangement_backfill::ArrangementBackfillExecutor<S,SD>::execute_inner::{{closure}}
             at ./risingwave/src/stream/src/executor/backfill/arrangement_backfill.rs:142:76
  28: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  29: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  30: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  31: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
  32: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9
  33: <await_tree::future::Instrumented<F,_> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/await-tree-0.2.1/src/future.rs:119:15
  34: risingwave_stream::executor::wrapper::trace::instrument_await_tree::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/trace.rs:116:10
  35: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  36: risingwave_stream::executor::wrapper::schema_check::schema_check::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/schema_check.rs:24:1
  37: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  38: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  39: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
  40: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9
  41: risingwave_stream::executor::wrapper::epoch_check::epoch_check::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/epoch_check.rs:31:44
  42: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  43: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  44: <S as futures_core::stream::TryStream>::try_poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:196:9
  45: futures_util::stream::try_stream::TryStreamExt::try_poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/try_stream/mod.rs:1131:9
  46: <futures_util::stream::try_stream::try_next::TryNext<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/try_stream/try_next.rs:32:9
  47: risingwave_stream::executor::wrapper::epoch_provide::epoch_provide::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/epoch_provide.rs:33:26
  48: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  49: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  50: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
  51: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9
  52: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9
  53: risingwave_stream::executor::wrapper::trace::trace::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/trace.rs:53:69
  54: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  55: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  56: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  57: risingwave_stream::executor::filter::FilterExecutor::execute_inner::{{closure}}
             at ./risingwave/src/stream/src/executor/filter.rs:134:5
  58: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  59: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  60: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  61: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
  62: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9
  63: <await_tree::future::Instrumented<F,_> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/await-tree-0.2.1/src/future.rs:119:15
  64: risingwave_stream::executor::wrapper::trace::instrument_await_tree::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/trace.rs:116:10
  65: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  66: risingwave_stream::executor::wrapper::schema_check::schema_check::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/schema_check.rs:24:1
  67: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  68: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  69: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
  70: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9
  71: risingwave_stream::executor::wrapper::epoch_check::epoch_check::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/epoch_check.rs:31:44
  72: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  73: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  74: <S as futures_core::stream::TryStream>::try_poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:196:9
  75: futures_util::stream::try_stream::TryStreamExt::try_poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/try_stream/mod.rs:1131:9
  76: <futures_util::stream::try_stream::try_next::TryNext<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/try_stream/try_next.rs:32:9
  77: risingwave_stream::executor::wrapper::epoch_provide::epoch_provide::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/epoch_provide.rs:33:26
  78: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  79: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  80: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
  81: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9
  82: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9
  83: risingwave_stream::executor::wrapper::trace::trace::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/trace.rs:53:69
  84: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  85: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  86: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  87: risingwave_stream::executor::project::Inner::execute::{{closure}}
             at ./risingwave/src/stream/src/executor/project.rs:144:5
  88: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  89: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  90: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  91: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
  92: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9
  93: <await_tree::future::Instrumented<F,_> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/await-tree-0.2.1/src/future.rs:119:15
  94: risingwave_stream::executor::wrapper::trace::instrument_await_tree::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/trace.rs:116:10
  95: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  96: risingwave_stream::executor::wrapper::schema_check::schema_check::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/schema_check.rs:24:1
  97: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
  98: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
  99: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
 100: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9
 101: risingwave_stream::executor::wrapper::epoch_check::epoch_check::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/epoch_check.rs:31:44
 102: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
 103: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
 104: <S as futures_core::stream::TryStream>::try_poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:196:9
 105: futures_util::stream::try_stream::TryStreamExt::try_poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/try_stream/mod.rs:1131:9
 106: <futures_util::stream::try_stream::try_next::TryNext<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/try_stream/try_next.rs:32:9
 107: risingwave_stream::executor::wrapper::epoch_provide::epoch_provide::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/epoch_provide.rs:33:26
 108: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
 109: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
 110: futures_util::stream::stream::StreamExt::poll_next_unpin
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/mod.rs:1638:9
 111: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/stream/stream/next.rs:32:9
 112: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9
 113: risingwave_stream::executor::wrapper::trace::trace::{{closure}}
             at ./risingwave/src/stream/src/executor/wrapper/trace.rs:53:69
 114: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
 115: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
 116: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
 117: <risingwave_stream::executor::dispatch::DispatchExecutor as risingwave_stream::executor::StreamConsumer>::execute::{{closure}}
             at ./risingwave/src/stream/src/executor/dispatch.rs:392:9
 118: <futures_async_stream::try_stream::GenTryStream<G> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-async-stream-0.2.11/src/lib.rs:492:33
 119: <core::pin::Pin<P> as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:120:9
 120: <&mut S as futures_core::stream::Stream>::poll_next
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-core-0.3.30/src/stream.rs:104:9
 121: <tokio_stream::stream_ext::next::Next<St> as core::future::future::Future>::poll
             at ./root/.cargo/git/checkouts/tokio-968c02b7a1a41bea-shallow/0dd1055/tokio-stream/src/stream_ext/next.rs:42:29
 122: <tokio_stream::stream_ext::try_next::TryNext<St> as core::future::future::Future>::poll
             at ./root/.cargo/git/checkouts/tokio-968c02b7a1a41bea-shallow/0dd1055/tokio-stream/src/stream_ext/try_next.rs:43:9
 123: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9
 124: <await_tree::future::Instrumented<F,_> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/await-tree-0.2.1/src/future.rs:119:15
 125: risingwave_stream::executor::actor::Actor<C>::run_consumer::{{closure}}
             at ./risingwave/src/stream/src/executor/actor.rs:233:18
 126: <tokio::future::maybe_done::MaybeDone<Fut> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/future/maybe_done.rs:62:56
 127: risingwave_stream::executor::actor::Actor<C>::run::{{closure}}::{{closure}}::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/macros/join.rs:126:24
 128: <tokio::future::poll_fn::PollFn<F> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/future/poll_fn.rs:58:9
 129: risingwave_stream::executor::actor::Actor<C>::run::{{closure}}::{{closure}}
             at ./risingwave/src/stream/src/executor/actor.rs:183:17
 130: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:391:31
 131: tokio::task::task_local::LocalKey<T>::scope_inner
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:217:19
 132: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:387:19
 133: risingwave_expr::expr_context::expr_context_scope::{{closure}}
             at ./risingwave/src/expr/core/src/expr_context.rs:35:65
 134: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:391:31
 135: tokio::task::task_local::LocalKey<T>::scope_inner
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:217:19
 136: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:387:19
 137: risingwave_stream::executor::actor::Actor<C>::run::{{closure}}
             at ./risingwave/src/stream/src/executor/actor.rs:191:10
 138: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/future/map.rs:55:37
 139: <futures_util::future::future::Map<Fut,F> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/lib.rs:91:13
 140: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:391:31
 141: tokio::task::task_local::LocalKey<T>::scope_inner
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:217:19
 142: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:387:19
 143: await_tree::root::TreeRoot::instrument::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/await-tree-0.2.1/src/root.rs:43:34
 144: <futures_util::future::either::Either<A,B> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/futures-util-0.3.30/src/future/either.rs:109:32
 145: core::ops::function::FnOnce::call_once
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/core/src/ops/function.rs:250:5
 146: tokio_metrics::task::instrument_poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-metrics-0.3.1/src/task.rs:2530:15
 147: <tokio_metrics::task::Instrumented<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-metrics-0.3.1/src/task.rs:2430:9
 148: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:391:31
 149: tokio::task::task_local::LocalKey<T>::scope_inner
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:217:19
 150: <tokio::task::task_local::TaskLocalFuture<T,F> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/task/task_local.rs:387:19
 151: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9
 152: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/core.rs:328:17
 153: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/loom/std/unsafe_cell.rs:16:9
 154: tokio::runtime::task::core::Core<T,S>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/core.rs:317:30
 155: tokio::runtime::task::harness::poll_future::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/harness.rs:485:19
 156: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/core/src/panic/unwind_safe.rs:272:9
 157: std::panicking::try::do_call
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/panicking.rs:559:40
 158: std::panicking::try
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/panicking.rs:523:19
 159: std::panic::catch_unwind
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/panic.rs:149:14
 160: tokio::runtime::task::harness::poll_future
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/harness.rs:473:18
 161: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/harness.rs:208:27
 162: tokio::runtime::task::harness::Harness<T,S>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/harness.rs:153:15
 163: tokio::runtime::task::raw::RawTask::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/raw.rs:201:18
 164: tokio::runtime::task::LocalNotified<S>::run
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/mod.rs:427:9
 165: tokio::runtime::scheduler::multi_thread::worker::Context::run_task::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/multi_thread/worker.rs:585:18
 166: tokio::runtime::coop::with_budget
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/coop.rs:107:5
 167: tokio::runtime::coop::budget
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/coop.rs:73:5
 168: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/multi_thread/worker.rs:584:9
 169: tokio::runtime::scheduler::multi_thread::worker::Context::run
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/multi_thread/worker.rs:535:24
 170: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/multi_thread/worker.rs:500:21
 171: tokio::runtime::context::scoped::Scoped<T>::set
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context/scoped.rs:40:9
 172: tokio::runtime::context::set_scheduler::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context.rs:180:26
 173: std::thread::local::LocalKey<T>::try_with
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/thread/local.rs:283:12
 174: std::thread::local::LocalKey<T>::with
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/thread/local.rs:260:9
 175: tokio::runtime::context::set_scheduler
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context.rs:180:17
 176: tokio::runtime::scheduler::multi_thread::worker::run::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/multi_thread/worker.rs:495:9
 177: tokio::runtime::context::runtime::enter_runtime
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/context/runtime.rs:65:16
 178: tokio::runtime::scheduler::multi_thread::worker::run
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/multi_thread/worker.rs:487:5
 179: tokio::runtime::scheduler::multi_thread::worker::Launch::launch::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/scheduler/multi_thread/worker.rs:455:45
 180: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/blocking/task.rs:42:21
 181: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tracing-0.1.40/src/instrument.rs:321:9
 182: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/core.rs:328:17
 183: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/loom/std/unsafe_cell.rs:16:9
 184: tokio::runtime::task::core::Core<T,S>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/core.rs:317:30
 185: tokio::runtime::task::harness::poll_future::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/harness.rs:485:19
 186: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/core/src/panic/unwind_safe.rs:272:9
 187: std::panicking::try::do_call
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/panicking.rs:559:40
 188: std::panicking::try
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/panicking.rs:523:19
 189: std::panic::catch_unwind
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/panic.rs:149:14
 190: tokio::runtime::task::harness::poll_future
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/harness.rs:473:18
 191: tokio::runtime::task::harness::Harness<T,S>::poll_inner
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/harness.rs:208:27
 192: tokio::runtime::task::harness::Harness<T,S>::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/harness.rs:153:15
 193: tokio::runtime::task::raw::RawTask::poll
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/raw.rs:201:18
 194: tokio::runtime::task::UnownedTask<S>::run
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/task/mod.rs:464:9
 195: tokio::runtime::blocking::pool::Task::run
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/blocking/pool.rs:159:9
 196: tokio::runtime::blocking::pool::Inner::run
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/blocking/pool.rs:513:17
 197: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at ./root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/tokio-1.38.0/src/runtime/blocking/pool.rs:471:13
 198: std::sys_common::backtrace::__rust_begin_short_backtrace
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/sys_common/backtrace.rs:155:18
 199: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}}
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/thread/mod.rs:542:17
 200: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/core/src/panic/unwind_safe.rs:272:9
 201: std::panicking::try::do_call
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/panicking.rs:559:40
 202: std::panicking::try
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/panicking.rs:523:19
 203: std::panic::catch_unwind
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/panic.rs:149:14
 204: std::thread::Builder::spawn_unchecked_::{{closure}}
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/thread/mod.rs:541:30
 205: core::ops::function::FnOnce::call_once{{vtable.shim}}
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/core/src/ops/function.rs:250:5
 206: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/alloc/src/boxed.rs:2063:9
 207: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce<Args>>::call_once
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/alloc/src/boxed.rs:2063:9
 208: std::sys::pal::unix::thread::Thread::new::thread_start
             at ./rustc/72fdf913c53dd0e75313ba83e4aa80df3f6e2871/library/std/src/sys/pal/unix/thread.rs:108:17
 209: start_thread
             at ./nptl/pthread_create.c:447:8
 210: __GI___clone
             at ./misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:100

To Reproduce

Looks to be a transient storage issue that is non-recoverable.

Expected behavior

Expected the system to error out or give up trying to recover, rather than being stuck in a loop for a day.

How did you deploy RisingWave?

Docker compose

The version of RisingWave

PostgreSQL 13.14.0-RisingWave-2.0.1 (0d15632)

Additional context

No response

@pjpringlenom pjpringlenom added the type/bug Something isn't working label Dec 19, 2024
@github-actions github-actions bot added this to the release-2.2 milestone Dec 19, 2024
@marceloneppel
Contributor

marceloneppel commented Dec 26, 2024

It also happened to me today with RisingWave 2.1.0 deployed through the Kubernetes operator (0.8.3). I'm currently using Hetzner Object Storage.

@zwang28
Contributor

zwang28 commented Dec 30, 2024

It also happened to me today with RisingWave 2.1.0 deployed through the Kubernetes operator (0.8.3). I'm currently using Hetzner Object Storage.

Hi @marceloneppel,

..., Executor error: Storage error: Hummock error: Foyer error: ObjectStore failed with IO error: NotFound (permanent) at read, ..., service: s3, path: hummock_12/247/17026641.data, range: 521721-524571 } => S3Error ..
Based on your own log, which should look similar to the example above, could you please provide the following information for troubleshooting?

  1. The response to this query, where [OBJECT_ID] is the object ID found in the log (17026641 in the example above): select sstable_id,object_id,compaction_group_id,level_id,sub_level_id,level_type,right_exclusive,file_size,meta_offset,stale_key_count,total_key_count,min_epoch,max_epoch,uncompressed_file_size,range_tombstone_count,bloom_filter_kind,table_ids from rw_hummock_sstables where object_id=[OBJECT_ID];
  2. The range shown in the log, such as range: 521721-524571 in the example above.
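
For anyone collecting the same details, both values can be pulled out of the error line with a small script. This is only a sketch: `extract_object_and_range` is a made-up helper name, and it assumes the log keeps the `path: <prefix>/<object_id>.data, range: <start>-<end>` shape seen in the example above.

```python
import re

# Matches the "path: .../<object_id>.data, range: <start>-<end>" fragment of the
# object store error. The layout is assumed from the log line quoted above.
PATTERN = re.compile(r"path: \S*?(\d+)\.data, range: (\d+)-(\d+)")


def extract_object_and_range(log_line):
    """Return (object_id, range_start, range_end), or None if nothing matches."""
    match = PATTERN.search(log_line)
    if match is None:
        return None
    object_id, start, end = (int(group) for group in match.groups())
    return object_id, start, end


line = (
    "service: s3, path: hummock_12/247/17026641.data, "
    "range: 521721-524571 } => S3Error"
)
print(extract_object_and_range(line))  # (17026641, 521721, 524571)
```

The first number is what goes into `object_id=[OBJECT_ID]` in the query; log lines without a `range:` field (for example a failed `stat`) return `None`.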

@marceloneppel
Contributor

  1. select sstable_id,object_id,compaction_group_id,level_id,sub_level_id,level_type,right_exclusive,file_size,meta_offset,stale_key_count,total_key_count,min_epoch,max_epoch,uncompressed_file_size,range_tombstone_count,bloom_filter_kind,table_ids from rw_hummock_sstables where object_id=[OBJECT_ID];

Hi, @zwang28! Thanks for the support. I have attached the requested information.

Log:

2024-12-30T10:54:53.767016944Z ERROR risingwave_object_store::object: read failed error=NotFound (permanent) at read, context: { uri: https://neppel-prod-database.fsn1.your-objectstorage.com/risingwave/119/177975.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"content-length": "273", "x-amz-request-id": "tx0000075af18d11e51e05e-0067727bfd-8b2eee4-fsn1-prod1-ceph3", "accept-ranges": "bytes", "content-type": "application/xml", "date": "Mon, 30 Dec 2024 10:54:53 GMT", "x-debug-backend": "fsn1-prod1-ceph3", "strict-transport-security": "max-age=63072000", "x-debug-bucket": "neppel-prod-database"} }, service: s3, path: risingwave/119/177975.data, range: 33262700-33328157 } => S3Error { code: "NoSuchKey", message: "", resource: "", request_id: "tx0000075af18d11e51e05e-0067727bfd-8b2eee4-fsn1-prod1-ceph3" }

Response from the query:

prod=> select sstable_id,object_id,compaction_group_id,level_id,sub_level_id,level_type,right_exclusive,file_size,meta_offset,stale_key_count,total_key_count,min_epoch,max_epoch,uncompressed_file_size,range_tombstone_count,bloom_filter_kind,table_ids from rw_hummock_sstables where object_id=177975;

sstable_id | object_id | compaction_group_id | level_id | sub_level_id | level_type | right_exclusive | file_size | meta_offset | stale_key_count | total_key_count |    min_epoch     |    max_epoch     | uncompressed_file_size | range_tombstone_count | bloom_filter_kind |              table_ids              
------------+-----------+---------------------+----------+--------------+------------+-----------------+-----------+-------------+-----------------+-----------------+------------------+------------------+------------------------+-----------------------+-------------------+-------------------------------------
     178023 |    177975 |                   2 |        0 |        18158 |          2 | f               |  44595725 |    44340751 |               0 |          247603 | 7733539191324672 | 7733539191324672 |               44589578 |                     0 |                 1 | [104, 105, 106, 107, 108, 109, 110]
(1 row)

Range:

33262700-33328157
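
As a quick sanity check on the numbers above, the requested range can be compared with the `file_size` and `meta_offset` returned by the query. The sketch below assumes the range bounds are inclusive byte offsets and that `meta_offset` marks where the SSTable metadata begins, with data blocks stored before it. Both are assumptions for illustration, and `classify_range` is a hypothetical helper, not a RisingWave API.

```python
def classify_range(start, end, file_size, meta_offset):
    """Describe where an inclusive byte range falls within an SSTable object."""
    if start > end or end >= file_size:
        return "out of bounds"
    if end < meta_offset:
        return "data blocks"
    if start >= meta_offset:
        return "meta"
    return "spans data and meta"


# Values copied from the log line and the rw_hummock_sstables row above.
result = classify_range(33262700, 33328157, file_size=44595725, meta_offset=44340751)
print(result)  # data blocks
```

Under these assumptions the read targets a region inside the recorded file size, which is consistent with the object itself being missing rather than the metadata pointing past the end of the file.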

@hzxa21
Collaborator

hzxa21 commented Dec 30, 2024

It also happened to me today with RisingWave 2.1.0 deployed through the Kubernetes operator (0.8.3). I'm currently using Hetzner Object Storage.

Can you check whether the object expiration lifecycle policy is set on the bucket used by RisingWave?
https://docs.hetzner.com/storage/object-storage/faq/buckets-objects#what-are-lifecycle-policies-and-how-do-i-use-them

@hzxa21
Collaborator

hzxa21 commented Dec 30, 2024

If you have the logs for meta and compactor, or if the issue is reproduced, please also search the meta and compactor logs for the affected object ID (177975 in this example) and share the relevant log lines.

@marceloneppel
Contributor

It also happened to me today with RisingWave 2.1.0 deployed through the Kubernetes operator (0.8.3). I'm currently using Hetzner Object Storage.

Can you check whether the object expiration lifecycle policy is set on the bucket used by RisingWave? https://docs.hetzner.com/storage/object-storage/faq/buckets-objects#what-are-lifecycle-policies-and-how-do-i-use-them

Sure. There is no lifecycle policy set in the bucket:

~ aws s3api get-bucket-lifecycle-configuration --endpoint-url=https://fsn1.your-objectstorage.com --bucket neppel-prod-database --debug
...
2024-12-30 14:02:16,544 - MainThread - botocore.parsers - DEBUG - Response body:
b'<?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchLifecycleConfiguration</Code><Message></Message><BucketName>neppel-prod-database</BucketName><RequestId>tx0000031b25906a11cc5a6-00677299d8-93eda0e-fsn1-prod1-ceph3</RequestId><HostId>93eda0e-fsn1-prod1-ceph3-fsn1-prod1</HostId></Error>'
...
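
The debug output above can also be checked mechanically: an `Error` body whose `Code` is `NoSuchLifecycleConfiguration` is what the service returns when the bucket has no lifecycle rules. A minimal sketch using only the Python standard library; `is_missing_lifecycle_error` is a hypothetical helper, and the sample body is abbreviated from the response shown above.

```python
import xml.etree.ElementTree as ET


def is_missing_lifecycle_error(response_body):
    """Return True only for a NoSuchLifecycleConfiguration error body."""
    root = ET.fromstring(response_body)
    return (
        root.tag == "Error"
        and root.findtext("Code") == "NoSuchLifecycleConfiguration"
    )


# Abbreviated copy of the response body from the debug output above.
body = (
    b'<?xml version="1.0" encoding="UTF-8"?><Error>'
    b"<Code>NoSuchLifecycleConfiguration</Code><Message></Message>"
    b"<BucketName>neppel-prod-database</BucketName></Error>"
)
print(is_missing_lifecycle_error(body))  # True
```

Any other body, such as a different error code or an actual lifecycle configuration, returns `False` and should be inspected by hand.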

@marceloneppel
Contributor

  1. select sstable_id,object_id,compaction_group_id,level_id,sub_level_id,level_type,right_exclusive,file_size,meta_offset,stale_key_count,total_key_count,min_epoch,max_epoch,uncompressed_file_size,range_tombstone_count,bloom_filter_kind,table_ids from rw_hummock_sstables where object_id=[OBJECT_ID];

Hi, @zwang28! Thanks for the support. I have attached the requested information.
Log:

2024-12-30T10:54:53.767016944Z ERROR risingwave_object_store::object: read failed error=NotFound (permanent) at read, context: { uri: https://neppel-prod-database.fsn1.your-objectstorage.com/risingwave/119/177975.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"content-length": "273", "x-amz-request-id": "tx0000075af18d11e51e05e-0067727bfd-8b2eee4-fsn1-prod1-ceph3", "accept-ranges": "bytes", "content-type": "application/xml", "date": "Mon, 30 Dec 2024 10:54:53 GMT", "x-debug-backend": "fsn1-prod1-ceph3", "strict-transport-security": "max-age=63072000", "x-debug-bucket": "neppel-prod-database"} }, service: s3, path: risingwave/119/177975.data, range: 33262700-33328157 } => S3Error { code: "NoSuchKey", message: "", resource: "", request_id: "tx0000075af18d11e51e05e-0067727bfd-8b2eee4-fsn1-prod1-ceph3" }

Response from the query:

prod=> select sstable_id,object_id,compaction_group_id,level_id,sub_level_id,level_type,right_exclusive,file_size,meta_offset,stale_key_count,total_key_count,min_epoch,max_epoch,uncompressed_file_size,range_tombstone_count,bloom_filter_kind,table_ids from rw_hummock_sstables where object_id=177975;

sstable_id | object_id | compaction_group_id | level_id | sub_level_id | level_type | right_exclusive | file_size | meta_offset | stale_key_count | total_key_count |    min_epoch     |    max_epoch     | uncompressed_file_size | range_tombstone_count | bloom_filter_kind |              table_ids              
------------+-----------+---------------------+----------+--------------+------------+-----------------+-----------+-------------+-----------------+-----------------+------------------+------------------+------------------------+-----------------------+-------------------+-------------------------------------
     178023 |    177975 |                   2 |        0 |        18158 |          2 | f               |  44595725 |    44340751 |               0 |          247603 | 7733539191324672 | 7733539191324672 |               44589578 |                     0 |                 1 | [104, 105, 106, 107, 108, 109, 110]
(1 row)

Range:

33262700-33328157
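As a sanity check on the numbers above, the requested range can be compared with the `file_size` and `meta_offset` columns from the query result. This is only a sketch: the `SstRow` type and function names are hypothetical, and it assumes that the range is given in byte offsets and that `meta_offset` marks where the SST metadata section begins.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SstRow:
    """The subset of rw_hummock_sstables columns needed for the check."""
    object_id: int
    file_size: int
    meta_offset: int


def parse_range(text: str) -> Tuple[int, int]:
    """Parse a 'start-end' range string as it appears in the error log."""
    start, end = (int(part) for part in text.strip().split("-"))
    return start, end


def classify_range(row: SstRow, start: int, end: int) -> str:
    """Describe where a read range falls relative to the object layout."""
    if not (0 <= start <= end):
        return "invalid"
    if end >= row.file_size:
        return "beyond-object"
    if end < row.meta_offset:
        # Assumption: everything before meta_offset is block data.
        return "data-blocks"
    return "overlaps-meta"


if __name__ == "__main__":
    row = SstRow(object_id=177975, file_size=44595725, meta_offset=44340751)
    start, end = parse_range("33262700-33328157")
    print(classify_range(row, start, end))
```

Under those assumptions the range lies inside the recorded object, so the 404 points to the object itself being absent rather than to an out-of-bounds read.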

If you have the logs for meta and compactor, or if the issue is reproduced, please also search the meta and compactor logs for the affected object id (in this example it is 177975) and share the relevant log lines.

Only the compactor has logs for that object id (no meta logs mentioned it):

kubectl logs -n prod risingwave-compactor-9468d469-6sfnn --tail=100 | grep 177975

2024-12-31T02:05:00.537287941Z  WARN opendal::services: service=s3 name=neppel-prod-database path=risingwave/119/177975.data: stat failed NotFound (permanent) at stat, context: { uri: https://neppel-prod-database.fsn1.your-objectstorage.com/risingwave/119/177975.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"content-length": "273", "x-amz-request-id": "tx000000554b65963cc87d7-006773514c-9266d8c-fsn1-prod1-ceph3", "accept-ranges": "bytes", "content-type": "application/xml", "date": "Tue, 31 Dec 2024 02:05:00 GMT", "x-debug-backend": "fsn1-prod1-ceph3", "strict-transport-security": "max-age=63072000", "x-debug-bucket": "neppel-prod-database"} }, service: s3, path: risingwave/119/177975.data }    
2024-12-31T02:05:00.537480289Z ERROR risingwave_object_store::object: read failed error=NotFound (permanent) at stat, context: { uri: https://neppel-prod-database.fsn1.your-objectstorage.com/risingwave/119/177975.data, response: Parts { status: 404, version: HTTP/1.1, headers: {"content-length": "273", "x-amz-request-id": "tx000000554b65963cc87d7-006773514c-9266d8c-fsn1-prod1-ceph3", "accept-ranges": "bytes", "content-type": "application/xml", "date": "Tue, 31 Dec 2024 02:05:00 GMT", "x-debug-backend": "fsn1-prod1-ceph3", "strict-transport-security": "max-age=63072000", "x-debug-bucket": "neppel-prod-database"} }, service: s3, path: risingwave/119/177975.data }
Level 0 ["[id: 178023, obj_id: 177975 object_size 43550KB sst_size 42283KB stale_ratio 0]"]

@zwang28
Contributor

zwang28 commented Dec 31, 2024

Hi @marceloneppel, we've reached out to you on Slack for quicker communication. Please check it when you have a moment.

@maingoh

maingoh commented Jan 7, 2025

Hello, it looks like I am affected by the same issue. All components are looking for some hummock files that come back as NotFound (on GCS).
I have played around with the operator, updating the config, sometimes deleting the pods manually or doing a rollout on the statefulsets. It may have stored the object id in the table, but abruptly stopped the upload of the file to GCS (atomicity issue?).

If I delete the corresponding rows in rw_hummock_sstables, will it recover? Will the data be lost forever?

@zwang28
Contributor

zwang28 commented Jan 8, 2025

Hi @maingoh, we've reached out to you on Slack for quicker communication. Please check it when you have a moment.

@maingoh

maingoh commented Jan 10, 2025

It seems my issue was a bit different from the original one, but I will still share my resolution in case it happens to someone else.

As I was playing with the RisingWave CRD, I wanted to make sure the configmap I was using was taken into account. But my cluster had already been initialized once, so I cleared the system_parameter table in PG meta. When it was repopulated from the configmap, the param use_new_object_prefix_strategy switched from true to false (even though it wasn't specified in the configmap). So RW was looking for blobs in hummock/57792807.data instead of hummock/221/57792807.data; the file itself was not missing. To fix it:

  • Stop the cluster
  • Run UPDATE system_parameter SET value=true WHERE name='use_new_object_prefix_strategy'; in PG meta
  • Start the cluster
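To tell this case apart from a file that is really missing, it can help to compare the path in the 404 log with the paths that actually exist in the bucket. The sketch below only parses path strings like the ones quoted in this thread; `ObjectPath`, `parse_object_path`, and `differs_only_by_prefix` are hypothetical helpers, and it does not reproduce how RisingWave derives the numeric prefix.

```python
import re
from typing import NamedTuple, Optional

# Matches ".../<prefix>/<object_id>.data" and ".../<object_id>.data".
_OBJECT_PATH = re.compile(r"(?:^|/)(?:(?P<prefix>\d+)/)?(?P<object_id>\d+)\.data$")


class ObjectPath(NamedTuple):
    object_id: int
    prefix: Optional[str]  # None when the path has no numeric prefix directory


def parse_object_path(path: str) -> Optional[ObjectPath]:
    """Extract the object id and optional numeric prefix from an object path."""
    match = _OBJECT_PATH.search(path)
    if match is None:
        return None
    return ObjectPath(int(match.group("object_id")), match.group("prefix"))


def differs_only_by_prefix(requested: str, existing: str) -> bool:
    """True when both paths name the same object id under different layouts."""
    a, b = parse_object_path(requested), parse_object_path(existing)
    if a is None or b is None:
        return False
    return a.object_id == b.object_id and a.prefix != b.prefix


if __name__ == "__main__":
    print(differs_only_by_prefix("hummock/57792807.data",
                                 "hummock/221/57792807.data"))
```

If the requested and existing paths differ only by the prefix directory, the object is present and the mismatch is more likely a configuration problem such as the use_new_object_prefix_strategy change described above.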

Thank you @zwang28 for the debugging!

@zwang28
Contributor

zwang28 commented Jan 13, 2025

For anyone experiencing the Hummock error: Foyer error: ObjectStore failed with IO error: NotFound in v2.1.0 or v2.1.1,

  • Please upgrade to v2.1.2, which includes the fix.
  • You may also need to recreate some of the streaming jobs to recover the cluster. You can find me on Slack for help.
