Panic on small *.parquet file #848

Closed
krumpira opened this issue Jul 16, 2024 · 1 comment · Fixed by #849
Labels: bug (Something isn't working), server stability

Comments

@krumpira

Hello kind people of Parseable :)
I have hit a panic on version 1.3.0 of Parseable, similar to #830, which you already fixed in v1.3.0.
This time the trigger was a small (<10 B) Parquet file.

Error and stack backtrace:

thread '<unnamed>' panicked at server/src/storage/object_storage.rs:490:85:
called `Result::unwrap()` on an `Err` value: Parquet error: Invalid Parquet file. Size is smaller than footer

Stack backtrace:
   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
   1: parseable::catalog::manifest::create_from_parquet_file
   2: parseable::storage::object_storage::ObjectStorage::sync::{{closure}}
   3: parseable::sync::object_store_sync::{{closure}}::{{closure}}::{{closure}}::{{closure}}::{{closure}}
   4: <clokwerk::async_scheduler::AsyncSchedulerFuture as core::future::future::Future>::poll
   5: <tokio::task::local::RunUntil<T> as core::future::future::Future>::poll
   6: tokio::runtime::runtime::Runtime::block_on
   7: std::panicking::try
   8: std::sys_common::backtrace::__rust_begin_short_backtrace
   9: core::ops::function::FnOnce::call_once{{vtable.shim}}
  10: std::sys::pal::unix::thread::Thread::new::thread_start
  11: <unknown>
  12: clone
stack backtrace:
   0:     0x563989d72fd6 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9cca0343d66d16a8
   1:     0x563989da0c40 - core::fmt::write::h4311bce0ee536615
   2:     0x563989d6fa7f - std::io::Write::write_fmt::h0685c51539d0a0cd
   3:     0x563989d72db4 - std::sys_common::backtrace::print::h2fb8f70628a241ed
   4:     0x563989d74637 - std::panicking::default_hook::{{closure}}::h05093fe2e3ef454d
   5:     0x563989d74399 - std::panicking::default_hook::h5ac38aa38e0086d2
   6:     0x563989d74ac8 - std::panicking::rust_panic_with_hook::hed79743dc8b4b969
   7:     0x563989d749a2 - std::panicking::begin_panic_handler::{{closure}}::ha437b5d58f431abf
   8:     0x563989d734d6 - std::sys_common::backtrace::__rust_end_short_backtrace::hd98e82d5b39ec859
   9:     0x563989d746f4 - rust_begin_unwind
  10:     0x563986fd3945 - core::panicking::panic_fmt::hc69c4d258fe11477
  11:     0x563986fd3e93 - core::result::unwrap_failed::hff299ec748d62aab
  12:     0x563987819a8c - parseable::storage::object_storage::ObjectStorage::sync::{{closure}}::hc3970bafa8610334
  13:     0x56398738d577 - parseable::sync::object_store_sync::{{closure}}::{{closure}}::{{closure}}::{{closure}}::{{closure}}::hb173344c807ee3d2
  14:     0x563987a9bcc5 - <clokwerk::async_scheduler::AsyncSchedulerFuture as core::future::future::Future>::poll::h93925fcc6b55d1ab
  15:     0x563987539cd3 - <tokio::task::local::RunUntil<T> as core::future::future::Future>::poll::h058dcc70a0d46f4b
  16:     0x563987356ba3 - tokio::runtime::runtime::Runtime::block_on::hc5935087007a9cff
  17:     0x563987789c75 - std::panicking::try::h41cef33c8949afdf
  18:     0x56398760ba24 - std::sys_common::backtrace::__rust_begin_short_backtrace::h7a3436071f169cf5
  19:     0x56398761020c - core::ops::function::FnOnce::call_once{{vtable.shim}}::h9ec81a60cd0ed021
  20:     0x563989d788d5 - std::sys::pal::unix::thread::Thread::new::thread_start::h40e6fd3f8ce15a14
  21:     0x7ff4d6514134 - <unknown>
  22:     0x7ff4d6593a40 - clone
  23:                0x0 - <unknown>

I worked around the issue by finding the small file and removing it:

$ find /path/to/staging/ -type f -size -10c
/path/to/staging/streamname/date=2024-07-15.hour=12.minute=40.f.q.d.n.data.bIPhuanTft8RVUa.parquet
$ rm -f /path/to/staging/streamname/date=2024-07-15.hour=12.minute=40.f.q.d.n.data.bIPhuanTft8RVUa.parquet
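For anyone who wants the same check without `find`, here is a minimal Rust sketch that walks a staging directory and lists undersized `.parquet` files without deleting anything. The function name `small_parquet_files` and the 10-byte threshold are assumptions for illustration (the threshold mirrors `-size -10c` above); this is not code from Parseable.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect `.parquet` files under `dir` whose size is below
/// `max_bytes`. Hypothetical helper: roughly `find <dir> -type f -size -<N>c`,
/// with an extra filter on the file extension.
fn small_parquet_files(dir: &Path, max_bytes: u64) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();
        // DirEntry::metadata does not follow symlinks, like `find -type f`.
        let meta = entry.metadata()?;
        if meta.is_dir() {
            // Descend into per-stream subdirectories.
            found.extend(small_parquet_files(&path, max_bytes)?);
        } else if meta.is_file()
            && path.extension().map_or(false, |ext| ext == "parquet")
            && meta.len() < max_bytes
        {
            found.push(path);
        }
    }
    Ok(found)
}
```

Calling `small_parquet_files(Path::new("/path/to/staging"), 10)` returns the candidate paths, so they can be inspected before anything is removed.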
nitisht added the bug label on Jul 16, 2024
@nitisht (Member) commented Jul 16, 2024

Thanks for reporting @krumpira - we'll take a look shortly

nikhilsinhaparseable added a commit to nikhilsinhaparseable/parseable that referenced this issue Jul 16, 2024
delete invalid parquets where file size is less than the length of the parquet footer

Fixes: parseablehq#848
nitisht pushed a commit that referenced this issue Jul 16, 2024
delete invalid parquets where file size is less than the length of the parquet footer

Fixes: #848
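The idea in the commit message can be sketched roughly as below. This is not the actual patch from #849, only an illustration under stated assumptions: a Parquet file ends with a 4-byte footer length followed by the 4-byte magic `PAR1`, so a file shorter than 8 bytes cannot hold a footer. The names `FOOTER_SIZE`, `Precheck`, and `remove_if_smaller_than_footer` are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Trailing bytes of a Parquet file: 4-byte footer length + 4-byte magic "PAR1".
/// (Assumed constant for this sketch.)
const FOOTER_SIZE: u64 = 8;

/// Outcome of the size pre-check (hypothetical type).
#[derive(Debug, PartialEq)]
enum Precheck {
    /// The file is large enough to contain a footer; continue with normal parsing.
    Keep,
    /// The file was too small to be valid Parquet and has been deleted.
    Removed,
}

/// Delete `path` if it is too small to contain a Parquet footer.
/// Errors are returned to the caller rather than unwrapped, so a bad file
/// cannot panic the calling task.
fn remove_if_smaller_than_footer(path: &Path) -> io::Result<Precheck> {
    let len = fs::metadata(path)?.len();
    if len < FOOTER_SIZE {
        fs::remove_file(path)?;
        return Ok(Precheck::Removed);
    }
    Ok(Precheck::Keep)
}
```

A caller would run this before reading the file's metadata and skip the file when it gets `Precheck::Removed`. Passing the size check does not prove the file is valid Parquet, so the later parsing step should still handle its `Result` instead of calling `unwrap()`.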