Skip to content

Conversation

@rambleraptor
Copy link
Contributor

Which issue does this PR close?

Part of #7806

Rationale for this change

The byte_range() method on ColumnChunkMetaData previously panicked if the column start or length were negative. This is undesirable in a library, as it can lead to unrecoverable errors in consumer applications.

This change modifies byte_range() to return a Result<(u64, u64)> instead, allowing callers to handle invalid byte ranges gracefully.

What changes are included in this PR?

  • The byte_range() method in parquet/src/file/metadata/mod.rs now returns a Result.
  • Call sites of byte_range() have been updated to handle the Result using the ? operator or unwrap() in tests. This affects:
    • parquet/examples/read_with_rowgroup.rs
    • parquet/src/arrow/async_reader/mod.rs
    • parquet/src/file/serialized_reader.rs

Are these changes tested?

The changes are covered by existing tests. The modifications in the test files (parquet/src/arrow/async_reader/mod.rs) now use unwrap() on the byte_range() call, which maintains the existing test behavior of panicking on error.

Are there any user-facing changes?

Yes, the signature of parquet::file::metadata::ColumnChunkMetaData::byte_range() has changed from -> (u64, u64) to -> Result<(u64, u64)>. This is a breaking change for users who call this method directly.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Oct 22, 2025
@etseidl etseidl added the api-change Changes to the arrow API label Oct 24, 2025
@etseidl
Copy link
Contributor

etseidl commented Oct 24, 2025

I think this one might fall under the "move the error to validation" category (see #6737 (comment)), but the current ColumnMetaDataBuilder doesn't provide much opportunity for that either. The public builders should be changed to do validation in the setters.

Is this one the result of some testing or just a search for assert? If the former, then rather than this breaking change, perhaps add a try_byte_range or some such that returns Result and add a note that the existing byte_range may panic. If the latter, then we can probably let this alone for now and instead focus on fixing the builders for the 58.0.0 release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api-change Changes to the arrow API parquet Changes to the parquet crate

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants