Deprecate RowGroup.file_offset #394

Open
asfimport opened this issue Aug 30, 2021 · 5 comments

asfimport commented Aug 30, 2021

Due to PARQUET-2078, RowGroup.file_offset is not reliable.

This field is also calculated incorrectly in the open-source C++ Parquet implementation (PARQUET-2089).

Reporter: Gabor Szadovszky / @gszadovszky
Assignee: Gidon Gershinsky / @ggershinsky

Note: This issue was originally created as PARQUET-2080. Please see the migration documentation for further details.

Gabor Szadovszky / @gszadovszky:
@ggershinsky, although the original topic of this Jira is invalid, we still need to add proper comments to RowGroup.file_offset that describe the situation in PARQUET-2078 and help implementations handle the potentially wrong value. Would you like to handle this?

Gidon Gershinsky / @ggershinsky:
@gszadovszky yes, I'll take it. There might be a different solution (also format-related) that bypasses the need to calculate such a parameter in any implementation, so that it can be fully deprecated. I'll get back with the details and we'll discuss the trade-offs.

Gidon Gershinsky / @ggershinsky:
Hi @gszadovszky, I've prepared a short writeup on this alternative solution, with a discussion of the trade-offs. After writing it, my feeling is that the trade-off is not in favor of this alternative option, but here it goes, just to cover all bases. I'd appreciate your opinion on this.

Gabor Szadovszky / @gszadovszky:
@ggershinsky, could you make the doc available for comments?

Gidon Gershinsky / @ggershinsky:
Oh, sorry, done.
