Add usage documentation for the Java library #2914

Open
asfimport opened this issue Jun 5, 2024 · 0 comments

Comments

@asfimport
Collaborator

The Java Parquet library has no usage documentation beyond the sparse information in the README. All I could find were a few old (roughly 10-year-old) third-party tutorials scattered around the internet that use the hadoop module. I spent a work day sifting through the API docs and searching online, trying to piece something together. Ultimately, I gave up on working with Parquet files in Java because there are alternative file formats that are better documented, and I felt that using parquet-mr would be a huge hassle to maintain in the future. The library seems reasonably well maintained and comprehensive, but the barrier to getting started is huge, and I think that turns away a lot of developers like me.

I kindly request that usage documentation be written to cover all the major aspects of the library and, for the more nitty-gritty use cases, to point to the API classes and methods worth looking at further.

I may be misunderstanding the purpose of this library. If so, is there a different Java Parquet library that is recommended for higher-level Parquet file I/O?

Reporter: Isaac Nygaard

Note: This issue was originally created as PARQUET-2490. Please see the migration documentation for further details.
