Create ParquetReader for Mapping Parquet Bytes to Java Maps #26

Open
criccomini opened this issue May 25, 2023 · 0 comments

Issue Description:

Proposal:
Add a ParquetReader to Twister that maps Parquet bytes to Java Maps, similar to the existing AvroReader and ProtoReader. The reader gives callers a convenient way to read Parquet data as key-value pairs stored in Maps.

Expected Behavior:
The ParquetReader should let developers read Parquet bytes and map the data to Java Maps. Each Parquet row is represented as one Map whose keys are the column names and whose values are the corresponding column values.
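
To make the expected row shape concrete, here is a minimal sketch of the row-to-Map step only. It does not decode Parquet bytes; it assumes the column names and the decoded values for one row are already available. `RowMaps` and `toMap` are hypothetical names used for illustration and are not part of Twister.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Twister: shows the row-to-Map shape only.
final class RowMaps {
    private RowMaps() {}

    /**
     * Builds one row Map from column names and already-decoded values.
     * values.get(i) belongs to columnNames.get(i); a null entry stands for
     * a null in an optional column.
     */
    static Map<String, Object> toMap(List<String> columnNames, List<Object> values) {
        if (columnNames.size() != values.size()) {
            throw new IllegalArgumentException(
                "expected " + columnNames.size() + " values but got " + values.size());
        }
        // LinkedHashMap keeps the column order and, unlike Map.of, accepts null values.
        Map<String, Object> row = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.size(); i++) {
            row.put(columnNames.get(i), values.get(i));
        }
        return Collections.unmodifiableMap(row);
    }
}
```

For example, `RowMaps.toMap(Arrays.asList("id", "name", "email"), Arrays.asList(42L, "Ada", null))` yields a Map that contains the key `email` with a null value, so callers can tell a null column apart from a missing one with `containsKey`.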

Benefits:

  1. Simplified Parquet data processing.
  2. Seamless integration with existing Java data structures.
  3. Enhanced performance and efficiency.

Implementation Considerations:
Utilize existing Parquet libraries, such as Apache Parquet, to handle low-level parsing and decoding. Support various data types defined in the Parquet schema and handle nullable fields appropriately.
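
As a rough illustration of the type and nullability handling described above, the sketch below converts one decoded value according to its column definition. All names here (`PhysicalType`, `Column`, `ValueConverter`) are hypothetical stand-ins; a real implementation would take the schema from the Parquet library instead. The enum lists only a subset of Parquet's physical types, and treating BYTE_ARRAY as UTF-8 text is an assumption that holds only for string columns.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical subset of Parquet physical types, for illustration only.
enum PhysicalType { BOOLEAN, INT32, INT64, FLOAT, DOUBLE, BYTE_ARRAY }

// Hypothetical column definition: name, physical type, and whether nulls are allowed.
final class Column {
    final String name;
    final PhysicalType type;
    final boolean optional;

    Column(String name, PhysicalType type, boolean optional) {
        this.name = name;
        this.type = type;
        this.optional = optional;
    }
}

final class ValueConverter {
    private ValueConverter() {}

    /** Converts a decoded value to the Java type that matches the column definition. */
    static Object convert(Column column, Object raw) {
        if (raw == null) {
            // Only optional columns may be null; a null in a required column is an error.
            if (!column.optional) {
                throw new IllegalStateException(
                    "required column '" + column.name + "' has no value");
            }
            return null;
        }
        switch (column.type) {
            case BOOLEAN:
                return (Boolean) raw;
            case INT32:
                return ((Number) raw).intValue();
            case INT64:
                return ((Number) raw).longValue();
            case FLOAT:
                return ((Number) raw).floatValue();
            case DOUBLE:
                return ((Number) raw).doubleValue();
            case BYTE_ARRAY:
                // Assumption: the column holds UTF-8 text; binary columns would stay byte[].
                return new String((byte[]) raw, StandardCharsets.UTF_8);
            default:
                throw new IllegalArgumentException("unsupported type: " + column.type);
        }
    }
}
```

Failing fast on a null in a required column is one possible design choice; the reader could instead trust the Parquet library to enforce the schema and pass values through unchanged.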

Contributor Resources:
Refer to the Twister project's contribution guidelines for instructions on setting up the development environment and submitting a pull request.

Environment:

  • Twister version: [Specify the version or commit hash]
  • Operating System: [Specify the OS]
  • Java version: [Specify the Java version]
  • Additional environment details: [Provide any relevant details about the environment]