Include FAPEC compressor support to Parquet? #3067

Open
PortellJ opened this issue Nov 18, 2024 · 1 comment

Comments

@PortellJ

Describe the enhancement requested

FAPEC is a high-performance data compression algorithm with many options, based on efficient entropy coding and including several pre-processing algorithms for time series, images, text, floats, etc.
It's already available for some formats, such as HDF5 and FITS. We're now investigating the possibility of including FAPEC as a new codec option for Parquet, which we think should provide better compression ratios than the currently available codecs, mainly for integers and floats/doubles. We're working on a proof of concept to evaluate how much it would actually improve them.
If the outcome shows a "significant enough" improvement, would it be possible to include this new option?
Note that FAPEC is a commercial product, so a valid license would be needed to generate Parquet files with it, and the compression library would have to be used in binary form for the appropriate platform. (Reading Parquet files compressed with FAPEC would always be free.) Perhaps this would be a blocking issue?

Component(s)

No response

@wgtmac
Member

wgtmac commented Nov 18, 2024

Thanks for creating an issue about a new compression codec. A similar discussion took place on [email protected]: https://lists.apache.org/thread/ht95wm8trfx2z4pq91t7170t2qjqg4yw. I think the replies there cover some general concerns about adding a new codec.
