Two questions:
1: Will it automatically decompress data when reading into an Arrow record batch? Do we need to introduce decompression logic ourselves?
2: Will it handle all the compression types that Fluss supports, like ltz?
@luoyuxia
1 → Yes, Arrow automatically decompresses data when reading into a RecordBatch. From the Arrow IPC docs: https://arrow.apache.org/rust/arrow_ipc/index.html
2 → Arrow IPC supports LZ4 and ZSTD.
In the Fluss docs, I've also seen support for LZ4 and ZSTD: table.log.arrow.compression.type can be NONE, LZ4_FRAME, or ZSTD (default: ZSTD).
Decompression happens at a lower layer than the Fluss logical types and is transparent for the most part, so it can handle all the types that Arrow supports. For non-standard types like ltz (I assume you mean TIMESTAMP_LTZ), this must be handled in the Rust binding code, since you need to parse the metadata stored in the Arrow timestamp, which carries the precision and the timezone. I already have another PR (not submitted) adding experimental support for ltz + timestamps.
I also found PR #53, which implements all types, including ltz timestamps. Since it is a more holistic PR than the one I was planning to submit, it's worth reviewing and merging that PR instead to add ltz support.
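To illustrate the configuration side of point 2, here is a minimal, std-only sketch of how the three documented values of table.log.arrow.compression.type could be modelled on the Rust side. The names ArrowCompression and compression_from_option are hypothetical and do not come from this PR or from the Fluss or Arrow crates, and the case-insensitive matching is an assumption on my part.

```rust
use std::str::FromStr;

/// Hypothetical mirror of the values documented for
/// `table.log.arrow.compression.type` (NONE, LZ4_FRAME, ZSTD).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArrowCompression {
    None,
    Lz4Frame,
    Zstd,
}

impl Default for ArrowCompression {
    /// ZSTD is the documented default when the option is not set.
    fn default() -> Self {
        ArrowCompression::Zstd
    }
}

impl FromStr for ArrowCompression {
    type Err = String;

    /// Parses the option value. Trimming and ignoring case are assumptions
    /// made for this sketch, not documented Fluss behaviour.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_uppercase().as_str() {
            "NONE" => Ok(ArrowCompression::None),
            "LZ4_FRAME" => Ok(ArrowCompression::Lz4Frame),
            "ZSTD" => Ok(ArrowCompression::Zstd),
            other => Err(format!("unsupported Arrow compression type: {other}")),
        }
    }
}

/// Resolves an optional raw option value, falling back to the default
/// (ZSTD) when the option is absent.
pub fn compression_from_option(value: Option<&str>) -> Result<ArrowCompression, String> {
    match value {
        Some(raw) => raw.parse(),
        None => Ok(ArrowCompression::default()),
    }
}
```

Since the Arrow IPC reader decompresses on its own, a type like this would mainly be useful for validating table options or for logging, not for driving the decompression itself.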
@binary-signal Thanks for the explanation. Makes sense to me.