Merged
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -34,5 +34,5 @@ members = ["crates/fluss", "crates/examples", "bindings/python"]
fluss = { version = "0.1.0", path = "./crates/fluss" }
tokio = { version = "1.44.2", features = ["full"] }
clap = { version = "4.5.37", features = ["derive"] }
-arrow = "57.0.0"
+arrow = { version = "57.0.0", features = ["ipc_compression"] }
Contributor

Two questions:
1. Will it automatically decompress data when reading into an Arrow record batch, or do we need to add decompression logic ourselves?
2. Will it handle all the compression types that Fluss supports, like ltz?

Contributor Author

@binary-signal binary-signal Nov 24, 2025

@luoyuxia
1 → Yes. With the `ipc_compression` feature enabled, Arrow automatically decompresses data when reading into a RecordBatch, so no extra decompression logic is needed.

From the Arrow IPC docs: https://arrow.apache.org/rust/arrow_ipc/index.html

The Arrow IPC format defines how to read/write RecordBatches to/from files or byte streams. It handles serialization and deserialization.

2 → Arrow IPC supports LZ4 and ZSTD.
In the Fluss docs, I've also seen support for LZ4 and ZSTD:

table.log.arrow.compression.type can be NONE, LZ4_FRAME, or ZSTD (default: ZSTD).
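To make the three documented values concrete, here is a minimal sketch of how a client could map the option string to a typed value, falling back to ZSTD when the option is unset. The names `ArrowCompression` and `resolve_compression` are hypothetical and are not taken from the Fluss Rust client, and the case-insensitive matching is an assumption.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the values the Fluss docs list for
/// `table.log.arrow.compression.type`. Not part of the Fluss Rust client.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum ArrowCompression {
    /// Corresponds to the documented value NONE.
    Uncompressed,
    /// Corresponds to the documented value LZ4_FRAME.
    Lz4Frame,
    /// Corresponds to the documented value ZSTD, which is the documented default.
    #[default]
    Zstd,
}

impl FromStr for ArrowCompression {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumption: surrounding whitespace and letter case are ignored.
        match s.trim().to_ascii_uppercase().as_str() {
            "NONE" => Ok(Self::Uncompressed),
            "LZ4_FRAME" => Ok(Self::Lz4Frame),
            "ZSTD" => Ok(Self::Zstd),
            other => Err(format!("unsupported Arrow compression type: {other}")),
        }
    }
}

/// Resolve the option value, using the default (ZSTD) when it is not set.
pub fn resolve_compression(value: Option<&str>) -> Result<ArrowCompression, String> {
    match value {
        Some(v) => v.parse(),
        None => Ok(ArrowCompression::default()),
    }
}
```

For example, `resolve_compression(None)` yields `ArrowCompression::Zstd`, while an unknown value such as `"GZIP"` is rejected with an error instead of being silently ignored.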

Decompression happens at a lower layer than the Fluss logical types and is mostly transparent, so it works for every type that Arrow supports. Non-standard types like ltz (I assume you mean TIMESTAMP_LTZ) must be handled in the Rust binding code, because you need to parse the metadata stored in the Arrow timestamp, which carries the precision and the time zone. I already have another PR (not yet submitted) that adds experimental support for ltz and timestamps.

I also found PR #53, which implements all types, including the ltz timestamp. Since it is a more holistic PR than the one I was planning to submit, it's worth reviewing and merging that PR instead to add ltz support.

Contributor

@binary-signal Thanks for the explanation. Makes sense to me.

chrono = { version = "0.4", features = ["clock", "std", "wasmbind"] }