Add support for Decimal32 and Decimal64 #7061
This change is rather large. It would in principle be possible to first submit a separate PR with a small number of refactoring changes before the PR that adds the new types. I think the context for the refactoring changes is useful, but I would be willing to do the split if there's demand for it.
I think it will be necessary to break this into smaller incremental pieces to get this in. Not just the refactoring, but also the functionality itself: the addition to `DataType`, for example, could be its own PR. I appreciate this is more effort on your end, but we're very review constrained, and a 3000-line diff is simply not tractable.
Which issue does this PR close?
Closes #6661.
Rationale for this change
Decimal32 and Decimal64 were added to Arrow recently; this implements support in arrow-rs.
What changes are included in this PR?
Code and tests for the new types are included.
Are there any user-facing changes?
New types `Decimal32Array`, `Decimal64Array`, `Decimal32Type`, and `Decimal64Type` are added. New values `Decimal32` and `Decimal64` have been added to the `DataType` enum. Consumers may need to update their `match`es accordingly.
32-bit and 64-bit decimal values from Parquet files are still being returned as `Decimal128` by default unless the consumer specifically asks for the narrower type.
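To illustrate the two user-facing points above, here is a minimal sketch rather than code from this PR. It uses a stand-in `DataType` enum so that it compiles without the `arrow` crate, and it assumes the new variants carry a `(precision, scale)` payload like the existing `Decimal128`. The helpers `decimal_byte_width` and `parquet_decimal_type` are hypothetical names invented for this example. In particular, modelling the consumer's request as an `Option<&DataType>` hint is an assumption about how "asking for the narrower type" might look.

```rust
// Stand-in for a small subset of arrow-rs's `DataType` enum, so this sketch
// compiles on its own. Payload is (precision, scale).
// Assumption: `Decimal32` and `Decimal64` carry the same payload as `Decimal128`.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    Utf8,
    Decimal32(u8, i8),
    Decimal64(u8, i8),
    Decimal128(u8, i8),
    Decimal256(u8, i8),
}

/// Hypothetical helper: width in bytes of one decimal value,
/// or `None` for non-decimal types.
pub fn decimal_byte_width(data_type: &DataType) -> Option<usize> {
    // This match has no wildcard arm, so adding variants to the enum
    // forces the two new arms below to be written.
    match data_type {
        DataType::Decimal32(_, _) => Some(4),
        DataType::Decimal64(_, _) => Some(8),
        DataType::Decimal128(_, _) => Some(16),
        DataType::Decimal256(_, _) => Some(32),
        DataType::Int32 | DataType::Utf8 => None,
    }
}

/// Hypothetical model of the Parquet behaviour described above: the result
/// is `Decimal128` unless the caller explicitly requests a narrower type.
pub fn parquet_decimal_type(
    precision: u8,
    scale: i8,
    requested: Option<&DataType>,
) -> DataType {
    match requested {
        Some(DataType::Decimal32(p, s)) => DataType::Decimal32(*p, *s),
        Some(DataType::Decimal64(p, s)) => DataType::Decimal64(*p, *s),
        // No hint, or a hint that is not a narrow decimal: keep the default.
        _ => DataType::Decimal128(precision, scale),
    }
}
```

A `match` that lists every variant, as in `decimal_byte_width`, stops compiling when new variants appear, which is the kind of update consumers may need to make. A `match` with a `_` arm keeps compiling but silently treats the new types as "other".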