[C++][Statistics] How to distinguish "utf8" and "binary" in arrow::ArrayStatistics?
#44579
Comments
kou added a commit to kou/arrow that referenced this issue (Nov 6, 2024)
Why would we want to distinguish them?
I wanted this when I create an
kou added a commit to kou/arrow that referenced this issue (Dec 21, 2024)
kou added a commit that referenced this issue (Dec 31, 2024)
…pe (#45094)

### Rationale for this change

`arrow::ArrayStatistics` uses raw C++ types such as `int64_t` and `std::string` for min/max types. We need to convert raw C++ types to Arrow types when we use `arrow::ArrayStatistics` for generating statistics array. (GH-45038)

We can't map `std::string` to an Arrow type. Because it may be `arrow::binary()`, `arrow::utf8()` or something.

### What changes are included in this PR?

Use `arrow::DataType` information of associated array when we convert `arrow::ArrayStatistics`'s min/max raw C++ types to Arrow types.

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.

* GitHub Issue: #44579

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
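The approach described in the commit message, using the associated array's data type to decide what a `std::string` min/max means, might look roughly like the sketch below. This is not the code from the pull request: `SketchTypeId`, `SketchValue`, and `ResolveTypeName` are hypothetical stand-ins for `arrow::DataType`, the statistics value type, and the conversion step, and returning `"unknown"` for a mismatched type is a choice made only for this illustration.

```cpp
#include <cstdint>
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical stand-in for the data type of the associated array.
enum class SketchTypeId { kInt64, kDouble, kUtf8, kBinary };

// Hypothetical stand-in for the raw C++ min/max value type.
using SketchValue = std::variant<bool, int64_t, uint64_t, double, std::string>;

// Maps a raw min/max value to an Arrow-style type name. Numeric and bool
// alternatives are unambiguous; a std::string needs the array's type to
// choose between "utf8" and "binary".
std::string ResolveTypeName(const SketchValue& value, SketchTypeId array_type) {
  return std::visit(
      [array_type](const auto& v) -> std::string {
        using T = std::decay_t<decltype(v)>;
        if constexpr (std::is_same_v<T, bool>) {
          return "bool";
        } else if constexpr (std::is_same_v<T, int64_t>) {
          return "int64";
        } else if constexpr (std::is_same_v<T, uint64_t>) {
          return "uint64";
        } else if constexpr (std::is_same_v<T, double>) {
          return "float64";
        } else {
          // std::string: consult the array's type (assumption of this sketch).
          if (array_type == SketchTypeId::kUtf8) return "utf8";
          if (array_type == SketchTypeId::kBinary) return "binary";
          return "unknown";
        }
      },
      value);
}
```

The point of the design is that no extra `{min,max}_type` field is needed on the statistics themselves, because the array already carries the type.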
Issue resolved by pull request 45094
Describe the enhancement requested
We can represent "utf8" and "binary" min/max values with `arrow::ArrayStatistics`:
arrow/cpp/src/arrow/array/statistics.h
Lines 29 to 67 in 4c36f12
But we can't distinguish them because we use `std::string` in `ValueType` for both of them.
How can we distinguish them? Should we add `arrow::ArrayStatistics::{min,max}_type`?
Component(s)
C++
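To make the ambiguity concrete, here is a minimal sketch using a simplified stand-in for the statistics struct. `SketchStatistics` and the helper functions are hypothetical names for this illustration, not the contents of `statistics.h`; the sketch only assumes that min/max are stored in a variant with a single `std::string` alternative, as the description says.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <variant>

// Simplified stand-in for arrow::ArrayStatistics (hypothetical, not the real header).
struct SketchStatistics {
  using ValueType = std::variant<bool, int64_t, uint64_t, double, std::string>;
  std::optional<ValueType> min;
  std::optional<ValueType> max;
};

// Statistics as they might be produced for a utf8 array.
SketchStatistics MakeUtf8Stats() {
  SketchStatistics s;
  s.min = std::string("apple");
  s.max = std::string("zebra");
  return s;
}

// Statistics as they might be produced for a binary array (arbitrary bytes).
SketchStatistics MakeBinaryStats() {
  SketchStatistics s;
  s.min = std::string("\x00\x01", 2);
  s.max = std::string("\xff\xfe", 2);
  return s;
}

// Returns true when both statistics hold the same variant alternative for min.
bool SameMinAlternative(const SketchStatistics& a, const SketchStatistics& b) {
  return a.min.has_value() && b.min.has_value() &&
         a.min->index() == b.min->index();
}
```

Both helpers end up storing the `std::string` alternative, so inspecting the variant alone cannot tell whether the values came from a utf8 or a binary array.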