Skip to content

Conversation

@Vishwanatha-HD
Copy link

@Vishwanatha-HD Vishwanatha-HD commented Nov 21, 2025

Rationale for this change

This PR is intended to enable Parquet DB support on Big-endian (s390x) systems. The fix in this PR fixes the Statistics logic.

What changes are included in this PR?

The fix includes changes to following file:
cpp/src/parquet/statistics.cc

Are these changes tested?

Yes. The changes are tested on s390x arch to make sure things are working fine. The fix is also tested on x86 arch, to make sure there is no new regression introduced.

Are there any user-facing changes?

No

GitHub main Issue link: #48151

@github-actions
Copy link

⚠️ GitHub issue #48206 has been automatically assigned in GitHub to PR creator.

Comment on lines +964 to +969
// Fallback: use encoder for other types
auto encoder = MakeTypedEncoder<DType>(Encoding::PLAIN, false, descr_, pool_);
encoder->Put(&src, 1);
auto buffer = encoder->FlushValues();
dst->assign(reinterpret_cast<const char*>(buffer->data()),
static_cast<size_t>(buffer->size()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reuse the implementation in ARROW_LITTLE_ENDIAN for this?

#if !ARROW_LITTLE_ENDIAN
  if constexprt (...) {
    ...
  } else if ... {
    ...
  }
#endif
  auto encoder = MakeTypedEncoder<DType>(Encoding::PLAIN, false, descr_, pool_);
  encoder->Put(&src, 1);
  auto buffer = encoder->FlushValues();
  auto ptr = reinterpret_cast<const char*>(buffer->data());
  dst->assign(ptr, static_cast<size_t>(buffer->size()));

Comment on lines +937 to +963
if constexpr (std::is_same_v<DType, Int32Type>) {
uint32_t u;
std::memcpy(&u, &src, sizeof(u));
u = ::arrow::bit_util::ToLittleEndian(u);
dst->assign(reinterpret_cast<const char*>(&u), sizeof(u));
return;
} else if constexpr (std::is_same_v<DType, Int64Type>) {
uint64_t u;
std::memcpy(&u, &src, sizeof(u));
u = ::arrow::bit_util::ToLittleEndian(u);
dst->assign(reinterpret_cast<const char*>(&u), sizeof(u));
return;
} else if constexpr (std::is_same_v<DType, FloatType>) {
uint32_t u;
static_assert(sizeof(u) == sizeof(float), "size");
std::memcpy(&u, &src, sizeof(u));
u = ::arrow::bit_util::ToLittleEndian(u);
dst->assign(reinterpret_cast<const char*>(&u), sizeof(u));
return;
} else if constexpr (std::is_same_v<DType, DoubleType>) {
uint64_t u;
static_assert(sizeof(u) == sizeof(double), "size");
std::memcpy(&u, &src, sizeof(u));
u = ::arrow::bit_util::ToLittleEndian(u);
dst->assign(reinterpret_cast<const char*>(&u), sizeof(u));
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this in XXXEncoder::Put() instead of here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants