Skip to content

Conversation

pitrou
Copy link
Member

@pitrou pitrou commented Sep 24, 2025

Thanks for opening a pull request!

If this is your first pull request you can find detailed information on how to contribute here:

Please remove this line and the above text before creating your pull request.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

This PR includes breaking changes to public APIs. (If there are any breaking changes to public APIs, please explain which changes are breaking. If not, you can remove this.)

This PR contains a "Critical Fix". (If the changes fix either (a) a security vulnerability, (b) a bug that caused incorrect or invalid data to be produced, or (c) a bug that causes a crash (even when the API contract is upheld), please provide explanation. If not, you can remove this.)

@pitrou
Copy link
Member Author

pitrou commented Sep 24, 2025

@ursabot please benchmark

@voltrondatabot
Copy link

Benchmark runs are scheduled for commit 41f441a. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete.

@github-actions github-actions bot added the awaiting review Awaiting review label Sep 24, 2025
@pitrou
Copy link
Member Author

pitrou commented Sep 24, 2025

@AntoinePrv This is trying to run the benchmarks with IPO to see if it makes a significant difference.

@pitrou
Copy link
Member Author

pitrou commented Sep 24, 2025

@pitrou
Copy link
Member Author

pitrou commented Sep 24, 2025

@ursabot please benchmark lang=C++

@voltrondatabot
Copy link

Benchmark runs are scheduled for commit bf2698a. Watch https://buildkite.com/apache-arrow and https://conbench.ursa.dev for updates. A comment will be posted here when the runs are complete.

@pitrou
Copy link
Member Author

pitrou commented Sep 24, 2025

Note: a caveat of the conbench benchmarks results is that they all use the conda-forge toolchain, including the same compiler and binutils versions (gcc 14.3.0).

@pitrou
Copy link
Member Author

pitrou commented Sep 24, 2025

I'm trying this locally, also using conda-forge and therefore the same toolchain. One interesting aspect is that IPO seems to decrease total library size, perhaps by removing unused private functions and data:

  • before:
$ size -G /build/build-release/relwithdebinfo/*.so
      text       data        bss      total filename
   1231288     436476       1080    1668844 /build/build-release/relwithdebinfo/libarrow_acero.so
  11719000    2076782      46232   13842014 /build/build-release/relwithdebinfo/libarrow_compute.so
   1029085     613682       3496    1646263 /build/build-release/relwithdebinfo/libarrow_dataset.so
  11368776    3895932    2181849   17446557 /build/build-release/relwithdebinfo/libarrow.so
   1010359     500136       1992    1512487 /build/build-release/relwithdebinfo/libarrow_testing.so
   2675915    1510609       6328    4192852 /build/build-release/relwithdebinfo/libparquet.so
  • after:
$ size -G /build/build-release/relwithdebinfo/*.so
      text       data        bss      total filename
    971243     380629       1552    1353424 /build/build-release/relwithdebinfo/libarrow_acero.so
  11764472    1924230      46440   13735142 /build/build-release/relwithdebinfo/libarrow_compute.so
    911885     500747       4328    1416960 /build/build-release/relwithdebinfo/libarrow_dataset.so
  10306362    3473709    2183392   15963463 /build/build-release/relwithdebinfo/libarrow.so
    954794     439139       3968    1397901 /build/build-release/relwithdebinfo/libarrow_testing.so
   2446829    1344558       4128    3795515 /build/build-release/relwithdebinfo/libparquet.so

@pitrou
Copy link
Member Author

pitrou commented Sep 24, 2025

I ran two sets of benchmark locally and the results are roughly similar to those on the conbench machines:

Copy link
Contributor

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very cool. Even if the runtime benchmarks are the same, this seems pretty easy to implement and results in good library size savings

@github-actions github-actions bot added awaiting committer review Awaiting committer review and removed awaiting review Awaiting review labels Sep 24, 2025
@WillAyd
Copy link
Contributor

WillAyd commented Sep 24, 2025

I see that LLVM also has a "thin" LTO type that may offer improvements over the default LTO

https://clang.llvm.org/docs/ThinLTO.html
https://blog.llvm.org/2016/06/thinlto-scalable-and-incremental-lto.html

Do we know if this enabled by CMake through this option, or would it be an additional option to benchmark?

@pitrou
Copy link
Member Author

pitrou commented Sep 24, 2025

I have no idea, but the benchmarks here use gcc.

@pitrou
Copy link
Member Author

pitrou commented Sep 24, 2025

Also, I was surprised that, at least with gcc 14.3.0, build times do not seem to increase significantly with CMAKE_INTERPROCEDURAL_OPTIMIZATION enabled. So perhaps gcc is already using some equivalent of clang's "thin LTO".

Copy link

Thanks for your patience. Conbench analyzed the 3 benchmarking runs that have been run so far on PR commit 41f441a.

There were 31 benchmark results indicating a performance regression:

The full Conbench report has more details.

Copy link

Thanks for your patience. Conbench analyzed the 3 benchmarking runs that have been run so far on PR commit bf2698a.

There were 33 benchmark results indicating a performance regression:

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting committer review Awaiting committer review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants