[Packaging] Shrink wheel ~35 % via nvcc --compress-mode=size #1704

Open · wants to merge 1 commit into base: main


@trmanish commented Jul 10, 2025

What this PR does

  • Appends --compress-mode=size to CMAKE_CUDA_FLAGS for nvcc ≥ 12.4.
  • No runtime or API changes.
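
The version gate described above can be sketched as follows. This is an illustration in Python, not the PR's actual CMake change; the helper names `parse_nvcc_version` and `cuda_flags` are hypothetical, and the only details taken from the PR are the flag itself and the 12.4 threshold.

```python
import re

# Threshold stated in the PR description: the flag is only added for nvcc >= 12.4.
MIN_NVCC_FOR_COMPRESS = (12, 4)


def parse_nvcc_version(version_output: str) -> tuple[int, int]:
    """Extract (major, minor) from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no 'release X.Y' string found in nvcc output")
    return int(match.group(1)), int(match.group(2))


def cuda_flags(base_flags: list[str], nvcc_version: tuple[int, int]) -> list[str]:
    """Return a copy of base_flags, adding the compression flag when supported."""
    flags = list(base_flags)
    if nvcc_version >= MIN_NVCC_FOR_COMPRESS:
        flags.append("--compress-mode=size")
    return flags
```

Older toolchains get their flags back unchanged, which is the behavior the Compatibility section relies on.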

Impact

  • Wheel shrinks 69 MB → 45 MB (≈ 35 %).
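
As a quick check of the quoted percentage, using only the two sizes given above:

```python
before_mb = 69  # wheel size before the change, from the PR description
after_mb = 45   # wheel size after the change, from the PR description

reduction = (before_mb - after_mb) / before_mb
print(f"{reduction:.1%}")  # prints 34.8%
```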

Compatibility

  • Builds with nvcc < 12.4 are unchanged: the flag is gated by a version check.
  • Decompression adds only a few hundred ms to the first import of bitsandbytes.
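
One rough way to check the import-time claim is to time the import in a fresh interpreter, once with each wheel installed. The helper below is a hypothetical sketch, not part of the PR. It only measures wall-clock import time, so it may miss any decompression cost that is deferred until kernels are first loaded.

```python
import importlib
import time


def time_import(module_name: str) -> float:
    """Return the seconds spent importing module_name.

    Only meaningful in a fresh interpreter: a module that is already
    imported returns from the module cache almost instantly.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Assumes bitsandbytes is installed in the current environment.
    print(f"import took {time_import('bitsandbytes') * 1000:.0f} ms")
```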

@matthewdouglas
Member

Thanks, appreciate the suggestion! I have the same concern mentioned over on PyTorch regarding support for users with older drivers: pytorch/pytorch#157791 (comment)

Mainly, it seems this would require cu124+ users to have a 550+ driver, whereas we should currently still be compatible with driver version 525+.

So we will have to weigh that as a consideration.

@trmanish
Author

I believe an earlier comment on the original PR said it would not have that requirement:

pytorch/pytorch#157791 (comment)

But I believe the latest from PyTorch is as below:

pytorch/pytorch#157791 (comment)

However, my understanding (please correct me if I'm wrong) is that the only variant that would be built with --compress-mode=size is the cu124 wheel, and that wheel already implies a 550-series driver. Users on 525/535 stay on the cu122 / cu121 wheels, which this PR leaves untouched.

Options

  • Merge as-is – compression only affects the cu124 wheel, no compatibility regression for existing users.
  • Opt-in flag – guard it behind ENABLE_BNB_CUDA_COMPRESSION=1; default off.
  • Dual wheels – publish both bitsandbytes-cu124.whl and -cu124-slim.whl.
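
The opt-in option could look roughly like this. It is a sketch under assumptions: the variable name ENABLE_BNB_CUDA_COMPRESSION comes from the list above, but the helper `compression_enabled` and the idea of reading the variable from a Python build script are hypothetical.

```python
import os


def compression_enabled(environ=None) -> bool:
    """Return True only when the opt-in variable is set to exactly "1".

    Defaults to off, matching the "default off" behavior proposed above.
    """
    env = os.environ if environ is None else environ
    return env.get("ENABLE_BNB_CUDA_COMPRESSION", "0") == "1"
```

Passing the environment as a parameter keeps the check easy to test without modifying the real process environment.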
