[SYCL][Clang] Add support for device image compression #15124

uditagarwal97 · 2024-08-18T17:06:44Z

This PR adds support for device image compression for the old offloading model. I'll make another follow-up PR to extend support for the new offload model.

Design summary:

ZSTD (compression algo) ----> LLVMSupport (Interface) ----> clang-offload-wrapper (For compression)
 |
 ------------------------------------------------------------> SYCL RT (For decompression)

This PR introduces ZSTD (https://github.com/facebook/zstd) as a third-party dependency of DPC++. Similar to upstream LLVM, we expect users to have the zstd-dev package installed on their machine; we won't be building zstd from source.

How to use
To compress device images, add the --offload-compress CLI option to your clang invocation. Note that we compress device images only if their size exceeds a threshold, which is 512 bytes by default. By default, we use ZSTD level 10 for compression. ZSTD compression levels trade off (de)compression time against compression ratio, and the level can be changed with the --offload-compression-level=<int> CLI option.
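
A minimal sketch of the rule above, written in Python purely for illustration. The helpers `should_compress` and `build_compile_command` are hypothetical and not part of DPC++. Only the two CLI options and the 512-byte / level 10 defaults come from this description, `-fsycl` is the usual SYCL switch, and the sketch assumes the threshold is checked per image.

```python
from typing import List, Optional

DEFAULT_THRESHOLD_BYTES = 512  # default threshold quoted in the description
DEFAULT_ZSTD_LEVEL = 10        # default ZSTD level quoted in the description


def should_compress(image_size: int,
                    threshold: int = DEFAULT_THRESHOLD_BYTES) -> bool:
    """Hypothetical model of the size check: compress only above the threshold."""
    return image_size > threshold


def build_compile_command(source: str, output: str,
                          level: Optional[int] = None) -> List[str]:
    """Assemble a clang++ command line that requests device image compression."""
    cmd = ["clang++", "-fsycl", "--offload-compress"]
    if level is not None:
        # Override the default level only when the caller asks for it.
        cmd.append(f"--offload-compression-level={level}")
    cmd += [source, "-o", output]
    return cmd
```

For example, `build_compile_command("app.cpp", "app", level=3)` requests a lower level than the default, which should favor speed over compression ratio.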

uditagarwal97 · 2024-08-21T18:07:15Z

Some initial performance stats:

Dataset: https://github.com/aras-p/smol-v/tree/master/tests/spirv-dumps
Dataset size: 275 SPIR-V files

Conclusion:
Overall, for SPIR-V files < 50KB, decompression time is below 0.1ms, compression time is below 0.15ms, and the compression ratio is ~3 (the compressed image is about 1/3 of the original size).
For very small images (<512 bytes), I don't see much benefit from image compression.

Note: Most of the SPIR-V files in the dataset are <50KB. I'm working on extending the performance evaluation to larger workloads. Also, (de)compression performance will vary with the format of the file being compressed, so for AOT, where device images consist of target assembly, the performance stats might differ.
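
To make the "ratio ~3" figure concrete, here is a small worked example in Python. The helper names and the 48 KB image size are made up for illustration and are not measurements from the dataset.

```python
def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Ratio of original to compressed size; 3.0 means the output is 1/3 as large."""
    if compressed_size <= 0:
        raise ValueError("compressed_size must be positive")
    return original_size / compressed_size


def estimated_compressed_size(original_size: int, ratio: float = 3.0) -> int:
    """Estimate the compressed size in bytes for a given compression ratio."""
    return round(original_size / ratio)


# Hypothetical 48 KB SPIR-V image at the ~3x ratio reported above.
original = 48 * 1024
compressed = estimated_compressed_size(original)
```

Under these assumptions, a 49152-byte image would shrink to roughly 16384 bytes.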

jbrodman · 2024-08-21T19:09:33Z

What happens with the PTX and AMDGPU targets? Are they covered by the "native" binary image format? Do we need additional formats?

jbrodman · 2024-08-21T19:15:48Z

Also guessing this feature may not make sense when combined with the native cpu device, but need to think more about that.

uditagarwal97 · 2024-08-21T23:23:28Z

> What happens with the PTX and AMDGPU targets? Are they covered by the "native" binary image format? Do we need additional formats?

I think they are covered by the "none" binary image format. This is because the clang driver (in SYCL offload mode) never specifies the image format in its call to clang-offload-wrapper. So, by default, the BinaryImageFormat is "none" and it is up to the SYCL runtime to determine the format (https://github.com/intel/llvm/blob/sycl/sycl/source/detail/device_binary_image.cpp#L170).

I tested my changes with PTX and they seem to work fine, so we likely won't need additional formats.
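
To illustrate what "the runtime determines the format" can look like, here is a sketch of magic-number sniffing in Python. It is not a transcription of device_binary_image.cpp: the function name and the returned labels are assumptions made for this example. The SPIR-V, LLVM bitcode, and ELF signatures themselves are the standard published values.

```python
SPIRV_MAGIC = 0x07230203             # SPIR-V magic number (first 32-bit word)
LLVM_BITCODE_MAGIC = b"BC\xc0\xde"   # raw LLVM bitcode signature
ELF_MAGIC = b"\x7fELF"               # ELF signature


def sniff_image_format(image: bytes) -> str:
    """Guess an image format from its first four bytes (labels are hypothetical)."""
    if len(image) >= 4:
        head = image[:4]
        # Assumes a little-endian SPIR-V module, the common case on disk.
        if int.from_bytes(head, "little") == SPIRV_MAGIC:
            return "spirv"
        if head == LLVM_BITCODE_MAGIC:
            return "llvm-bitcode"
        if head == ELF_MAGIC:
            return "native"
    # Anything unrecognized stays "none" in this sketch.
    return "none"
```

In this sketch, plain-text input such as PTX matches none of the signatures and stays "none".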

mdtoguchi

OK for driver

uditagarwal97 · 2024-09-24T21:08:23Z

@bso-intel @intel/llvm-reviewers-runtime ping!

bso-intel · 2024-09-25T16:33:29Z

clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp

+          llvm::compression::zstd::compress(
+              ArrayRef<unsigned char>(
+                  (const unsigned char *)(Bin->getBufferStart()),
+                  Bin->getBufferSize()),
+              CompressedBuffer, OffloadCompressLevel);


If compression fails and the tool crashes, the error message that the SYCL end-user receives may not be useful.
For example, "Failed to create ZSTD_CCtx", "Failed to set ZSTD_c_compressionLevel", etc.
It would be a much better user experience if they got a message like "Device image compression failed." + e.what().

bso-intel · 2024-09-25T16:35:25Z

clang/tools/clang-offload-wrapper/ClangOffloadWrapper.cpp

+              "'--offload-compress' option is specified but zstd "
+              "is not available. The device image will not be "
+              "compressed.");


This kind of error message is good since the SYCL end-user can understand what went wrong.

uditagarwal97 · 2024-09-25T20:22:38Z

@bso-intel

> If compression fails and the tool crashes, the error message that the SYCL end-user receives may not be useful.
> For example, "Failed to create ZSTD_CCtx", "Failed to set ZSTD_c_compressionLevel", etc.
> It would be a much better user experience if they got a message like "Device image compression failed." + e.what().

In 946a738, I've wrapped zstd::compress in try/catch to throw a more meaningful error message. Note that this will only work if DPC++ is built with LLVM_ENABLE_EH.
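
The wrapping pattern can be sketched as follows. This is Python rather than the C++ of ClangOffloadWrapper.cpp, so it illustrates the idea and is not the code from 946a738. `CompressionError`, `compress_device_image`, and `low_level_compress` are hypothetical names, and `low_level_compress` is a stand-in for zstd::compress that always fails.

```python
class CompressionError(RuntimeError):
    """User-facing error raised when device image compression fails."""


def low_level_compress(data: bytes, level: int) -> bytes:
    # Hypothetical stand-in for the zstd call; always fails for demonstration.
    raise RuntimeError("Failed to create ZSTD_CCtx")


def compress_device_image(data: bytes, level: int = 10,
                          compress=low_level_compress) -> bytes:
    """Run the compressor and prefix any low-level failure with context."""
    try:
        return compress(data, level)
    except Exception as exc:
        # Keep the original detail, but lead with what the user was doing.
        raise CompressionError(
            f"Device image compression failed. {exc}") from exc
```

With the failing stand-in, the caller sees "Device image compression failed. Failed to create ZSTD_CCtx" instead of the bare ZSTD message.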

uditagarwal97 · 2024-10-22T21:09:14Z

@intel/llvm-gatekeepers The PR is ready to be merged. All the downstream infrastructure, along with intel/llvm CI machines, is ready with zstd installed.

againull · 2024-10-22T22:49:25Z

@uditagarwal97 Could you please take a look at post-commit failures: https://github.com/intel/llvm/actions/runs/11468699849/job/31915329901

Failed Tests (2):
SYCL :: Compression/compression.cpp
SYCL :: Compression/compression_multiple_tu.cpp

On AMD/HIP

jsji · 2024-10-23T12:53:36Z

buildbot/configure.py

@@ -178,6 +178,8 @@ def do_configure(args):
        "-DLLVM_ENABLE_PROJECTS={}".format(llvm_enable_projects),
        "-DSYCL_BUILD_PI_HIP_PLATFORM={}".format(sycl_build_pi_hip_platform),
        "-DLLVM_BUILD_TOOLS=ON",
+        "-DLLVM_ENABLE_ZSTD=ON",


I think we shouldn't turn these on by default. We should turn them on only under `if args.ci_defaults:`.

uditagarwal97 · 2024-10-23

The idea behind turning them on by default is to build the compiler with device image compression support if the user has the zstd-dev package installed. If zstd is not found, there shouldn't be any build error.

jsji · 2024-10-23

There are configure failures:

CMake Error at lib/Support/CMakeLists.txt:327 (get_property):
  get_property could not find TARGET zstd::libzstd_static.  Perhaps it has
  not yet been created.

CMake Error at lib/Support/CMakeLists.txt:330 (get_property):
  get_property could not find TARGET zstd::libzstd_static.  Perhaps it has
  not yet been created.

sarnex · 2024-10-23

Yeah, local build is cooked for me too. We need to revert at least the part that turns zstd on by default.

uditagarwal97 · 2024-10-23

PR to temporarily turn off zstd by default: #15833
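
The suggestion at the top of this thread could be sketched like this. It is a simplified stand-in for do_configure in buildbot/configure.py, not the actual change in #15833. `build_cmake_flags` is a hypothetical helper, and only `-DLLVM_BUILD_TOOLS=ON`, `-DLLVM_ENABLE_ZSTD=ON`, and `args.ci_defaults` come from the discussion.

```python
import argparse
from typing import List


def build_cmake_flags(args: argparse.Namespace) -> List[str]:
    """Return CMake flags, enabling zstd only for CI-style configurations."""
    flags = ["-DLLVM_BUILD_TOOLS=ON"]
    if args.ci_defaults:
        # CI machines are known to have zstd installed, so opt in there only.
        flags.append("-DLLVM_ENABLE_ZSTD=ON")
    return flags


local_flags = build_cmake_flags(argparse.Namespace(ci_defaults=False))
ci_flags = build_cmake_flags(argparse.Namespace(ci_defaults=True))
```

With this gating, a local configure run without zstd installed never requests the zstd targets that triggered the CMake errors quoted above.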

uditagarwal97 · 2024-10-23T14:20:05Z

> @uditagarwal97 Could you please take a look at post-commit failures: https://github.com/intel/llvm/actions/runs/11468699849/job/31915329901
>
> Failed Tests (2):
> SYCL :: Compression/compression.cpp
> SYCL :: Compression/compression_multiple_tu.cpp
>
> On AMD/HIP

PR to disable the failing tests on HIP: #15830


uditagarwal97 added 4 commits August 9, 2024 18:22

Add sycl-compress

ef323f7

Fix decompression in RT

bdab2f0

Cleanup

45f1e99

Fix ZSTD Cmake dependencies

34978f8

uditagarwal97 self-assigned this Aug 18, 2024

Merge branch 'sycl' into compress_img

195e961

uditagarwal97 had a problem deploying to WindowsCILock August 18, 2024 17:08 — with GitHub Actions Failure

Remove unwanted formatting changes

cd64225

uditagarwal97 had a problem deploying to WindowsCILock August 18, 2024 18:02 — with GitHub Actions Error

More cleanup

d89f41b

uditagarwal97 had a problem deploying to WindowsCILock August 18, 2024 18:14 — with GitHub Actions Failure

Add option in clang driver to trigger compression.

fb643e3

uditagarwal97 had a problem deploying to WindowsCILock August 18, 2024 22:16 — with GitHub Actions Failure

Cleanup + build fix

151e70a

uditagarwal97 had a problem deploying to WindowsCILock August 19, 2024 06:42 — with GitHub Actions Failure

Fix ZSTD build on windows, RHEL

2983fab

uditagarwal97 had a problem deploying to WindowsCILock August 19, 2024 15:08 — with GitHub Actions Failure

uditagarwal97 added 2 commits August 19, 2024 09:18

Merge remote-tracking branch 'upstream/sycl' into compress_img

054984c

Fix clang warnings and formatting

4493984

uditagarwal97 had a problem deploying to WindowsCILock August 19, 2024 16:21 — with GitHub Actions Failure

Try fixing Windows build

dbb96a7

uditagarwal97 had a problem deploying to WindowsCILock August 20, 2024 05:54 — with GitHub Actions Failure

uditagarwal97 added 2 commits August 25, 2024 21:57

Merge remote-tracking branch 'upstream/sycl' into compress_img

6c26a42

Fix linkage error while windows build

7d7edc6

uditagarwal97 had a problem deploying to WindowsCILock August 26, 2024 06:14 — with GitHub Actions Failure

Fix include_directory for sycl-compress

f0aca25

uditagarwal97 temporarily deployed to WindowsCILock September 18, 2024 23:45 — with GitHub Actions Inactive

mdtoguchi approved these changes Sep 18, 2024

uditagarwal97 had a problem deploying to WindowsCILock September 19, 2024 00:22 — with GitHub Actions Failure

Delay image decompression till it is actually used.

c1a2c13

uditagarwal97 temporarily deployed to WindowsCILock September 24, 2024 02:54 — with GitHub Actions Inactive

uditagarwal97 had a problem deploying to WindowsCILock September 24, 2024 03:33 — with GitHub Actions Failure

Fix E2E test failure in compression_multiple_tu

966e3dd

uditagarwal97 temporarily deployed to WindowsCILock September 24, 2024 06:16 — with GitHub Actions Inactive

uditagarwal97 had a problem deploying to WindowsCILock September 24, 2024 07:10 — with GitHub Actions Failure

bso-intel requested changes Sep 25, 2024

cperkinsintel approved these changes Sep 25, 2024

Address reviews

946a738

uditagarwal97 temporarily deployed to WindowsCILock September 25, 2024 20:19 — with GitHub Actions Inactive

uditagarwal97 requested a review from bso-intel September 25, 2024 20:24

uditagarwal97 temporarily deployed to WindowsCILock September 25, 2024 20:58 — with GitHub Actions Inactive

bso-intel approved these changes Sep 27, 2024

uditagarwal97 commented Oct 4, 2024

sycl/source/detail/program_manager/program_manager.cpp (outdated)

Fix coverity and build issue

ce1d0f0

uditagarwal97 temporarily deployed to WindowsCILock October 22, 2024 17:45 — with GitHub Actions Inactive

uditagarwal97 temporarily deployed to WindowsCILock October 22, 2024 19:16 — with GitHub Actions Inactive

againull merged commit 155fe2a into intel:sycl Oct 22, 2024
13 checks passed

jsji reviewed Oct 23, 2024

uditagarwal97 mentioned this pull request Jan 24, 2025

[SYCL] Link and include LLVMSupport in SYCL library #16763

Merged

hubot pushed a commit to blender/blender that referenced this pull request Feb 3, 2025

Build: upgrade DPC++/Level-Zero to 6.0.0-rc1/1.19.2 releases

bdb093f

including a backport of device image compression: intel/llvm#15124 that can be enabled by adding `--offload-compress` compiler option.

uditagarwal97 mentioned this pull request Apr 8, 2025

[SYCL][Clang] Fix compilation with --offload-compress and missing zstd #17914

Closed
