Skip to content

[comgr] Embed libc++ headers for HIPRTC standard C++ support#1287

Open
yxsamliu wants to merge 3 commits intoROCm:amd-stagingfrom
yxsamliu:hiprtc-std-headers-stg
Open

[comgr] Embed libc++ headers for HIPRTC standard C++ support#1287
yxsamliu wants to merge 3 commits intoROCm:amd-stagingfrom
yxsamliu:hiprtc-std-headers-stg

Conversation

@yxsamliu
Copy link

Add support for using standard C++ headers like <type_traits> in HIPRTC runtime-compiled kernels without requiring system C++ headers.

This enables HIPRTC users to access libc++ headers via virtual include paths (-I/comgr/include/c++/v1 -I/comgr/include), which is particularly useful for portable applications like Blender that may run on systems without GCC/MSVC development headers installed.

Implementation:

  • cmake/LibcxxHeaders.cmake: Embeds libc++ and Clang builtin headers using bc2h tool, generates libcxx_headers_impl.inc with lookup table
  • include/__config_site_hiprtc: Custom libc++ config for GPU freestanding environment (no threads, filesystem, localization, etc.)
  • src/comgr-libcxx-headers.{h,cpp}: API to populate VFS with embedded headers at runtime
  • src/comgr-compiler.cpp: Call populateLibcxxHeaders() during init

Usage:
const char* opts[] = {"-I/comgr/include/c++/v1", "-I/comgr/include"}; hiprtcCompileProgram(prog, 2, opts);

Size overhead: ~5MB added to libamd_comgr.so (206 header files embedded)

Controlled by cmake option COMGR_EMBED_LIBCXX_HEADERS (default ON).

Add support for using standard C++ headers like <type_traits> in HIPRTC
runtime-compiled kernels without requiring system C++ headers.

This enables HIPRTC users to access libc++ headers via virtual include
paths (-I/comgr/include/c++/v1 -I/comgr/include), which is particularly
useful for portable applications like Blender that may run on systems
without GCC/MSVC development headers installed.

Implementation:
- cmake/LibcxxHeaders.cmake: Embeds libc++ and Clang builtin headers
  using bc2h tool, generates libcxx_headers_impl.inc with lookup table
- include/__config_site_hiprtc: Custom libc++ config for GPU freestanding
  environment (no threads, filesystem, localization, etc.)
- src/comgr-libcxx-headers.{h,cpp}: API to populate VFS with embedded
  headers at runtime
- src/comgr-compiler.cpp: Call populateLibcxxHeaders() during init

Usage:
  const char* opts[] = {"-I/comgr/include/c++/v1", "-I/comgr/include"};
  hiprtcCompileProgram(prog, 2, opts);

Size overhead: ~5MB added to libamd_comgr.so (206 header files embedded)

Controlled by cmake option COMGR_EMBED_LIBCXX_HEADERS (default ON).
@z1-cciauto
Copy link
Collaborator

@github-actions
Copy link

Thank you for submitting a Pull Request (PR) to the LLVM Project!

This PR will be automatically labeled and the relevant teams will be notified.

If you wish to, you can add reviewers by using the "Reviewers" section on this page.

If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username.

If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers.

If you have further questions, they may be answered by the LLVM GitHub User Guide.

You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums.

@chinmaydd
Copy link

chinmaydd commented Jan 30, 2026

Size overhead: ~5MB added to libamd_comgr.so (206 header files embedded)

Have we tested this on Windows ? Just curious.


I am kind-of afraid of the scope creep of including more headers if and when the requests come in. Let me know if I am understanding this wrong.


Also needs tests.


This seems like a change that is generally applicable elsewhere too. Probably worth an update to the documentation and release notes.

@yxsamliu
Copy link
Author

yxsamliu commented Feb 3, 2026

Have we tested this on Windows ? Just curious.

Yes, tested on Windows (gfx1200) today. Built comgr with MSVC 2022, resulting in a 141MB amd_comgr_3.dll. E2E test with <type_traits> passed — HIPRTC compilation with -nostdinc++ -I/comgr/include/c++/v1 -I/comgr/include succeeded and executed correctly. The VFS paths use forward slashes on both Windows and Linux, so users specify the same include paths on both platforms.

@yxsamliu
Copy link
Author

yxsamliu commented Feb 3, 2026

I am kind-of afraid of the scope creep of including more headers if and when the requests come in. Let me know if I am understanding this wrong.

I added other headers that we can support in a second commit. Also added debug output to measure memory usage of these headers at run time. Totally about 3.3MB, which should also be the comgr dll size increase. Previous 5MB was probably over estimation. I don't expect there will be further expansion after the second commit. I think 3.3MB increase in memory and disk usage worth the convenience brought about by the headers.

Also needs tests.

will add some tests to comgr.

This seems like a change that is generally applicable elsewhere too. Probably worth an update to the documentation and release notes.

will do.

@lamb-j lamb-j requested review from jmmartinez and slinder1 February 3, 2026 20:11
@lamb-j
Copy link
Collaborator

lamb-j commented Feb 3, 2026

< useful for portable applications like Blender that may run on systems without GCC/MSVC development headers installed

Aren't we pulling the headers from LLVM? Would we need GCC/MSVC installed?

I'm curious about the use case. Wouldn't most contexts with hipRTC available also have LLVM available? Why can't we ship the needed headers with LLVM instead of embedding in Comgr?

@yxsamliu yxsamliu requested a review from kzhuravl February 3, 2026 21:12
@yxsamliu
Copy link
Author

yxsamliu commented Feb 3, 2026

< useful for portable applications like Blender that may run on systems without GCC/MSVC development headers installed

Aren't we pulling the headers from LLVM? Would we need GCC/MSVC installed?

I'm curious about the use case. Wouldn't most contexts with hipRTC available also have LLVM available? Why can't we ship the needed headers with LLVM instead of embedding in Comgr?

we are embedding the libc++ headers in comgr dll's. These libc++ headers are part of LLVM, but we are considering the situation that the HIPRTC app is running on end users' systems, where there is no LLVM or Clang, only comgr dll, HIP runtime dll, and HIPRTC dll, so we have to get everything from comgr itself.

Add more header-only C++ standard library headers:
- tuple, optional, variant, utility, limits
- initializer_list, compare, concepts, ratio
- cmath, cstdlib, cstring, string_view

Key changes:
- Define _LIBCPP_REMOVE_TRANSITIVE_INCLUDES in __config_site_hiprtc
  to avoid pulling in exception/iostream dependencies
- Add __assertion_handler from vendor/llvm template
- Add verbose debug output using AMD_COMGR_EMIT_VERBOSE_LOGS
- Add unit test compile_hip_with_libcxx_test
- Update cmake comment for COMGR_EMBED_LIBCXX_HEADERS option
- Fix libcxx search path for standalone comgr builds
- Add documentation to README.md and ReleaseNotes.md
- Fix license headers to use Comgr style

Note: Use exception-free APIs in device code:
- std::get_if<T>() instead of std::get<T>() for variant
- value_or() instead of value() for optional
@z1-cciauto
Copy link
Collaborator

@chinmaydd
Copy link

chinmaydd commented Feb 4, 2026

Built comgr with MSVC 2022, resulting in a 141MB amd_comgr_3.dll

Thanks for checking that. Currently amd_comgr_3.dll is around 110MB. So this is a ~28% increase in the DLL size overall. I am generally okay with the PR, but I wonder if the size increase can be avoided via compress+embed of files. This would be a lot of work, and I would leave the final call with @lamb-j.

I believe this doesnt really matter with TheRock distributions, but it will when this change eventually makes it to the Driver. If we expect the primary users of this feature to be those using the Driver-provided DLL over the ROCm provided DLL, we should think about this.

EDIT: PSDB failure for comgr_compile_hip_with_libcxx_test seems legitimate.

Copy link

@jmmartinez jmmartinez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like this.

@lamb-j
Copy link
Collaborator

lamb-j commented Feb 5, 2026

< useful for portable applications like Blender that may run on systems without GCC/MSVC development headers installed
Aren't we pulling the headers from LLVM? Would we need GCC/MSVC installed?
I'm curious about the use case. Wouldn't most contexts with hipRTC available also have LLVM available? Why can't we ship the needed headers with LLVM instead of embedding in Comgr?

we are embedding the libc++ headers in comgr dll's. These libc++ headers are part of LLVM, but we are considering the situation that the HIPRTC app is running on end users' systems, where there is no LLVM or Clang, only comgr dll, HIP runtime dll, and HIPRTC dll, so we have to get everything from comgr itself.

Ok thanks for the context. And there's not a mechanism for us to just copy the .h files directly along with hipRTC, and have users include them from the file system? I.e. somethign like "-I/hiprtc/include/c++/v1". We don't only have the DLLs, we have header files too right?

We'd lose the performance benefit of VFS, but would trend more toward the long-term Comgr goal of "behave like clang as much as possible".

Not trying to block this, just double checking some bases

@z1-cciauto
Copy link
Collaborator

@yxsamliu
Copy link
Author

yxsamliu commented Feb 5, 2026

Built comgr with MSVC 2022, resulting in a 141MB amd_comgr_3.dll

Thanks for checking that. Currently amd_comgr_3.dll is around 110MB. So this is a ~28% increase in the DLL size overall. I am generally okay with the PR, but I wonder if the size increase can be avoided via compress+embed of files. This would be a lot of work, and I would leave the final call with @lamb-j.

I believe this doesnt really matter with TheRock distributions, but it will when this change eventually makes it to the Driver. If we expect the primary users of this feature to be those using the Driver-provided DLL over the ROCm provided DLL, we should think about this.

EDIT: PSDB failure for comgr_compile_hip_with_libcxx_test seems legitimate.

Thanks for the feedback. I pushed a fix for the PSDB failure.

The root cause was that headers like optional and variant pull in cstring, which uses #include_next to find system C headers. That doesn't work in our VFS setup since we don't embed system C headers.

I've reduced the embedded set to only headers that work in a freestanding environment: type_traits, limits, tuple, utility, cstdint, cstddef, initializer_list, compare, and concepts. These cover the original request from Blender for type_traits support.

This also addresses the size concern. I rebuilt comgr on Windows with and without embedded headers using the same build configuration. The DLL size increase is about 1.5 MB (from 140.9 MB to 142.4 MB), which is roughly 1% overhead. I think the 110MB size you saw may be due to the different way of building. The delta between build with vs w/o the change might better reflect the overhead.

Remove headers that require system C headers via #include_next:
- optional, variant (depend on cstring for std::memcpy)
- ratio (depends on climits)
- cmath, cstdlib, cstring, string_view

Keep only headers that work in freestanding GPU environment:
- type_traits, limits, tuple, utility
- cstdint, cstddef, initializer_list
- compare, concepts

This reduces embedded headers from ~755 to 274 files, cutting
.inc file size from ~24MB to ~5MB.

Fixes PSDB test failure: 'stddef.h' file not found
@yxsamliu yxsamliu force-pushed the hiprtc-std-headers-stg branch from e62fcda to 7271165 Compare February 5, 2026 19:37
@z1-cciauto
Copy link
Collaborator

@chinmaydd
Copy link

This also addresses the size concern. I rebuilt comgr on Windows with and without embedded headers using the same build configuration. The DLL size increase is about 1.5 MB (from 140.9 MB to 142.4 MB), which is roughly 1% overhead. I think the 110MB size you saw may be due to the different way of building. The delta between build with vs w/o the change might better reflect the overhead.

Thanks for checking & fixing that Sam.

The PSDB still seems to be broken. Consider my soft approval until that passes.

@chinmaydd
Copy link

And there's not a mechanism for us to just copy the .h files directly along with hipRTC, and have users include them from the file system? I.e. somethign like "-I/hiprtc/include/c++/v1". We don't only have the DLLs, we have header files too right?

I believe the issue with this alternative is the driver-only installs of runtime+comgr on Windows. For instance, the client system might be devoid of standard library headers (such as those provided by GCC/MSVC/MinGW) entirely. Even if they are present, you will need them to be present at a predictable location or have the user provide the header include location through an env variable. This sounds cumbersome.

But I do generally agree that this is feature susceptible to scope creep. Requests may come along to include more headers with subsequent releases since the responsibility of providing (and supporting) them with comgr will shift to us.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants