Skip to content

Conversation

@aheejin
Copy link
Member

@aheejin aheejin commented Jan 8, 2026

This updates libcxx and libcxxabi from 20.1.8 to LLVM 21.1.8:
https://github.com/llvm/llvm-project/releases/tag/llvmorg-21.1.8

This also updates LLVM's libc header directories that libcxx is dependent on:

llvm-libc/hdr
llvm-libc/include/llvm-libc-macros
llvm-libc/include/llvm-libc-types
llvm-libc/shared
llvm-libc/config

Additional changes:

Our prevous `__assertion_handler` was copied from
`libcxx/vendor/llvm/default_assertion_handler.in`. This updates out
`__assertion_handler` to the new
`libcxx/vendor/llvm/default_assertion_handler.in`. For what this file
is, see the PR description of emscripten-core#22994, with which the file was added.
This is an automatic change generated by tools/maint/rebaseline_tests.py.

The following (12) test expectation files were updated by
running the tests with `--rebaseline`:

```
codesize/test_codesize_cxx_ctors1.json: 149109 => 153388 [+4279 bytes / +2.87%]
codesize/test_codesize_cxx_ctors2.json: 148513 => 152783 [+4270 bytes / +2.88%]
codesize/test_codesize_cxx_except.json: 194578 => 199006 [+4428 bytes / +2.28%]
codesize/test_codesize_cxx_except_wasm.json: 164093 => 168568 [+4475 bytes / +2.73%]
codesize/test_codesize_cxx_except_wasm_legacy.json: 161751 => 166220 [+4469 bytes / +2.76%]
codesize/test_codesize_cxx_lto.json: 125441 => 120672 [-4769 bytes / -3.80%]
codesize/test_codesize_cxx_mangle.json: 258655 => 263309 [+4654 bytes / +1.80%]
codesize/test_codesize_cxx_noexcept.json: 151526 => 155961 [+4435 bytes / +2.93%]
codesize/test_codesize_cxx_wasmfs.json: 177088 => 181644 [+4556 bytes / +2.57%]
codesize/test_codesize_files_wasmfs.json: 56055 => 56518 [+463 bytes / +0.83%]
codesize/test_minimal_runtime_code_size_hello_embind.json: 15129 => 15201 [+72 bytes / +0.48%]
codesize/test_minimal_runtime_code_size_hello_embind_val.json: 11858 => 11941 [+83 bytes / +0.70%]

Average change: +1.58% (-3.80% - +2.93%)
```
sbc100

This comment was marked as duplicate.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you mention in the PR title or description the version you are upgrading from (so it clear this is major update, not just a minor one).

As of LLVM 21 (llvm/llvm-project#122790),
libcxx's `std::basic_string`'s constructor is marked as `_Nonnull`:
https://github.com/llvm/llvm-project/blob/2078da43e25a4623cab2d0d60decddf709aaea28/libcxx/include/string#L1061
https://github.com/llvm/llvm-project/blob/2078da43e25a4623cab2d0d60decddf709aaea28/libcxx/include/__config#L1252

Comiling this with LLVM 21's libcxx errors out with:
```console
../emscripten/cache/ports/regal/regal-version_7/src/boost/boost/print/string_list.hpp:281:46: error:
      null passed to a callee that requires a non-null argument
      [-Werror,-Wnonnull]
  281 |   PushBack(*this).assign(string ? string : T(NULL));
      |                                              ^~~~
../emscripten/cache/sysroot/include/unistd.h:33:14: note:
      expanded from macro 'NULL'
   33 | #define NULL 0L
```

This suppresses the warnings.
These files use `exit`, which requires `#include <stdlib.h>`, but didn't
have the include. These didn't error out because `libcxx/include/math.h`
used to contain `#include <stdlib.h>` but not anymore. It was removed in
llvm/llvm-project#139586.
This uses `exit`, which requires `#include <stdlib.h>`, but didn't have
the include. This didn't error out because `libcxx/include/math.h` used
to contain `#include <stdlib.h>` but not anymore. It was removed in
llvm/llvm-project#139586.
libcxx used to have `module.modulemap` but it was changed to
`module.modulemap.in` in
llvm/llvm-project#134699. This recreates
`module.modulemap` from that `module.modulemap.in`. Without this
`other.test_cxx20_modules_std_headers` fails.
@sbc100 sbc100 changed the title Update libcxx and libcxxabi to LLVM 21.1.8 Update libcxx and libcxxabi from LLVM 20 to 21 Jan 11, 2026
@aheejin
Copy link
Member Author

aheejin commented Jan 11, 2026

Can you mention in the PR title or description the version you are upgrading from (so it clear this is major update, not just a minor one).

I updated the description.

This increase seems to be mainly due to the changes of implementaion of
`std::num_put` in `system/lib/libcxx/include/__locale_dir/num.h`. This
now calls functions, namely `__to_chars_integral` and its helpers.
@aheejin aheejin changed the title Update libcxx and libcxxabi from LLVM 20 to 21 Update libcxx and libcxxabi from LLVM 20.1.8 to 21.1.8 Jan 13, 2026
@aheejin aheejin marked this pull request as ready for review January 13, 2026 06:58
@sbc100
Copy link
Collaborator

sbc100 commented Jan 14, 2026

I guess the only blocker for this change now is the codesize changes?

Could you rebase on top of the latest llvm 23 changes?

This is an automatic change generated by tools/maint/rebaseline_tests.py.

The following (1) test expectation files were updated by
running the tests with `--rebaseline`:

```
codesize/test_codesize_cxx_lto.json: 125435 => 120666 [-4769 bytes / -3.80%]

Average change: -3.80% (-3.80% - -3.80%)
```
@aheejin
Copy link
Member Author

aheejin commented Jan 14, 2026

I guess the only blocker for this change now is the codesize changes?

Could you rebase on top of the latest llvm 23 changes?

Done

@sbc100
Copy link
Collaborator

sbc100 commented Jan 14, 2026

Remaining issues:

  1. Code size changes: Should we investigate more before accepting them?
  2. llvm-libc changes: Do we actually want a policy change here? Could we instead so a update of llvm-libc to the main branch, which might make any change to llvm-libc here unnecessary?

Since this change is quite big I wonder if its worth splitting out the test/fetch changes and landing those first?

@aheejin
Copy link
Member Author

aheejin commented Jan 14, 2026

Remaining issues:

  1. Code size changes: Should we investigate more before accepting them?

I looked into them, and while there is no single reason behind the increase, the most influential thing is the one I described in test_malloc_size* increase commit (09bc21e), that a class in libcxx is refactored and now it pulls in more functions. There doesn't seem to be a preprocessor switch that we can turn it on or off (there is no 'off' anyway). It doesn't even account for all increases; I guess around half is just various other changes in the implementation.

  1. llvm-libc changes: Do we actually want a policy change here? Could we instead so a update of llvm-libc to the main branch, which might make any change to llvm-libc here unnecessary?

This PR only updates several directories in llvm-libc that libcxx depend on, and the updated files are newer than the current code. I think we can update them to tot eventually, but I don't see any downside in landing this first.

Also, I'd suggest we discuss more on whether we actually need to follow tot. It's harder to maintain and extract our changes.

Since this change is quite big I wonder if its worth splitting out the test/fetch changes and landing those first?

#26107

@sbc100
Copy link
Collaborator

sbc100 commented Jan 14, 2026

@kripken @juj WDYT this libc++ update? It comes with some minor codesize regressions of a few of Kb which for small programs could be a couple of percent I think. This time around there doesn't seem to be any one culprit here, just the many changes to upstream from the last 6 months. I'm tempted to just land this as is, since I don't think we want to be in the business of carrying major downstream patches (or even having to understand the inner workings of libc++ itself). I also don't think we want to be lagging in our libc++ version.

@aheejin
Copy link
Member Author

aheejin commented Jan 15, 2026

For test_codesize_cxx_noexcept, these are the list of functions with their sizes, sorted in decreasing order, respectively built with the main branch vs. this branch. As you can see, many things have changed, and it's hard to pick a single culprit.
main.funcs.sorted.txt
lib21.funcs.sorted.txt

@kripken
Copy link
Member

kripken commented Jan 15, 2026

I agree with you two, the regressions seem spread out and not worth investigating in detail (unfortunately).

This is an automatic change generated by tools/maint/rebaseline_tests.py.

The following (1) test expectation files were updated by
running the tests with `--rebaseline`:

```
codesize/test_codesize_cxx_lto.json: 125443 => 120674 [-4769 bytes / -3.80%]

Average change: -3.80% (-3.80% - -3.80%)
```
@aheejin
Copy link
Member Author

aheejin commented Jan 15, 2026

Thanks! Can anyone LGTM this?

@juj
Copy link
Collaborator

juj commented Jan 15, 2026

@kripken @juj WDYT this libc++ update? It comes with some minor codesize regressions of a few of Kb which for small programs could be a couple of percent I think. This time around there doesn't seem to be any one culprit here, just the many changes to upstream from the last 6 months. I'm tempted to just land this as is, since I don't think we want to be in the business of carrying major downstream patches (or even having to understand the inner workings of libc++ itself). I also don't think we want to be lagging in our libc++ version.

Thanks for the heads-up.

I agree the codesize regression is a bit annoying, but given that this is from a third-party library, it would be far more annoying to not go with the flow. So LGTM to land to keep things simple.

It is great to have the code size tests that we can see regressions like this, and their impact. Thanks @aheejin for also investigating in that direction.

Given that most of Emscripten system headers do not require/depend on libc++, the regression is not unconditional to all users. So if users are avoiding libc++, the issue does not occur. (The only exception here being wasmfs, sadly)

Anyhow, LGTM to land. Actually, had two minor questions..

@@ -110,6 +110,9 @@ def create(final):
'-Wno-unused-parameter',
'-Wno-nontrivial-memaccess',
'-fdelayed-template-parsing',
# src/boost/boost/print/string_list.hpp calls std::string(NULL), whose
# consturctor is declared _Nonnull
'-Wno-nonnull',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know what happens in practice when calling std::string(NULL) in the new libc++? That won't cause an immediate crash?

(consturctor -> constructor)

Copy link
Member Author

@aheejin aheejin Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spec-wise, it's UB.
In practice, it depends on the hardening mode.

Hardening mode was added in LLVM 19, and it provides additional checks for UB. More you check you pay more in runtime performance. There are four modes:

// The library provides the macro `_LIBCPP_HARDENING_MODE` which can be set to one of the following values:
//
// - `_LIBCPP_HARDENING_MODE_NONE`;
// - `_LIBCPP_HARDENING_MODE_FAST`;
// - `_LIBCPP_HARDENING_MODE_EXTENSIVE`;
// - `_LIBCPP_HARDENING_MODE_DEBUG`.
//
// These values have the following effects:
//
// - `_LIBCPP_HARDENING_MODE_NONE` -- sets the hardening mode to "none" which disables all runtime hardening checks;
//
// - `_LIBCPP_HARDENING_MODE_FAST` -- sets that hardening mode to "fast". The fast mode enables security-critical checks
// that can be done with relatively little runtime overhead in constant time;
//
// - `_LIBCPP_HARDENING_MODE_EXTENSIVE` -- sets the hardening mode to "extensive". The extensive mode is a superset of
// the fast mode that additionally enables checks that are relatively cheap and prevent common types of logic errors
// but are not necessarily security-critical;
//
// - `_LIBCPP_HARDENING_MODE_DEBUG` -- sets the hardening mode to "debug". The debug mode is a superset of the extensive
// mode and enables all checks available in the library, including internal assertions. Checks that are part of the
// debug mode can be very expensive and thus the debug mode is intended to be used for testing, not in production.

We disabled it when we updated to LLVM 19 not to pay runtime performance cost:

// Hardening
#define _LIBCPP_HARDENING_MODE_DEFAULT _LIBCPP_HARDENING_MODE_NONE

std::basic_string has this when the input is null:

_LIBCPP_ASSERT_NON_NULL(__s != nullptr, "basic_string(const char*) detected nullptr");

But _LIBCPP_ASSERT_NON_NULL is enabled only when the hardening mode is _LIBCPP_HARDENING_MODE_EXTENSIVE or _LIBCPP_HARDENING_MODE_DEBUG. Because we currently use _LIBCPP_hardening_MODE_NONE, it does nothing:

# define _LIBCPP_ASSERT_NON_NULL(expression, message) ((void)0)

So for us, it won't immediately crash and the program can malfunction.


It'd be better if we can avoid this. I considered emscripten-ports/regal#10, but @sbc100 suggested we probably would't want to maintain the port.

So this suppresses the warning. This at least doesn't make things worse, because that code was already there for years.

'emmalloc_O2': (['-sMALLOC=emmalloc', '-O2'], 125000),
'dlmalloc_O2': (['-sMALLOC=dlmalloc', '-O2'], 132000),
'mimalloc_O2': (['-sMALLOC=mimalloc', '-O2'], 180000),
'emmalloc_O2': (['-sMALLOC=emmalloc', '-O2'], 130000),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, emmalloc does not utilize libc++ - do you know why this change in size?

@sbc100
Copy link
Collaborator

sbc100 commented Jan 15, 2026

LGTM once we figure out why the malloc benchmark changed.

Copy link
Collaborator

@sbc100 sbc100 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wait, the answer is pretty obvious test_malloc_size uses hello_libcxx.cpp to measure codesize which includes libc++

@sbc100
Copy link
Collaborator

sbc100 commented Jan 15, 2026

Feel free to land this. Once we do we should probably cut a new release.

@sbc100
Copy link
Collaborator

sbc100 commented Jan 16, 2026

Can we land this now?

@aheejin aheejin merged commit 48ee51e into emscripten-core:main Jan 16, 2026
36 checks passed
@aheejin aheejin deleted the update_libcxx_libcxxabi_21 branch January 16, 2026 21:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants