Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Rust v0 symbol mangling scheme #474

Closed
cerisier opened this issue May 8, 2024 · 8 comments · Fixed by #491
Closed

Add support for Rust v0 symbol mangling scheme #474

cerisier opened this issue May 8, 2024 · 8 comments · Fixed by #491

Comments

@cerisier
Copy link
Contributor

cerisier commented May 8, 2024

Rust has defined its own symbol mangling scheme (https://rust-lang.github.io/rfcs/2603-rust-symbol-name-mangling-v0.html) which is currently not supported since it differs completely from the C++ name mangling scheme speedscope already support.

Would you accept a Pull Request to bring this support to Speedscope ?

@cerisier
Copy link
Contributor Author

cerisier commented Jan 5, 2025

@jlfwong any thoughts on this ?

If I understand correctly, the current C++ demangling support is a javascript (non WASM) emscripten compiled version of __cxa_demangle most probably from the GNU libiberty library.

I see 2 options for supporting Rust symbol demangling:

  1. Using the pure C implementation of the rust_demangle function also available in the GNU libiberty library.
  2. Using the official rustc_demangle crate (WASM will be required tho)

Would you be interesting in a PR using this implementation ?

I was also wondering what you would think of targeting WASM instead, as today the __cxa_demangle was compiled using the standalone Javascript target (without WASM) which is not only less performant but also much bigger in terms of size. Would you be open to WASM modules for this (with all due documentation and compilation process) ?

@cerisier
Copy link
Contributor Author

cerisier commented Jan 5, 2025

Another potential solution would be to use a single demangling function compiled into Javascript or WASM that would be using llvm::demangle (the same way perfetto those) and that handles both C++ and Rust the right way.

@jlfwong
Copy link
Owner

jlfwong commented Jan 5, 2025

Hey @cerisier, I'm open to a PR that implements this.

Between the solutions you're describing, the thing I'd like to do is whatever...

  1. Minimizes additional dependencies placed on speedscope (dealing with bitrot is a huge maintenance nightmare, and I like that the current C++ demangling has zero dependencies). For example, I'm not excited about the idea of adding crate or rust compilation as part of speedscope's build toolchain.
  2. Minimizes bundle size. We already lazy load the C++ demangling code to make sure it doesn't block initial load, but I'd like to keep total code size you need to download before you can actually demangle C++ code as low as possible.

Right now the C++ demangling code is ~160kB. I added that a long time ago, so I suspect I was choosing the path of least resistance there, and adding a minified JS ball was just really easy. If adding a wasm module is significantly smaller than that, and could be done without adding a huge dependency chain, I'd be open to it.

Can you help me understand what the tradeoffs between the approaches you're describing are in terms of resulting code?

@cerisier
Copy link
Contributor Author

cerisier commented Jan 6, 2025

Thanks for your answer !

Hey @cerisier, I'm open to a PR that implements this.

Between the solutions you're describing, the thing I'd like to do is whatever...

  1. Minimizes additional dependencies placed on speedscope (dealing with bitrot is a huge maintenance nightmare, and I like that the current C++ demangling has zero dependencies). For example, I'm not excited about the idea of adding crate or rust compilation as part of speedscope's build toolchain.

Understood. That said, even if the crate was the way to go, we'd probably ship a single compiled JS blob.
As for the 0 dependencies, I'll ensure whatever solution I suggest follows this mantra.

  1. Minimizes bundle size. We already lazy load the C++ demangling code to make sure it doesn't block initial load, but I'd like to keep total code size you need to download before you can actually demangle C++ code as low as possible.

Understood, I'll suggest an approach that keeps optimizes for bundle size and if possible, adds nothing to the initial load, keeping everything lazy loaded and async.

Right now the C++ demangling code is ~160kB. I added that a long time ago, so I suspect I was choosing the path of least resistance there, and adding a minified JS ball was just really easy. If adding a wasm module is significantly smaller than that, and could be done without adding a huge dependency chain, I'd be open to it.

I'll dig more into that and suggest the best of the 2 options with regards to the 2 important points above.

Out of curiosity, do you remember the process you used to get to that JS blob ?

  1. emscripten command line arguments ?
  2. extra Javascript calling the codegen?
  3. minifier command to bundle everything into a single blob?

That would help me figure out the best approach to fit with the existing.

Can you help me understand what the tradeoffs between the approaches you're describing are in terms of resulting code?

Sure,

I believe there are several options possible:

  1. Use the exact same library and technique you've used so far (and bridge the pure C rust_demangle function of the GNU libiberty library. And compile that into a pure Javascript file. This is the safest approach and would only result in an increase of size of the existing Javascript blob since it would contain another function. (For this tho, It'd be helpful to have the process you used to get to the first one).
  2. Same as 1. except using the WASM backend of emscripten. This would bring an additional .wasm file which would need to be loaded asynchronously anyway (so fits well with the existing lazy load approach) and I believe would result in close to 50% code size reduction compared to the current pure Javascript version. There would still be a small Javascript blob but much smaller and only containing glue code to bridge with WASM.

I think it's probably best to forget the crate option because it would diverge from how we could expose __cxa_demangle which is written either in C or C++ and would need to combine both emscripten and a rust wasm toolchain like wasm-pack.

I was also thinking of documenting the process along the way so that further improvements are easier to make.

@jlfwong
Copy link
Owner

jlfwong commented Jan 6, 2025

That approach sounds good to me! Re: how the JS ball was made, I don't remember, but a bit of code archeology from #2 suggests https://github.com/nattofriends/c-filtjs which may have the actual build commands used.

The URL referenced in a comment in the code is dead, but the internet archive link should work: https://web.archive.org/web/20200808121129/https://d.fuqu.jp/c++filtjs/

@cerisier
Copy link
Contributor Author

@jlfwong I just made a draft PR following our discussion.

Before I spend time making a proper PR description, I wanted to get your opinion on how/where to include the C source code as well as compilation instructions.

The approach I used compiles rust_demangle and cplus_demangle_v3 from gcc/libiberty into wasm.
I have a small demangle.c to add a bit of added logic to choose between one or the other following the same kind of logic as lllvm:demangle does.

What's your POV on this? A dedicated folder at the root of the project ? a demangling subfolder in the lib folder ?
Also since this is ad-hoc to the npm toolchain, one need to cd into that folder, run make and copy the output file in the lib folder.

@jlfwong
Copy link
Owner

jlfwong commented Jan 12, 2025

@cerisier I think keeping it as close spatially as possible to the resulting code appeals to me, so a demangle subfolder of lib appeals to me. Having all of the code related to demangling be in the same subtree makes sense to me, even if the code itself is multi-toolchain. Having the Makefile etc in there w/ a README.md explaining how to run it sounds good.

@cerisier
Copy link
Contributor Author

@jlfwong PR is ready 👌

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants