Skip to content

Conversation

@aheejin
Copy link
Member

@aheejin aheejin commented Oct 17, 2025

This changes the format of the user input "paths" file so that multiple paths can be split into a single module. The new paths file structure is now very similar to wasm-split's manifest file, with functions replaced with paths. For example, the format will be like

module1
path/to/a
path/to/b

module2
path/to/c

Where module1 and module2 are module names.

This changes the format of the user input "paths" file so that multiple
paths can be split into a single module. The new paths file structure is
now very similar to wasm-split's manifest file, with functions replaced
with paths. For example, the format will be like
```
module1
path/to/a
path/to/b

module2
path/to/c
```
Where `module1` and `module2` are module names.
@aheejin aheejin requested a review from dschuff October 17, 2025 21:10

# Write .manifest file
with tempfile.NamedTemporaryFile(suffix=".manifest", mode='w+', delete=args.preserve_manifest) as f:
with tempfile.NamedTemporaryFile(suffix=".manifest", mode='w+', delete=not args.preserve_manifest) as f:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by fix. --preserve-manifest was doing the opposite of what it should do...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently this drive-by fix is seems to be a reason that Windows CI is failing: https://app.circleci.com/pipelines/github/emscripten-core/emscripten/46843/workflows/78cfd5c9-b267-45b7-ac41-6669c0b2adb6/jobs/1058317

It looks in Windows it is not allowed to re-open an already opened file unless some conditions are met: https://docs.python.org/3/library/tempfile.html

Opening the temporary file again by its name while it is still open works as follows:

  • On POSIX the file can always be opened again.
  • On Windows, make sure that at least one of the following conditions are fulfilled:
    • delete is false
    • additional open shares delete access (e.g. by calling os.open() with the flag O_TEMPORARY)
    • delete is true but delete_on_close is false. Note, that in this case the additional opens that do not share delete access (e.g. created via builtin open()) must be closed before exiting the context manager, else the os.unlink() call on context manager exit will fail with a PermissionError.

The Windows CI didn't error out before this PR because the first condition (delete is false) was accidentally satisfied due to this bug. To fix this I think we should use try-finally. Will do this in another PR, and will revert this drive-by fix here.

@@ -15760,9 +15760,14 @@ def test_empath_split(self):
void foo() { std::cout << "foo" << std::endl; }
''')
create_file('path_list', r'''
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call this path_list.txt ?

Copy link
Member Author

@aheejin aheejin Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: ac1d245

f.write(func + '\n')
if i < len(paths) - 1:
if i < len(module_to_paths) - 1:
f.write('\n')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead do this at the top with:

if i != 0:  // Unless we are the first entry add a newline separator
   f.write('\n')

Copy link
Member Author

@aheejin aheejin Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done: 97258c6

for func in path_to_funcs[path]:

f.write(f'{module}\n')
for func in funcs:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annoyingly If you want this to be deterministic then I think you need to make this a dict rather than set.

The dict in python became deterministic in 3.7 but the set remains unordered. Annoying..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it and also one more place from set to list: 7eedf6d
Apparently they didn't even need to be sets after all...

@aheejin
Copy link
Member Author

aheejin commented Oct 23, 2025

I don't understand why test_sanity fails.. https://app.circleci.com/pipelines/github/emscripten-core/emscripten/46879/workflows/0ae5c3bb-a690-4d5f-aeaf-a43705d8484e/jobs/1059459

@sbc100 Do you think other things look fine?

''')
create_file('path_list', r'''
create_file('path_list.txt', r'''
myapp
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you end these lines with : maybe? I feel like it would make them easier to read if I could clearly see the different between module names and filenames.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to make the file format similar to that of wasm-split's multi split manifest, which does not have colons: https://github.com/WebAssembly/binaryen/blob/main/test/lit/wasm-split/multi-split.wast.manifest

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, maybe we could update them both one day?

Copy link
Member Author

@aheejin aheejin Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind much, but wasm-split's manifest file already has some users. If we decide to do it, I think we can give some internal clients a heads-up though. WDYT? @tlively

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will land this PR for now then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind much, but wasm-split's manifest file already has some users. If we decide to do it, I think we can give some internal clients a heads-up though. WDYT? @tlively

Making improvements to the manifest file format sounds good to me. I imagine we might want to make more changes in the future as well, for example to include information about dependencies between modules.

@aheejin aheejin merged commit 1fab6e2 into emscripten-core:main Oct 23, 2025
34 checks passed
@aheejin aheejin deleted the empath_split_multi branch October 23, 2025 18:46
aheejin added a commit to aheejin/binaryen that referenced this pull request Oct 27, 2025
Suggested by by @sbc100
(emscripten-core/emscripten#25577 (comment)).

Currently this supports both module names with a `:` and without it for
backwards compatibility and also to pass the CI.
aheejin added a commit to aheejin/emscripten that referenced this pull request Oct 27, 2025
This makes better distinction of module names and function lists.
Suggested by by @sbc100
(emscripten-core#25577 (comment)).

This has to land after
WebAssembly/binaryen#8003.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants