postprocess: add `--exclude-file` for skipping functions #1541

kkysen · 2026-01-08T22:42:13Z

As a general-purpose solution to bypassing functions the postprocessor's comment transferer (and future transforms, too) can't handle yet, this adds an exclude list to skip functions in files. There is also --ident-filter, which is a single regex just based on the function name, but this is more specific and also takes into account the file name. The file passed to --exclude-file is a YAML file containing file paths (relative to the exclude file's directory) and the function names within them to skip. This gets us around issues like conditionally-compiled comments (#1534), the first function in a file also including license comments and #includes, and anything else. I've run this on json-c so far

(cd c2rust-postprocess/ && uv run postprocess $OLDPWD/tests/integration/tests/json-c/repo/lib.rs --no-update-rust --exclude-file $OLDPWD/tests/integration/tests/json-c/postprocess-exclude.yml)

and checked-in the LLM cache for it, but setting up the testing infrastructure to run this automatically is still a WIP (should be the next PR). Funnily enough, the LLM cache is going to end up behaving similarly to snapshot tests once it's up and running.

thedataking

I'm not sure we should drop the functionality provided by --ident-filter. I would like to retain the ability to filter on the command line without having to provide a map file. I like to just translate a single function over and over for development; I'm too lazy to write a file for that.

Please make it such that the internal exclude list can be built from a file or a command line argument? I would just like to pass a flag that lets me restrict postprocessing to a given identifier (or, ideally, set of related identifiers which is why I used a regex).

kkysen

I'm not sure we should drop the functionality provided by --ident-filter. I would like to retain the ability to filter on the command line without having to provide a map file. I like to just translate a single function over and over for development; I'm too lazy to write a file for that.

Okay, I can add that back. Is it important that it's still a regex?

Please make it such that the internal exclude list can be built from a file or a command line argument? I would just like to pass a flag that lets me restrict postprocessing to a given identifier (or, ideally, set of related identifiers which is why I used a regex).

Is this also needed if there's also --ident-filter?

thedataking · 2026-01-08T23:43:51Z

Okay, I can add that back. Is it important that it's still a regex?

It is nice to be able to filter on a glob pattern since related C functions share a prefix or suffix. Not sure we need a full regex. I think you should consider supporting glob patterns in the exclude files however since many related C files follow a pattern.... I'm assuming it is easy to do and useful but I'm open to being told otherwise.

Is this also needed if there's also --ident-filter?

No, I didn't mean to specify implementation. I shouldn't have phrased it that way. I would like something similar to what we had and I wasn't sure if you'd just wanna drop the commit so we preserve the old functionality or merge it with the new somehow. Please do what you think makes sense and we can continue the discussion from there.

kkysen

@thedataking, I added back --ident-filter now. Seems like the simplest solution rather than trying to combine them.

kkysen · 2026-01-09T07:16:21Z

@fw-immunant, --emit-c-decl-map is failing on zstd and python2 in slice_decl_with_fixups.

thedataking

LGTM overall 🚀 Minor stuff:

exclude_list.py: using assert for data validation isn't encouraged. They can be disabled with the optimize flag. Moreover, incorrect input data aren't truly exceptional events. Can you raise ValueError, TypeError, or KeyError as appropriate to communicate bad exclude file contents?
postprocess-exclude.yml: How do we capture why an identifier is skipped? I know we have the postprocessor issue where some comments are guarded but there could also be other reasons to skip a function. Along the same lines, how do I capture that we should skip an identifier for transform X but apply transform Y? (I'm fine landing this without support for multiple transforms on the assumption that it is something we can easily add later.)
Could you document the exclude file syntax in the README? Doesn't have to be fancy or perfect; could be a simple example plus some limitations (can I skip a whole file, can't use globs to skip a group of identifiers.).

Again, looks great, most important is to not use asserts for data validation IMO. Also happy that this functionality surfaced problems with --emit-c-decl-map which we need to get fixed.

kkysen

exclude_list.py: using assert for data validation isn't encouraged. They can be disabled with the optimize flag. Moreover, incorrect input data aren't truly exceptional events. Can you raise ValueError, TypeError, or KeyError as appropriate to communicate bad exclude file contents?

Oh, whoops. I didn't realize Python was like that and you could disable asserts. I'll fix that, but we also use asserts in a few other places that need to be checked, too.

postprocess-exclude.yml: How do we capture why an identifier is skipped? I know we have the postprocessor issue where some comments are guarded but there could also be other reasons to skip a function.

We don't do anything for why an identifier is skipped for now, but that is something we could probably add.

Along the same lines, how do I capture that we should skip an identifier for transform X but apply transform Y?

We might want to just have a different set of excludes for every transform. I'm not sure how interrelated they'll end up being, so we should probably wait until we have other transforms to decide how exactly we manage that. Shouldn't be too difficult whichever way we decide to handle that.

(I'm fine landing this without support for multiple transforms on the assumption that it is something we can easily add later.)

Yup, we should be able to add it later.

Could you document the exclude file syntax in the README? Doesn't have to be fancy or perfect; could be a simple example plus some limitations (can I skip a whole file, can't use globs to skip a group of identifiers.).

👍

kkysen · 2026-01-12T22:39:20Z

@thedataking, I fixed the asserts introduced here, but not all of them in the codebase (we probably should, though, if we expect someone may turn off assertions, as they're used similarly elsewhere as well).

I added the README explanations as well.

I'll do the exclude file extensions later once we know more about what exactly we want for that.

Should I try to exclude --emit-c-decl-map for the projects it fails on, or wait until that's fixed first (simpler but would have to wait longer)?

thedataking · 2026-01-13T03:41:41Z

Everything you wrote here #1541 (comment) sounds good. Can you unpack:

Should I try to exclude --emit-c-decl-map for the projects it fails on, or wait until that's fixed first (simpler but would have to wait longer)?

Couldn't we wrap up this PR and then add exclude files for additional projects in a follow up PR? If yes, I'd like to do that.

kkysen

Everything you wrote here #1541 (comment) sounds good. Can you unpack:

Should I try to exclude --emit-c-decl-map for the projects it fails on, or wait until that's fixed first (simpler but would have to wait longer)?

Couldn't we wrap up this PR and then add exclude files for additional projects in a follow up PR? If yes, I'd like to do that.

I added --emit-c-decl-map in templates.py in this PR so that after running the tests, you can run c2rust-postprocess manually on it. That's what's breaking in CI right now. I could

Don't include --emit-c-decl-map in this PR, so to test yourself, you'd first have to add it there.
Add support to conditionally add --emit-c-decl-map and only enable it on the projects that work at the moment.
Fix the --emit-c-decl-map bug.

kkysen · 2026-01-13T22:04:53Z

I removed 5946368 for now until #1545 is fixed so I can merge it now.

`--ident-filter` takes a single regex, while `--exclude-file` is a YAML file containing file paths and identifiers of functions to skip. This is more precise and scalable. We can use this as a general solution to skipping functions we can't handle yet so that we can scale up testing first. `--ident-filter` is still kept since it's easy to use one-off, while `--exclude-file` is better for more comprehensive tests.

Also, check-in the LLM cache for running `c2rust-postprocess` on `json-c`. This doesn't yet set up automated testing in CI, but we're now able to run `c2rust-postprocess` on `json-c` manually after running the existing `tests/integration` tests and adding `--emit-c-decl-map` to `TRANSPILE_SH` in `templates.py`.

…e `assert`s can be disabled `assert`s can be disabled in Python at runtime, so replace them with actual `if` checks raising errors. This does so for `isinstance` checks.

…ughly in the README

kkysen

I removed 5946368 for now until #1545 is fixed so I can merge it now.

#1545 is fixed now by #1546, so nevermind, but 5946368 is included in #1546.

kkysen requested a review from thedataking January 8, 2026 22:42

thedataking requested changes Jan 8, 2026

View reviewed changes

kkysen commented Jan 8, 2026

View reviewed changes

kkysen force-pushed the kkysen/postprocess-exclude branch 2 times, most recently from 5674ba2 to dc319fd Compare January 9, 2026 06:47

kkysen requested a review from thedataking January 9, 2026 06:47

kkysen commented Jan 9, 2026

View reviewed changes

thedataking requested changes Jan 10, 2026

View reviewed changes

kkysen commented Jan 12, 2026

View reviewed changes

kkysen requested a review from thedataking January 12, 2026 22:35

kkysen force-pushed the kkysen/postprocess-exclude branch from 67be1e4 to 0638a26 Compare January 12, 2026 22:37

kkysen mentioned this pull request Jan 12, 2026

--emit-c-decl-map fails on zstd and python2 in slice_decl_with_fixups #1545

Closed

thedataking approved these changes Jan 13, 2026

View reviewed changes

kkysen commented Jan 13, 2026

View reviewed changes

kkysen force-pushed the kkysen/postprocess-exclude branch 2 times, most recently from 94af50b to 64b51fd Compare January 13, 2026 22:03

kkysen added 7 commits January 14, 2026 12:58

postprocess: update qsort.rs after recent transpiler changes

0220f79

postprocess: skip empty comment lines in comparing comments

a4a7b71

postprocess: log the cache value as toml (easier to read)

223aa9a

postprocess: replace assert isinstance with check_isinstance sinc…

6df0e4b

…e `assert`s can be disabled `assert`s can be disabled in Python at runtime, so replace them with actual `if` checks raising errors. This does so for `isinstance` checks.

postprocess: explain --ident-filter and --exclude-file more thoro…

53c18d1

…ughly in the README

kkysen force-pushed the kkysen/postprocess-exclude branch from 64b51fd to 53c18d1 Compare January 14, 2026 21:03

kkysen commented Jan 14, 2026

View reviewed changes

kkysen merged commit b648f56 into master Jan 14, 2026
11 checks passed

kkysen deleted the kkysen/postprocess-exclude branch January 14, 2026 21:36

postprocess: add --exclude-file for skipping functions #1541

postprocess: add --exclude-file for skipping functions #1541

Uh oh!

Conversation

kkysen commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

thedataking left a comment

Choose a reason for hiding this comment

Uh oh!

kkysen left a comment

Choose a reason for hiding this comment

Uh oh!

thedataking commented Jan 8, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kkysen left a comment

Choose a reason for hiding this comment

Uh oh!

kkysen commented Jan 9, 2026

Uh oh!

thedataking left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

kkysen left a comment

Choose a reason for hiding this comment

Uh oh!

kkysen commented Jan 12, 2026

Uh oh!

thedataking commented Jan 13, 2026

Uh oh!

kkysen left a comment

Choose a reason for hiding this comment

Uh oh!

kkysen commented Jan 13, 2026

Uh oh!

kkysen left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

postprocess: add `--exclude-file` for skipping functions #1541

postprocess: add `--exclude-file` for skipping functions #1541

kkysen commented Jan 8, 2026 •

edited

Loading

thedataking commented Jan 8, 2026 •

edited

Loading

thedataking left a comment •

edited

Loading