Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify definitions of siginfo_t, statvfs and statfs in musl targets #3261

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

bossmc
Copy link
Contributor

@bossmc bossmc commented Jun 1, 2023

During #2935 I noticed there were multiple identical definitions of some of the bits/***.h types in the musl target, as well as a few places where a type was defined twice in the module tree (leading to the "upper-most" definition being the one exported by the library, in contradiction to the expectation that the "most-specific" definition would be used.

This change moves the definitions of struct statvfs(64) and struct siginfo_t to be global for all musl targets (see https://git.musl-libc.org/cgit/musl/tree/include/sys/statvfs.h and https://git.musl-libc.org/cgit/musl/tree/include/signal.h which are architecture-agnostic headers) and struct statfs(64) to be global for all 64-bit musl targets (see https://git.musl-libc.org/cgit/musl/tree/arch/generic/bits/statfs.h). This also required moving fsblkcnt64_t and fsfilcnt64_t to be musl-wide too (for use in struct statvfs64).

It also removes a bunch of redundant (and unreachable) definitions in the riscv64 and riscv32 targets (there seems to be a riscv32 target in the crate, but not in musl itself or at least there's no arch/riscv32 folder in tree).

Upshot of the above is that this change has no externally visible effect, if the more specific types were intended to be used they weren't being so removing them is a no-op. To actually use more specific type definitions one would need to cfg out the general definition as well as providing the specific one.

To find most of these issues I used this process
$ for target in $(rustc --print target-list | grep musl)
do
  echo $target
  RUSTDOCFLAGS="--output-format json --document-private-items" cargo +nightly doc -Z build-std=core --target $target
done
$ for json in target/**/doc/libc.json
do
  echo $json
  jq '.index[] | select(.inner | keys[0] == "struct") | .name' $json | sort | uniq -d
done

The first command uses rustdoc to create a JSON representation of the API of the crate for each (musl) target and the second searches that output for two exported structs of the same name within a single target. Where there's a duplicate, only one of the two symbols is actually usable (and due to import rules, symbols defined locally take precedence over symbols imported from submodules so the less specific symbol is the one that wins).

You can do similar tests for enum, typedef, union, constant by changing the second command in the obvious way, you can also do the same for function though you need to additionally filter on extern "C" (since e.g. there's many many clone functions defined in the crate):

$ jq '.index[] | select(.inner | keys[0] == "function") | select(.inner.function.header.abi | (type == "object" and keys[0] == "C"))
| .name' $json | sort | uniq -d

It feels like adding the checks in that methodology to CI for each target would be a good way to catch issues where a more specific definition is masked by a less-specific one.

@rustbot
Copy link
Collaborator

rustbot commented Jun 1, 2023

r? @JohnTitor

(rustbot has picked a reviewer for you, use r? to override)

@JohnTitor
Copy link
Member

Thanks! @bors r+

@bors
Copy link
Contributor

bors commented Jun 4, 2023

📌 Commit ec384c7 has been approved by JohnTitor

It is now in the queue for this repository.

@bors
Copy link
Contributor

bors commented Jun 4, 2023

⌛ Testing commit ec384c7 with merge 99927dc...

bors added a commit that referenced this pull request Jun 4, 2023
Unify definitions of `siginfo_t`, `statvfs` and `statfs` in `musl` targets

During #2935 I noticed there were multiple identical definitions of some of the `bits/***.h` types in the  musl target, as well as a few places where a type was defined twice in the module tree (leading to the "upper-most" definition being the one exported by the library, in contradiction to the expectation that the "most-specific" definition would be used.

This change moves the definitions of `struct statvfs(64)` and `struct siginfo_t` to be global for all `musl` targets (see https://git.musl-libc.org/cgit/musl/tree/include/sys/statvfs.h and https://git.musl-libc.org/cgit/musl/tree/include/signal.h which are architecture-agnostic headers) and `struct statfs(64)` to be global for all 64-bit `musl` targets (see https://git.musl-libc.org/cgit/musl/tree/arch/generic/bits/statfs.h).  This also required moving `fsblkcnt64_t` and `fsfilcnt64_t` to be musl-wide too (for use in `struct statvfs64`).

It also removes a bunch of redundant (and unreachable) definitions in the `riscv64` and `riscv32` targets (there seems to be a `riscv32` target in the crate, but not in `musl` itself or at least there's no `arch/riscv32` folder in tree).

Upshot of the above is that this change has no externally visible effect, if the more specific types were intended to be used they weren't being so removing them is a no-op.  To actually use more specific type definitions one would need to `cfg` out the general definition as well as providing the specific one.

<details>

<summary>To find most of these issues I used this process</summary>

```
$ for target in $(rustc --print target-list | grep musl)
do
  echo $target
  RUSTDOCFLAGS="--output-format json --document-private-items" cargo +nightly doc -Z build-std=core --target $target
done
$ for json in target/**/doc/libc.json
do
  echo $json
  jq '.index[] | select(.inner | keys[0] == "struct") | .name' $json | sort | uniq -d
done
```

The first command uses rustdoc to create a JSON representation of the API of the crate for each (`musl`) target and the second searches that output for two exported structs of the same name within a single target.  Where there's a duplicate, only one of the two symbols is actually usable (and due to import rules, symbols defined locally take precedence over symbols imported from submodules so the less specific symbol is the one that wins).

You can do similar tests for `enum`, `typedef`, `union`, constant` by changing the second command in the obvious way, you can also do the same for `function` though you need to additionally filter on `extern "C"` (since e.g. there's many many `clone` functions defined in the crate):

```
$ jq '.index[] | select(.inner | keys[0] == "function") | select(.inner.function.header.abi | (type == "object" and keys[0] == "C"))
| .name' $json | sort | uniq -d
```

</details>

It feels like adding the checks in that methodology to CI for each target would be a good way to catch issues where a more specific definition is masked by a less-specific one.
@bors
Copy link
Contributor

bors commented Jun 4, 2023

💔 Test failed - checks-actions

@bossmc
Copy link
Contributor Author

bossmc commented Jun 6, 2023

Turns out I was both too aggressive and not aggressive enough - every target uses the same statfs definition apart from the mips targets (that re-arranges the fields) so I've lifted statfs to by musl-wide. OTOH mips also re-orders the fields in siginfo_t so I've corrected the (global) definition to get that right, I considered adding a mips-specific definition but that seemed overkill.

@bossmc
Copy link
Contributor Author

bossmc commented Jul 18, 2023

This MR is in danger of going stale so I'd like to resolve the style issue. However I can't find a way to write the code that meets all the conflicting requirements/intentions:

  1. the style check only allows one s! per file (so I can't have a #[cfg(not(mips))] s! { struct statvfs ... } alongside the common s!)
  2. The s! macro can't have a cfg_if nested inside it (it possibly could but it's quite tricky to make work)
  3. If you use a #[cfg(not(mips))] inside the s! macro it removes the struct definition but keeps the impl Copy/impl Clone which leads to dual-definitions when the real version is added.
  4. Moving the impl Copy and impl Clone blocks to a #[derive(Copy, Clone)] combined with (3) works, but the auto-derived Clone implementation doesn't use the fact that it's Copy to optimize itself to a memcpy in debug mode

I'm trying to avoid having 10 identical copies of this type defined simply because mips needs a special copy (I'm also baffled why mips has so many mips-only layout differences compared to other targets when it's a fairly recent addition to the linux architectures but that ship has long sailed)

@bors
Copy link
Contributor

bors commented Oct 27, 2023

☔ The latest upstream changes (presumably #3302) made this pull request unmergeable. Please resolve the merge conflicts.

@tgross35
Copy link
Contributor

Hey @bossmc, sorry this has been stale for so long. If you are interested in rebasing so we can merge this than that would be excellent; if not, no worries.

I do need to note that I don't think this is backportable, meaning we will only get the changes once 1.0 is released (which may still be a few months off). I have to assume this isn't a problem since most of this is refactoring, but just an FYI.

This MR is in danger of going stale so I'd like to resolve the style issue. However I can't find a way to write the code that meets all the conflicting requirements/intentions:

1. the style check only allows one `s!` per file (so I can't have a `#[cfg(not(mips))] s! { struct statvfs ... }` alongside the common `s!`)

2. The `s!` macro can't have a `cfg_if` nested inside it (it possibly could but it's quite tricky to make work)

3. If you use a `#[cfg(not(mips))]` inside the `s!` macro it removes the struct definition but keeps the `impl Copy`/`impl Clone` which leads to dual-definitions when the real version is added.

4. Moving the `impl Copy` and `impl Clone` blocks to a `#[derive(Copy, Clone)]` combined with (3) works, but the auto-derived `Clone` implementation doesn't use the fact that it's `Copy` to optimize itself to a `memcpy` in debug mode

I'm trying to avoid having 10 identical copies of this type defined simply because mips needs a special copy (I'm also baffled why mips has so many mips-only layout differences compared to other targets when it's a fairly recent addition to the linux architectures but that ship has long sailed)

Your 3+4 is totally fine here - how well things optimize in debug mode is essentially a nonissue. For cases where it is, it more common for those crates to just compile dependencies in release mode, so nothing for us to worry about here.

@tgross35
Copy link
Contributor

Let me know if you need anything else. Just @rustbot review when you are ready.

@rustbot author

@bossmc
Copy link
Contributor Author

bossmc commented Oct 10, 2024

Thanks! I've made that change and rebased to (what was) main.

@rustbot review

src/macros.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@tgross35 tgross35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple small nits and this lgtm, please rebase and squash.

Regarding your top post - I am entirely in favor of using rustdoc JSON output for testing. If you are up to it, please do consider putting something like this in the libc-test crate.

@bossmc
Copy link
Contributor Author

bossmc commented Nov 15, 2024

Not a problem, I've spun out #4038 to separate the two changes.

Once that's in, I'll rebase this change for final review.

@tgross35
Copy link
Contributor

tgross35 commented Nov 17, 2024

Hey, feel free to hold off on rebasing for another day or two. I'm trying to get some larger refactoring in and sync up the branches a bit more before merging anything else (but thanks for being on top of things)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants