Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

treewide: format files generated by nuget-to-nix with nixfmt and make code generation of nuget-to-nix follow nixfmt rules #325053

Closed
wants to merge 2 commits into from

Conversation

TomaSajt
Copy link
Contributor

@TomaSajt TomaSajt commented Jul 6, 2024

Description of changes

This PR is made in anticipation of the upcoming treewide reformat of nixpkgs with the new nixfmt implementation.

The main change is the first commit, that makes code generated by nuget-to-nix be formatted how nixfmt would format it.

Please note, that the current logic is not completely correct, since code generation doesn't handle the case of an empty list correctly. I think this should be addressed before this gets merged.

The second commit is not strictly necessary, since, we're gonna get the total treewide formatting anyways, but I think it should be done to avoid bigger diffs when the files get regenerated next.

The second commit was created by running the following bash command:

find -name "*.nix" -not -name "default.nix" | xargs grep "This file was automatically generated by passthru.fetch-deps" -l | xargs nixfmt

(where the nixfmt is the one provided by the nixfmt-rfc-style package)

I am not sure if the second commit should go into .git-blame-ignore-revs.
I doubt we really care about git blaming these generated lines.


Note: there's another generator script in pkgs/development/compilers/dotnet/update.sh but I won't touch that in this PR.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added the 6.topic: dotnet Language: .NET label Jul 6, 2024
@TomaSajt TomaSajt force-pushed the nuget-to-nix-format branch from d86e4c4 to 8cbdfb2 Compare July 6, 2024 14:25
@TomaSajt TomaSajt marked this pull request as ready for review July 6, 2024 14:30
@TomaSajt TomaSajt removed request for mmahut and RaghavSood July 6, 2024 14:31
@TomaSajt TomaSajt force-pushed the nuget-to-nix-format branch from 8cbdfb2 to a763a6c Compare July 6, 2024 14:33
@TomaSajt TomaSajt force-pushed the nuget-to-nix-format branch 3 times, most recently from 2c35191 to f617165 Compare July 6, 2024 14:47
@corngood
Copy link
Contributor

corngood commented Jul 6, 2024

Have you tried running some fetch-deps to test that they match? Obviously that's a little tricky since it can update dependencies, but you could at least make sure nixfmt is a no-op afterwards.

Please note, that the current logic is not completely correct, since code generation doesn't handle the case of an empty list correctly. I think this should be addressed before this gets merged.

Is this something that nixfmt is always opinionated about? It seems like an annoying quirk to deal with for generated expressions.

@TomaSajt
Copy link
Contributor Author

TomaSajt commented Jul 6, 2024

Here's some small script to test regenerating some deps.nix files:

pkgs=(
    "famistudio"
    "galaxy-buds-client"
    "openutau"
    "tone"
    "btcpayserver"
    "nbxplorer"
    "wasabibackend"
    "denaro"
    "pinta"
    "ArchiSteamFarm"
    "avalonia-ilspy"
)

for pkg in "${pkgs[@]}"; do
    $(nix-build -A $pkg.fetch-deps)
done

At a first glance, the only change seem to be that recently sha256 was changed to hash as the default.
(And empty arrays are also not correct)
(And ArchiSteamFarm has some removed deps)

But the formatting looks correct everywhere

@corngood corngood mentioned this pull request Jul 31, 2024
20 tasks
Copy link
Member

@SuperSandro2000 SuperSandro2000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Genererated files are generally excempted from formatters and doing this here just blows up the repository by 40k lines. 👎🏼 from me

@corngood
Copy link
Contributor

Genererated files are generally excempted from formatters

Any idea how we make this happen, particularly so it doesn't complain about newly added files? see #331150 (comment)

@MattSturgeon
Copy link
Contributor

MattSturgeon commented Jul 31, 2024

Genererated files are generally excempted from formatters and doing this here just blows up the repository by 40k lines. 👎🏼 from me

RFC166 explicitly makes no decision on whether generated files should be formatted:

There are automatically generated files in Nixpkgs, with a potentially different format. This RFC makes no decisions on how to handle such cases, but there are some options:

  • Exclude them from the CI via some tooling (e.g. treefmt if that is being used)
  • Format them anyway, either after-the-fact or ideally already in the generator tooling itself

Personally, I like the simplicity of just having all nix files formatted. But I can understand why you may not want this.


Any idea how we make this happen, particularly so it doesn't complain about newly added files?

The RFC's implementation tracking issue currently makes no mention of generated files at all.

#322537 (treewide reformat) also does not address how generated files should be handled. I think it is currently assumed they will be formatted.

The CI added in #326407 currently enforces formatting for any new files and also any files that were previously formatted. This is skipped if the PR title contains [skip treewide], however that's not very useful here.

Perhaps this section could be extended to ignore files that contain some "This file is auto-generated" or "skip formatting" style header?

# Ignore files that weren't already formatted
if [[ -n "$source" ]] && ! nixfmt --check ${{ env.base }}/"$source" 2>/dev/null; then
echo "Ignoring file $file because it's not formatted in the base commit"
elif ! nixfmt --check "$dest"; then
unformattedFiles+=("$dest")
fi

cc @infinisil @NixOS/nix-formatting

@MattSturgeon
Copy link
Contributor

I brought this up with the formatting team and we discussed a few ways forward.

In the end the consensus seems to be that generated files should almost always use a format such as JSON or TOML, rather than generating nix code.

This quote from @emilazy nicely summarises the consensus:

you can represent anything that doesn't have bespoke logic in JSON, and by definition generated stuff usually has pretty rote logic. we have a dynamic language so there's not really anything we "need" to output actual code for

I will say, about bespoke formatting: sometimes the most important thing is avoiding nasty Git conflicts

so everything on one line of compact JSON isn't necessarily always best if two PRs are ever going to touch the same file

but you can do less expansion than human-readable source code for sure, like one package per line in JSON is often going to be reasonable

anyway, treating generated files the same as all the others both simplifies our rules/checking infrastructure, and discourages people from generating Nix, so personally I'm in favour of it

This should be a fairly simple change for dotnet. I believe we would have to update two places; the nuget-to-nix script and also the make-nuget-deps function.

Plus replacing any existing deps.nix files, of course. This could either be done in a treewide change, or gradually as passthru.fetch-deps is run in the affected packages. It's probably better to do a treewide switch to json/toml to avoid being affected by later treewide reformatting PRs?

make-nuget-deps could detect whether the sourceFile is nix or json/toml for some transition period. Nix sourceFile support could be deprecated and eventually be removed.

Presumably, even if all deps.nix are removed from nixpkgs, we'd still want to keep support for nix sourceFiles in make-nuget-deps in case out-of-tree projects are relying on the current behaviour.

We'd essentially be doing something like map fetchNuGet (lib.importJSON sourceFile).

@corngood am I missing anything important here?

@corngood
Copy link
Contributor

corngood commented Aug 6, 2024

I'm not necessarily against using JSON for this. I decided to use JSON when generating metadata for the VMR releases, see e.g. pkgs/development/compilers/dotnet/8/release-info.json.

However, I don't really understand the objection to generating nix. I also don't see why we wouldn't have format checking for JSON, or whatever other formats end up in the repo. In that case we'd be back in the same situation.

doing this here just blows up the repository by 40k lines

Is there a reason we'd care about the number of lines? What is it in absolute/% bytes? What would it be with JSON?

@MattSturgeon
Copy link
Contributor

I'm not necessarily against using JSON for this. I decided to use JSON when generating metadata for the VMR releases, see e.g. pkgs/development/compilers/dotnet/8/release-info.json.

However, I don't really understand the objection to generating nix.

I don't think there's a strong objection to generating nix, however it was noted that importing nix can be slower than importing JSON/TOML and that it is almost always unnecessary to generate nix, since a data-format such as JSON should be able to represent anything generated automatically.

I think the stronger argument is that because generated files can be JSON, there is less motivation to provide a way to avoid formatting generated nix files.

I also don't see why we wouldn't have format checking for JSON, or whatever other formats end up in the repo. In that case we'd be back in the same situation.

It's hard to say whether non-nix files such as JSON would ever be formatted in the future. Currently the view seems to be it's better to not format data files, since we can usually generate them in a bespoke way that minimizes diff-size and merge conflicts.

Quoting @emilazy:

yes, I wouldn't want us to enforce formatting of JSON

If you're able to join the nix formatting matrix room, the discussion started here, and mostly stayed on topic.

doing this here just blows up the repository by 40k lines

Is there a reason we'd care about the number of lines? What is it in absolute/% bytes? What would it be with JSON?

Personally I don't see this as an issue. Potential for merge conflict is far more important than line count. Absolute filesize should be similar with the current formatting vs with nixfmt, I'd expect it to be slightly higher.

When mentioned in the matrix discussion, the formatting team also didn't seem concerned about line count.


I see three potential paths forward:

  • Make the case again to the formatting team to add an "unformatted file" header comment directive
  • Format the generated files
  • Generate JSON/TOML files instead

As codeowner, your opinion probably matters most in deciding which path to take.

@corngood
Copy link
Contributor

corngood commented Aug 7, 2024

Make the case again to the formatting team to add an "unformatted file" header comment directive

This might be the issue to follow: NixOS/nixfmt#91

Format the generated files

This is my preference for now. When diffing the existing deps files, I almost always use --word-diff. Having them formatted should work better for diffs. It seems to be quite common in nixpkgs to generate or modify a file and then run nixfmt on it.

Generate JSON/TOML files instead

I have a couple of changes in mind that make me wary about doing this right now:

  • adding support for mirrors
  • adding platform (rid) metadata for filtering

There's also pkgs/development/compilers/dotnet/combine-deps.nix, which rewrites deps files, and would need to be converted.

@SuperSandro2000
Copy link
Member

Is there a reason we'd care about the number of lines? What is it in absolute/% bytes? What would it be with JSON?

Well, we increase the file size by 9% for nothing.

➜ ls -lah pkgs/applications/audio/galaxy-buds-client/deps.nix
.rw-r--r-- sandro users 33 KB Thu Jul 18 21:51:45 2024  pkgs/applications/audio/galaxy-buds-client/deps.nix

➜ nixfmt pkgs/applications/audio/galaxy-buds-client/deps.nix

➜ ls -lah pkgs/applications/audio/galaxy-buds-client/deps.nix
.rw-r--r-- sandro users 36 KB Sun Aug 18 18:14:45 2024  pkgs/applications/audio/galaxy-buds-client/deps.nix

Also #327064

I don't think there's a strong objection to generating nix, however it was noted that importing nix can be slower than importing JSON/TOML and that it is almost always unnecessary to generate nix, since a data-format such as JSON should be able to represent anything generated automatically.

That could be done and would probably be fine. I think if we do that, we should do it in one treewide commit to avoid having to create any compatibility code.

Potential for merge conflict is far more important than line count.

After any merge conflict the lock files need to be regenerated because it is not guaranteed that merging them by hand will yield the result you need and can only be proofed when building the package.

They should be reviewable though, so avoiding having all on one line is still important.

This is my preference for now. When diffing the existing deps files, I almost always use --word-diff. Having them formatted should work better for diffs. It seems to be quite common in nixpkgs to generate or modify a file and then run nixfmt on it.

We have a big problem with nixpkgs getting to big and making nix build .#hello quite slow. Every change to how lock files work should take this into account and try to achieve small lock files with only the required information.

@TomaSajt TomaSajt marked this pull request as draft August 18, 2024 20:17
@TomaSajt
Copy link
Contributor Author

TomaSajt commented Aug 18, 2024

Having seen the recent discussions of lockfile generation, I do think that this PR should not get merged. I'll leave it as draft until there's a better solution.

@Atemu
Copy link
Member

Atemu commented Nov 30, 2024

Well, we increase the file size by 9% for nothing.

➜ ls -lah pkgs/applications/audio/galaxy-buds-client/deps.nix
.rw-r--r-- sandro users 33 KB Thu Jul 18 21:51:45 2024  pkgs/applications/audio/galaxy-buds-client/deps.nix

➜ nixfmt pkgs/applications/audio/galaxy-buds-client/deps.nix

➜ ls -lah pkgs/applications/audio/galaxy-buds-client/deps.nix
.rw-r--r-- sandro users 36 KB Sun Aug 18 18:14:45 2024  pkgs/applications/audio/galaxy-buds-client/deps.nix

I was a bit confused why the file size would change at all but, now that I think about it, that might be due to the added newlines.

Your argument however is not sound however because anyone who cares even a little about file sizes will use compression where this increase in newline chars does not matter:

$ gzip -9 < pkgs/applications/audio/galaxy-buds-client/deps.nix | wc -c
10861
$ nixfmt pkgs/applications/audio/galaxy-buds-client/deps.nix
$ gzip -9 < pkgs/applications/audio/galaxy-buds-client/deps.nix | wc -c
10874

That's a whopping 13 bytes difference.

With btrfs' transparent filesystem compression at compress-force=zstd:1, there is no difference at all:

$ sudo compsize --bytes pkgs/applications/audio/galaxy-buds-client/deps.nix
Processed 1 file, 1 regular extents (1 refs), 0 inline.
Type       Perc     Disk Usage   Uncompressed Referenced  
TOTAL       33%     12288        36864        36864       
zstd        33%     12288        36864        36864       
$ nixfmt pkgs/applications/audio/galaxy-buds-client/deps.nix
$ sudo compsize --bytes pkgs/applications/audio/galaxy-buds-client/deps.nix
Processed 1 file, 1 regular extents (1 refs), 0 inline.
Type       Perc     Disk Usage   Uncompressed Referenced  
TOTAL       30%     12288        40960        40960       
zstd        30%     12288        40960        40960       

Git uses gzip on everything it stores and transmits AFAIK.

The only place where this would have an effect that is even measurable would be nixpkgs checkouts in filesystems that do not support transparent compression. Given that such users don't use filesystem compression that IME usually saves you 30-50% of total disk usage in a typical desktop system, I highly doubt they care about something on the order of a 9% increase in a subset of their nixpkgs checkout that amounts to maybe a dozen MiB total at worst.

We should definitely prioritise using JSON or TOML over Nix but we don't need to worry or care about file size increases due to formatting either.

OTOH I also question whether we need auto-generated Nix files to follow formatting rules at all since no human is supposed to read or modify them in the first place.

@MattSturgeon
Copy link
Contributor

OTOH I also question whether we need auto-generated Nix files to follow formatting rules at all since no human is supposed to read or modify them in the first place.

It's not that generated files need to be formatted, rather the issue is that allowing them to not be formatted is additional unnecessary complexity.

It's much simpler to say "all nix files must be formatted" rather than "all nix files must be formatted, unless..."

@GGG-KILLER
Copy link
Contributor

Please continue further discussion on how to move forward on #358025

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants