Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow globbing for named constituents #1425

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Ma27
Copy link
Member

@Ma27 Ma27 commented Nov 28, 2024

Note: I'm well-aware of the ongoing effort to switch to nix-eval-jobs. However, when this was implemented, I decided to stick with the Hydra version from nixpkgs 24.05 because we wanted to roll it out internally.
I'm filing the current state to see if there's any interest in accepting these patches. If that's the case, I'm happy to port the relevant changes to nix-eval-jobs and add some more documentation on it.


This basically allows you to do constituents = [ "nixos.*" ]; to select all jobs starting with nixos. as constituents. Additionally, it fixes a bug in Hydra where transitive constituent jobs may be discarded silently if declared by string instead of by derivation.

The commit messages go pretty much into detail what happened here, so I suggest to read those.

cc @ctheune I hope it's OK I also included your patch to log memory usage per job, I figured it's also useful even though it probably needs to be ported to nix-eval-jobs.
cc @Ericson2314 @Mic92 for input.

Ma27 and others added 3 commits November 28, 2024 10:59
Using named constituents, i.e.

    contituents = [
      "nixos.channel"
      "nixos.tests.foo"
    ]

is way faster than using derivations since the process evaluating the
aggregate doesn't need to evaluate the constituents again, only the
aggregate job.

In the Flying Circus platform, we had code before that added most of the
other jobs as constituents to the aggregate[1].

When we decided to switch to named constituents, there was a question on
what to do about this: recursively evaluating attribute names may be
costly since it's not clear at which level to stop and if you recurse
one level too deep to check if you can stop, you may have a lot of
evaluation work again.

The other idea was to just use globs, i.e.

    constituents = [
      "nixos.*"
    ]

to select all jobs starting with `nixos.`.

The behavior is basically if a constituent name X doesn't exist in the
jobset, try to match it against all job names. Since this is costly,
derivations must opt in explicitly with `_hydraGlobConstituents =
true;`.

This uses libc's `fnmatch(3)` internally, so all semantics from it apply
here.

We learned that the code in here already has issues with transitivity,
i.e. if you have the jobs A, B, C where B & C are aggregates with

    { name = "B"; constituents = [ "A" ]; }

and

    { name = "C"; constituents = [ "B" ]; }

you may end up with C only depending on the B without the derivation
rewrites (i.e. with `B` only having strings instead of derivations as
constituents).

With globs it may be way easier to run into this, so one of the next
patches will fix this.

[1] https://github.com/flyingcircusio/fc-nixos/blob/b7475201f067404d972ee640ffb3fe50a1989862/release/default.nix#L311
Given the leaf jobs `packages.foo` & `packages.bar`, an aggregate job
`aggregate0` with

    _hydraGlobConstituents = true;
    constituents = [ "packages.*" ];

and an aggregate job `aggregate1` with

    constituents = [ "aggregate0" ];

then it may happen depending on the order of evaluation that `aggregate1`
depends on the old derivation of `aggregate0` (i.e. the one without
rewritten constituents) and doesn't depend on `packages.foo` and
`packages.bar` because it was rewritten before `aggregate0` was
rewritten.

Using a toposort (derived from the one in CppNix's libutil) to solve
that. Cycles are a hard error, then _all_ aggregates are failed to make
sure we don't finish one of those too early.

This also extracts the code into a few helper functions and a slightly
better data structure, otherwise this would've gotten too chaotic.
@Ericson2314
Copy link
Member

@Ma27 this sounds like a good plan to me. New features that also work / can be tested in nix-eval-jobs alone are much less costly.

If you can take over the nix-eval-jobs for hydra PR in addition to porting this feature to nix-eval-jobs, that would be even better, and I would personally appreciate getting that off my plate too. :)

Only thing to keep in mind is that I would like to do the nix-eval-jobs switchover prior to upgrading hydra to Nix 2.25, so please do the work off the 2.24 branch in the nix-eval-jobs repo.

@Ma27
Copy link
Member Author

Ma27 commented Dec 9, 2024

@Ericson2314 regarding the nix-eval-jobs migration: I just wrote a comment in the corresponding PR about the biggest blocker, the failing test. I think this is the biggest blocker you had, correct?

But as I said there, my schedule is also pretty packed, so I won't be able to push this over entirely, but at least the gcroot issue you were blcoked on should be solved as described by me.

I'll try to tackle porting these changes here to nix-eval-jobs as soon as I can.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants