
Use self-hosted org runners to speed up CI #3065

Open
MabezDev opened this issue Jan 30, 2025 · 5 comments

@MabezDev (Member)

We should be able to add the following runner names to the CI workflow:

  • macos-m1-self-hosted
  • linux-x86_64-self-hosted
  • windows-x86_64-self-hosted

See the GitHub documentation.
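
A job could then target one of them via runs-on, roughly like this (untested sketch; the job name and build step are made up for illustration, only the runner label comes from the list above):

```yaml
jobs:
  build-examples:
    # Hypothetical job: only the runner label is taken from the list above.
    runs-on: macos-m1-self-hosted
    steps:
      - uses: actions/checkout@v4
      - run: cargo xtask build-examples esp32   # illustrative invocation
```
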

@MabezDev MabezDev added the CI Continuous integration/deployment label Jan 30, 2025
@bugadani bugadani self-assigned this Jan 30, 2025
@MabezDev MabezDev added this to the 1.0.0-beta.1 milestone Jan 30, 2025
@bugadani (Contributor) commented Jan 31, 2025

So, the runners (at least the macOS one) are zippy, but:

  • We only have 3 self-hosted runners, as opposed to a large number of github-hosted ones
  • Even if our runners are 2x as fast as the github runners (which is reasonable), we are testing 7 chips, so we'd be at a net loss if we just replaced the runners blindly.
  • We are running every check on every PR, which is fairly excessive.
  • At least on my PC, CPU usage is rather low when I run a command like xtask build-examples.
  • A common source of wasted time is installing and reinstalling the Xtensa toolchain (it takes around a minute on the github runners). On self-hosted runners, we would only have to do it once per release, provided that espup doesn't fumble an update.

We could probably improve CI times if we didn't run every single job on a separate runner, but instead parallelized the tasks on the same runner. We could probably easily build 2 or 3 chips on the macOS runner alone in the same time it takes to build one.

We are building each and every example for every PR. cargo check takes 3-10x less time than cargo build, so I think we can save ~4-5 minutes on each runner by not building everything. I appreciate the concern that cargo build catches linker errors as well, but we are already building a large number of HIL tests, so I don't know how valuable this is.

We are testing all test cases on every PR, which is a waste of time. If a PR doesn't produce a different binary for a test than one that has already passed, that test shouldn't be run. I don't know how we should approach the bookkeeping here, but the HIL test runtime can only be optimized so much, and we'll just keep adding more and more tests.

We should also look into running fewer checks on a PR push, and doing a more comprehensive set of checks in the merge queue. We should probably run HIL tests on PR update, with somewhat stricter checking against warnings to catch the low-hanging fruit, but we can leave linting and documentation builds to the MQ. We can also probably just replace the MSRV checks with building the HIL tests using the MSRV compilers.
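
A rough sketch of what that trigger split could look like (file names, job names and commands below are hypothetical; merge_group is the GitHub Actions event for merge queues):

```yaml
# pr.yml (sketch): fast feedback on every PR push
on:
  pull_request:
jobs:
  hil-tests:
    runs-on: linux-x86_64-self-hosted
    env:
      RUSTFLAGS: "-D warnings"   # stricter warning checking on PR updates
    steps:
      - uses: actions/checkout@v4
      - run: cargo xtask run-tests   # hypothetical invocation

---
# mq.yml (sketch): heavier checks, only in the merge queue
on:
  merge_group:
jobs:
  lint-and-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: cargo clippy --workspace -- -D warnings
      - run: cargo doc --no-deps
```
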

@MabezDev (Member, Author)

At least on my PC, CPU usage is rather low when I run a command like xtask build-examples

Maybe we can just drop in rayon here where we iterate the examples for a quick win 🤔.

Even if our runners are 2x as fast as the github runners (which is reasonable), we are testing 7 chips, so we'd be at a net loss if we just replaced the runners blindly.

Do you know if the order of specifying runs-on matters? Like, could we prioritize the self-hosted runners and then fall back to github runners? (I think we will always want this, because your analysis only covers a single PR, not 2-3 being pushed to at a time, or multiple in the queue.)

A common source of wasted time is installing and reinstalling the Xtensa toolchain (it takes around a minute on the github runners). On self-hosted runners, we would only have to do it once per release, provided that espup doesn't fumble an update.

Can we cache this somehow? I'm not sure if downloading from the github runner cache instead of the github CDN will make any difference, but it might be worth exploring.
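
In case it helps, a cache step for the toolchain folder might look roughly like this (the espup install path is a guess and would need checking):

```yaml
steps:
  - name: Cache Xtensa toolchain
    id: xtensa-cache
    uses: actions/cache@v4
    with:
      # Guessing the espup install location here; verify before relying on it.
      path: ~/.rustup/toolchains/esp
      key: xtensa-${{ runner.os }}-v1   # bump the suffix when the toolchain is updated
  - name: Install Xtensa toolchain
    if: steps.xtensa-cache.outputs.cache-hit != 'true'
    run: espup install
```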

@bugadani (Contributor) commented Jan 31, 2025

Can we cache this somehow? I'm not sure if downloading from the github runner cache instead of the github CDN will make any difference, but it might be worth exploring.

The self-hosted runners don't start from zero every time: the work folder isn't cleared, the OS isn't set up from scratch, and a lot of state is retained between runs. On GHA we could probably include the toolchain folder in the cache, but we'd run out of cache space. We have 10.02/10.00 GB currently used without that 🤣

Maybe we can just drop in rayon here where we iterate the examples for a quick win 🤔.

I'm thinking more about cargo-batch. The annoying part is always the build logs; just shoving them all into a single stream would be unreadable.

Do you know if the order of specifying runs-on matters? Like, could we prioritize the self-hosted runners and then fall back to github runners? (I think we will always want this, because your analysis only covers a single PR, not 2-3 being pushed to at a time, or multiple in the queue.)

We can specify a number of tags, and the runners that match ALL of them will run the job. I don't know if we have common tags between a GHA runner and a self-hosted one, but generally selecting a runner isn't very convenient.
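
With GitHub's default self-hosted runner labels, that selection looks roughly like this (labels from memory; a runner has to carry all of them to pick the job up):

```yaml
jobs:
  hil-tests:
    # Only a runner carrying every one of these labels will run this job.
    runs-on: [self-hosted, linux, x64]
```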

@bjoernQ (Contributor) commented Jan 31, 2025

We are building each and every example for every PR. cargo check takes 3-10x less time than cargo build, so I think we can save ~4-5 minutes on each runner by not building everything. I appreciate the concern that cargo build catches linker errors as well, but we are already building a large number of HIL tests, so I don't know how valuable this is.

I think we changed from check to build before we had the HIL tests built; I agree we should catch most linker errors via the HIL tests. Additionally, we could do the build checks in nightly, just to be safe(r).
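
If "nightly" here means a scheduled run, that could be a small extra workflow along these lines (cron time, job name and build command are placeholders):

```yaml
# nightly.yml (sketch): keep the full cargo build of the examples, but only on a schedule
on:
  schedule:
    - cron: "0 3 * * *"   # placeholder time
jobs:
  build-examples:
    runs-on: linux-x86_64-self-hosted
    steps:
      - uses: actions/checkout@v4
      - run: cargo xtask build-examples esp32   # illustrative; one job per chip in practice
```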

@bugadani (Contributor) commented Jan 31, 2025

We can specify a number of tags, and the runners that match ALL of them will run the job. I don't know if we have common tags between a GHA runner and a self-hosted one, but generally selecting a runner isn't very convenient.

Actually, a comment on https://stackoverflow.com/questions/77997951/can-i-specify-github-actions-runs-on-as-either-one-label-or-another-or-logic-in proposes a cheeky way to pick either self-hosted or gh-hosted runners: just label the self-hosted runners with ubuntu-latest. It's unclear which one github would prefer; I guess we can run an experiment to find out, although it would be somewhat confusing if a Windows machine picked up an ubuntu-latest build.

Also I just realized we have 7 VMs in addition to the 3 self-hosted machines.
