flowey: new xflowey vmm-tests command #1405

Open · tjones60 wants to merge 11 commits into main from xflowey_vmmtests

Conversation

tjones60
Contributor

A new xflowey command that builds everything necessary to run the VMM tests.

@tjones60 requested review from a team as code owners on May 22, 2025
@justus-camp-microsoft
Contributor

Oh wow, this is exciting - I've been wanting to look into implementing this.

@tjones60
Contributor Author

For example, to build everything for ARM64 and copy it to a Windows directory:

cargo xflowey vmm-tests --target windows-aarch64 --dir /mnt/e/vmm_tests_aarch64 --build-only --copy-extras --install-missing-deps --unstable-whp

Then run on the destination machine:

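# Configure logging/backtraces and point the tests at the copied content; WSLENV forwards these variables into WSL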
$env:OPENVMM_LOG='debug,mesh_node=info'
$env:RUST_BACKTRACE='1'
$env:RUST_LOG='trace,mesh_node=info'
$env:TEST_OUTPUT_PATH='C:\vmm_tests\test_results'
$env:VMM_TESTS_CONTENT_DIR='C:\vmm_tests'
$env:VMM_TEST_IMAGES='C:\vmm_tests\images'
$env:WSLENV='WT_SESSION:WT_PROFILE_ID::OPENVMM_LOG:RUST_BACKTRACE:RUST_LOG:TEST_OUTPUT_PATH:VMM_TESTS_CONTENT_DIR:VMM_TEST_IMAGES'
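# Run the archived VMM tests out of the content directory with cargo-nextest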
.\cargo-nextest.exe nextest run `
	--profile default `
	--config-file 'C:\vmm_tests\nextest.toml' `
	--workspace-remap 'C:\vmm_tests\' `
	--archive-file 'C:\vmm_tests\vmm-tests-archive.tar.zst' `
	--filter-expr 'all()'

(I will document this and maybe include the command as a script file in the output dir)

@daniel5151
Contributor

daniel5151 commented May 27, 2025

Out of curiosity - is the idea of having a flowey-backed petri_artifact_resolver still something that's on folks' radar?

This approach, where flowey drives nextest execution, has the unfortunate property of having to write and maintain an entirely new in-house CLI for executing tests (i.e: the new xflowey vmm-tests CLI), with all the overhead of maintaining the end-user docs, as well as ensuring the semantic mapping between the CLI and the underlying nextest filters / artifact dependencies is kept up-to-date as more tests are written*. Also, note that I've actually gone down this route before (back when I had that cargo xjob experiment that never went anywhere), so this is stuff that I've already run into / thought about previously.

The inverse approach, where users invoke cargo nextest run directly, and the test runner "shells-out" to flowey in order to JIT resolve required test artifacts, should be a lot easier to scale and maintain long-term, both from a user and maintainer perspective. Maintainers don't need to write a bespoke CLI (and corresponding docs), nor do they need to explicitly maintain any semantic maps between their bespoke CLI and the underlying tests / test-filters. Users can leverage existing open-source docs on using nextest / composing nextest filters, and sidestep needing to learn yet another project-specific tool.

There's also the "middle ground" approach, where a petri_artifact_resolver generates / reads from a test_artifact_manifest.json/toml file on a particular cargo nextest run invocation - a k/v list mapping artifacts to backing files (e.g: with the schema "key": "path/to/artifact" | "$unresolved") - and a cargo xflowey resolve-test-artifacts tool is then used to resolve the required artifacts from the test_artifact_manifest file. This approach has the added benefit of making it really easy to slot in your own "private" artifacts (i.e: just swap in the path to your custom artifact - no "magic path" nonsense required).
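e.g., a manifest following that schema might look something like this (the artifact keys and paths here are purely illustrative, not actual artifact names):

{
  "openvmm": "target/x86_64-pc-windows-msvc/release/openvmm.exe",
  "pipette_windows": "$unresolved",
  "vmm_test_images": "$unresolved"
}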

Anyways, just some food for thought. If this is the easiest on-ramp to enable simple local test runs, then obviously #shipit


*then again, you're already doing that in the CI case anyways, so as long as you've got good code-sharing between those two paths, maybe it's not a big deal.

@jstarks
Member

jstarks commented May 27, 2025

Having cargo nextest run transparently build artifacts is a bit awkward, because you'd be running a bunch of tests in parallel, which would then need to coordinate. Plus, this would pollute the test log output terribly, especially if any of the build steps fail (which they often do during development).

So, I think we'd still want a wrapper command, even if we did more to automate the discovery of what needed to be built.

@daniel5151
Contributor

daniel5151 commented May 27, 2025

Sounds like the hybrid, multi-command approach of having a test_artifact_manifest.json/toml file would be a good balance then?

i.e: a user workflow would look like:

  1. attempt to run a test (via cargo nextest run [user-test-filter])
  2. initial test run will notice that the test_artifact_manifest.json/toml hasn't been generated / contains "unresolved" entries, resulting in the test run terminating early (with a helpful error message)
  3. user then runs cargo xflowey resolve-test-artifacts path/to/test_artifact_manifest to build + resolve the missing artifacts
  4. user re-runs original command, with tests successfully running now that all artifacts have been built

If a "one click" wrapper is needed, it can certainly be implemented on top of this workflow, but it really seems unfortunate to have yet another bespoke, complex in-house CLI for folks to memorize (vs. the relatively "simple" resolve-test-artifacts CLI I'm suggesting here).
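Concretely, the loop might look something like this (the filter expression and error text are illustrative, not actual output):

cargo nextest run 'test(vmm_tests)'
# run exits early: test_artifact_manifest.json still contains "$unresolved" entries
cargo xflowey resolve-test-artifacts path/to/test_artifact_manifest.json
# builds the missing artifacts and fills in their manifest entries with real paths
cargo nextest run 'test(vmm_tests)'
# all artifacts resolved - tests run normally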

@tjones60
Contributor Author

Thanks for your input, Daniel - I think some sort of JSON manifest with an automatic artifact resolver would be great, and maybe I'll work on that next. For now, though, this is a good first step toward making it easier to run the VMM tests. Either way, we'll want a wrapper command that does everything, so that improvement could slot into this design without much change to the interface.

@daniel5151
Contributor

Hey, you know me - I've got no skin in the game here, just sharing my thoughts from afar 😉

The only other food-for-thought question I'll leave is wrt the long-term maintenance overhead of this current approach.

As is, this approach seems to be taking a "procedural" approach to defining test classes + their corresponding artifact requirements, which means that as y'all onboard more tests, the wrapper will also need to be updated / tested to work correctly, right? Does that mean that subsequent PRs that introduce new tests will require manual, pre-commit validation from the author that they work correctly with xflowey vmm-tests? Is that just something that will be manually enforced?

If there's already buy-in with folks on the team that this will be part of the vmm_test authoring workflow, then sure, that seems workable. If not... it would be unfortunate if someone became the "xflowey vmm-tests cop", manually fixing up the infrastructure from time to time.


I'll also note that - by going with a manifest-based artifact resolver - you can always add something along the lines of a gh_actions_artifact_resolver for use in CI (or even locally?), which would allow CI vmm-test runs to switch over to a "declarative" artifact wiring model. i.e: instead of needing to have the VMM tests job explicitly invoke an init_vmm_test_env job - the job could instead defer that logic to a runtime artifact resolver that can dynamically download artifacts generated by previous jobs in the pipeline run.

This is, of course, another stretch goal, but the point is that if you can execute on switching vmm-tests artifacts over to a declarative model for both local and CI scenarios, you no longer need to worry (as much) about manually maintaining all these various test "classes" and "filters" and "artifact sets". Instead - you simply declare what tests you want to run... and then the resolver figures everything out for you.

@tjones60 force-pushed the xflowey_vmmtests branch from e8e50e1 to fff13c1 on May 30, 2025