-
Notifications
You must be signed in to change notification settings - Fork 818
Es/taskify reexecution #4435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Es/taskify reexecution #4435
Conversation
- Remove custom action in favor of unified (self-hosted and native) workflow - Remove multiple configs - Encode configs in Taskfile
78a5964 to
f28890f
Compare
…1-250k` task on `blacksmith-4vcpu-ubuntu-2404` runner
…w Grafana Dashboard Link` because it applies non existing label
… self-hosted runs
… self-hosted runs
… self-hosted runs
… self-hosted runs
… self-hosted runs
…x path resolution
…x path resolution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR consolidates three separate C-Chain re-execution benchmark workflows into a single, self-contained workflow to support polyrepo migration. The changes eliminate custom GitHub Actions dependencies by inlining all logic directly into the workflow, making it portable across repositories without requiring custom action infrastructure.
Key Changes:
- Consolidated three workflows (native, container, and benchmark-specific) into one unified workflow
- Created five new Taskfile tasks for standardized benchmark execution
- Replaced custom GitHub Actions with inline workflow steps
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
.github/workflows/c-chain-reexecution-benchmark.yml |
New unified workflow supporting both native and self-hosted runners with task-based configuration |
.github/workflows/c-chain-reexecution-benchmark-config.json |
Configuration file defining benchmark tasks, runners, and timeouts for different trigger events |
Taskfile.yml |
Added five benchmark task definitions and AWS region configuration for directory export |
.github/workflows/c-chain-reexecution-benchmark-gh-native.yml |
Removed in favor of unified workflow |
.github/workflows/c-chain-reexecution-benchmark-gh-native.json |
Removed in favor of unified configuration |
.github/workflows/c-chain-reexecution-benchmark-container.yml |
Removed in favor of unified workflow |
.github/workflows/c-chain-reexecution-benchmark-container.json |
Removed in favor of unified configuration |
.github/actions/c-chain-reexecution-benchmark/action.yml |
Removed custom action, logic inlined into workflow |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While a transition from the previous json-defined workflows might be desirable, is it required to enable running reexecute tests as part of firewood CI?
If not, maybe focus this PR only on ensuring tasks for each reexecute configuration and a reexecute custom action that is reusable across repos?
…taskify-reexecution # Conflicts: # .github/actions/c-chain-reexecution-benchmark/action.yml # .github/workflows/c-chain-reexecution-benchmark.yml # Taskfile.yml
…tput.txt` remove `inputs.workflow` consutriction
It's not required, but I think it composes together nicely and serves as verification that everything works. Here's my reasoning:
The workflow refactoring adds:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Taskfile tasks for each config (done)
- Reusable custom action (main deliverable)
The workflow refactoring adds:
- Immediate verification the custom action works
- Eliminates ~250 lines of duplication
I was not arguing against the potential benefit, but I'm still not convinced that this PR is the appropriate place to be making such invasive changes. The shift to a reusable action should have negligible impact on the workflows, and the shift to using tasks is also unlikely to require complicated changes to the workflows. Limiting the focus of this PR to those changes would make for a much easier PR.
Taking the initiative to reduce duplication is laudable, but I don't think such changes should have to come at the cost of making separable changes harder to review.
| @@ -0,0 +1,19 @@ | |||
| #!/usr/bin/env bash | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe enforce that this file must be an exact copy of the run-monitored-tmpnet-cmd version to minimize the potential for unintended drift?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added sym link.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A symlink won't work when the action is used from other repos.
Edit: You'll need to confirm that a symlink will work when the action is used by other repos.
Taskfile.yml
Outdated
| cmds: | ||
| - task: reexecute-cchain-range-with-copied-data | ||
| vars: | ||
| TASK_NAME: c-chain-reexecution-hashdb-101-250k |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use {{.TASK}} as the value of TASK_NAME for this and all other modified tasks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the requirement? Adding {{ .TASK }} as TASK_NAME results in parent name vs name of the executed task.
reexecute-cchain-range-with-copied-data-20251029-130410
Taskfile.yml
Outdated
| TASK_NAME: c-chain-reexecution-hashdb-101-250k | ||
| START_BLOCK: 101 | ||
| END_BLOCK: 250000 | ||
| BLOCK_DIR_SRC: s3://avalanchego-bootstrap-testing/cchain-mainnet-blocks-1m-ldb/** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(No action required) Given that the protocol, bucket and /** file glob are all common, maybe only require that the object path (i.e. cchain-mainnet-blocks-1m-ldb) be specified in this and the other tasks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be part of the script and the script is made to be a bit more flexible.
You're right, and I apologize for overloading this PR. I should have been more thoughtful about separating concerns. Would it help if I split this into two PRs one for the task definitions + their immediate usage, and another for the reusable action refactor? I want to make this easier to review while still getting the validation benefits of CI. What would work best for you? |
I would like to see this PR focus only on necessary changes - a custom action that doesn't rely on relative imports, workflows that externalize reexecute configuration to tasks. Anything more than that can be deferred to one or more subsequent PRs. |
Why this should be merged
The polyrepo migration requires benchmark workflows to run in any repository (Firewood, Coreth, etc.) without depending on AvalancheGo infrastructure.
Current workflows use custom actions that reference themselves:
This breaks in other repos because they don't have AvalancheGo's custom actions directory. Copying actions to each repo creates duplication and maintenance burden.
This PR inlines all custom action logic directly into the workflow:
The result is a self-contained trivially locally reproducable workflow with zero custom action dependencies. It can be invoked to any repository and work immediately. Execution simplifies to:
./scripts/run_task.sh <task-name>with config from JSON.How this works
The workflow reads task configurations from
c-chain-reexecution-benchmark-config.jsonbased on the trigger event (pull_request, schedule, workflow_dispatch). It splits configurations into separate matrices for native GitHub runners and self-hosted ARC container runners to avoid compatibility issues.On pull requests, it runs fast benchmarks (101-250k blocks) for validation. On schedule, it runs long benchmarks (33m-33m500k blocks) on performance-optimized runners. Manual dispatch accepts inputs to override the config for ad-hoc testing.
To reproduce locally just execute a task defined in Taskfile.
How this was tested
CI
Need to be documented in RELEASES.md?
No - internal CI/CD infrastructure.