Adding a label causes breaking changes #337

Open
jonathanmorley opened this issue May 12, 2023 · 2 comments


jonathanmorley commented May 12, 2023

When GitHub selects runners, it needs all labels provided by the job to match the labels on the runner.
When this construct provisions runners, it needs all labels on the provider to match the labels on the job.

This 'inversion' can result in issues when making changes to the labels.
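The inversion can be sketched as two subset checks pointing in opposite directions. This is an illustrative model only, not the construct's actual code; `githubMatch` and `orchestratorMatch` are hypothetical names:

```typescript
// GitHub's direction: a runner is eligible only if it carries EVERY label
// the job requests (job labels ⊆ runner labels).
function githubMatch(jobLabels: string[], runnerLabels: string[]): boolean {
  return jobLabels.every((label) => runnerLabels.includes(label));
}

// The construct's direction, as described in this issue: a provider handles a
// job only if EVERY provider label appears on the job (provider labels ⊆ job labels).
function orchestratorMatch(providerLabels: string[], jobLabels: string[]): boolean {
  return providerLabels.every((label) => jobLabels.includes(label));
}

// Extra labels on a runner are harmless to GitHub's check...
console.log(githubMatch(['self-hosted', 'codebuild'], ['self-hosted', 'codebuild', 'extra'])); // true
// ...but extra labels on a provider break the orchestrator's check.
console.log(orchestratorMatch(['codebuild', 'extra'], ['self-hosted', 'codebuild'])); // false
```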

Example

  1. A provider is configured with the following labels: ['codebuild']
  2. A repository exists with a job that requests the following labels: ['self-hosted', 'codebuild']
  3. So far, so good

Now, we want to start introducing ARM64 runners, so we create a new provider, and a new label-dimension.

  1. We now have two providers, configured with ['codebuild', 'x86-64'] and ['codebuild', 'arm64']
  2. The repository still exists requesting the following labels: ['self-hosted', 'codebuild']
  3. The orchestrator no longer knows how to route requests for those labels, so it falls into 'Unknown label'.
  4. No runners are provisioned, and jobs start queuing up.
  5. The repository attempts to resolve this by adding the 'x86-64' label to its job.
  6. PRs that add the label start provisioning runners correctly, but those runners get picked up by the previously queued jobs, so the PR jobs start queuing instead.
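The breakage in the steps above can be reproduced with a small model of the routing rule; `routes` is a hypothetical name for illustration, not the construct's API:

```typescript
// Hypothetical model of the routing rule: a provider handles a job only when
// every provider label appears among the job's labels.
function routes(providerLabels: string[], jobLabels: string[]): boolean {
  return providerLabels.every((label) => jobLabels.includes(label));
}

const providers = [['codebuild', 'x86-64'], ['codebuild', 'arm64']];

// The unchanged job matches neither provider: 'Unknown label', nothing provisions.
const oldJob = ['self-hosted', 'codebuild'];
console.log(providers.some((p) => routes(p, oldJob))); // false

// Adding 'x86-64' to the job makes routing work again...
const fixedJob = ['self-hosted', 'codebuild', 'x86-64'];
console.log(providers.some((p) => routes(p, fixedJob))); // true
// ...but on GitHub's side the freshly provisioned runners also satisfy the
// OLD queued jobs' labels, so those queued jobs can grab the runners first.
```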

This feels related to #335, in that resolving that issue would cause different problems for the scenario above. In theory, after adding the new label dimension, the orchestrator would receive requests for ['self-hosted', 'codebuild'] and would provision runners at random between the x86-64 and arm64 providers. That is unlikely to be the desired behaviour, but it is arguably more correct, from the perspective of a job saying "I don't care about architecture", than refusing to provision a runner at all.

Another potential resolution could be to document a procedure for safely adding new labels to an existing suite of providers. Perhaps the Composite Provider mentioned in #335 would work here too.

kichik (Member) commented May 12, 2023

The generic infrastructure solution should work here too. Don't immediately delete the old provider. Instead of replacing ['codebuild'] with ['codebuild', 'x86-64'] and ['codebuild', 'arm64'], keep all three providers running until every job has been migrated to one of the two new ones. Only then delete the old provider.
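The migration described above can be illustrated with the same hypothetical subset check (illustrative code, not the construct's API): while all three label sets coexist, both old-style and new-style jobs keep routing.

```typescript
// Hypothetical routing rule: provider labels must all appear on the job.
function routes(providerLabels: string[], jobLabels: string[]): boolean {
  return providerLabels.every((label) => jobLabels.includes(label));
}

// During migration, keep all three providers side by side.
const providers = [
  ['codebuild'],            // legacy provider, kept until all jobs migrate
  ['codebuild', 'x86-64'],  // new architecture-specific providers
  ['codebuild', 'arm64'],
];

// Old-style jobs still land on the legacy provider...
const legacyJob = ['self-hosted', 'codebuild'];
console.log(providers.some((p) => routes(p, legacyJob))); // true

// ...and migrated jobs match both the legacy and the architecture-specific
// provider; which one wins is exactly the ambiguity discussed in #335.
const armJob = ['self-hosted', 'codebuild', 'arm64'];
console.log(providers.filter((p) => routes(p, armJob)).length); // 2
```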

> This feels related to #335, in that resolving that issue would cause different issues for the scenario above. In theory, after making the change to add the new label-dimension, the orchestrator would receive requests for ['self-hosted', 'codebuild'] and would provision runners at random between x86-64 and arm64 runners. This is not likely to be desired behaviour, but is arguably more correct from the perspective of the job saying "I don't care about architecture" than refusing to provision a runner at all.

Refusing to provision a runner at all is the desired behavior. We have users with multiple runner setups in multiple accounts pointing to the same GitHub organization. See #181 (comment), #133, and #72 (comment). And as you said, random behavior is not very desired (for example #181).

Documenting how to change a label, and probably some other common processes, is a good idea. Not sure where I would put it yet.

quad (Contributor) commented May 13, 2023

This behaviour was surprising for us too.
