Restore improvement: unpin agent CPU #4073

Merged
merged 9 commits into from
Oct 22, 2024

Michal-Leszczynski
Collaborator

This PR adds a new --unpin-agent-cpu flag to the restore task (default false).
It allows unpinning the agent from CPUs during restore (and re-pinning it afterwards).

Moreover, for finer granularity, this PR also extends the cpu field of the
scylla-manager-agent.yaml config, so that the user can specify a list of CPUs
instead of just a single one. This allows for greater, but manual, control
over agent CPU pinning.
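To illustrate the "single int or list of ints" behavior of the cpu field, here is a minimal sketch in Go. It assumes the YAML decoder hands over the raw field value as `any`; the helper name `parseCPUs` is hypothetical and is not taken from this PR.

```go
package main

import (
	"fmt"
)

// parseCPUs normalizes a decoded `cpu` config value into a list of CPU indices.
// It accepts a single int (the original form) or a list of ints (the new form).
// Hypothetical helper: the actual agent config code may be structured differently.
func parseCPUs(v any) ([]int, error) {
	switch x := v.(type) {
	case nil:
		// Field not set: return nothing so the caller keeps its default behavior.
		return nil, nil
	case int:
		return []int{x}, nil
	case []int:
		return x, nil
	case []any:
		cpus := make([]int, 0, len(x))
		for i, e := range x {
			n, ok := e.(int)
			if !ok {
				return nil, fmt.Errorf("cpu[%d]: expected int, got %T", i, e)
			}
			cpus = append(cpus, n)
		}
		return cpus, nil
	default:
		return nil, fmt.Errorf("cpu: expected int or list of ints, got %T", v)
	}
}

func main() {
	single, _ := parseCPUs(3)
	list, _ := parseCPUs([]any{0, 1, 2})
	fmt.Println(single, list)
}
```

Normalizing both forms into one slice means the rest of the pinning code only has to deal with a list, and a config with a single value keeps working unchanged.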

Fixes #3951

This commit makes it possible to specify multiple
CPUs in the 'scylla-manager-agent.yaml' config to which
the agent will be pinned. The 'cpu' field in the config
now accepts either a single int or an array of ints.
The default behavior remains the same.

Besides the extraction, this commit also raises the log
level (from DEBUG to INFO) of the message about a missing cpuset file.
It also makes this function return an error, which will be
useful when the function is used inside
the CPU pinning agent endpoints.
@Michal-Leszczynski
Collaborator Author

@karol-kokoszka This PR is ready for review!
The tests already passed (except for the IPv6 ones, because I forgot to modify their test config; they should pass now as well).

Collaborator

@karol-kokoszka karol-kokoszka left a comment


👍 Thanks.
Please just use request.Context() in API call.

pkg/config/agent/agent.go (resolved)
This commit implements 3 CPU pinning endpoints
that allow querying (GET), pinning (POST) and
unpinning (DELETE) agent CPUs.
This commit adds the UnpinAgentCPU field to Target.
It optionally allows unpinning the agent from CPUs
for the duration of the restore.

Fixes #3951
This is done purely for safety.
It prevents rare edge cases such as:
- restore runs with --allow-compaction=false
- restore is paused
- restore is modified with --allow-compaction=true
- restore is resumed
This commit allows the user to control whether
the agent should be pinned to CPUs during restore.
…th cpu pinning

This way this test also checks cpu pinning before and after backup.
It also checks cpu pinning before, in the middle, when paused,
when resumed, and after restore.
@Michal-Leszczynski Michal-Leszczynski merged commit c720770 into master Oct 22, 2024
52 checks passed
@Michal-Leszczynski Michal-Leszczynski deleted the ml/restore-cpu branch October 22, 2024 08:24
Development

Successfully merging this pull request may close these issues.

Allow for more pinned CPUs in the context of full speed restore
2 participants