Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scylla-ansible-roles: Adds example playbook "kernel_version_enforcer" #402

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ebenzecri
Copy link
Contributor

"kernel_version_enforcer" playbook allow the user to:

  • Pin a specific kernel version (and ensure it will be picked in the next reboot) if required
  • Upgrade kernel version to the latest available
  • Purge all old kernel versions
  • Upgrade all upgradable packages

"kernel_version_enforcer" playbook allow the user to:
- Pin a specific kernel version (and ensure it will be picked in the next reboot) if required
- Upgrade kernel version to the latest available
- Purge all old kernel versions
- Upgrade all upgradable packages

Signed-off-by: Eduardo Benzecri <[email protected]>
@ebenzecri ebenzecri force-pushed the debian_ubuntu_image_kernel_pin branch from 24c063d to effc316 Compare August 15, 2024 18:23
Copy link
Collaborator

@vladzcloudius vladzcloudius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eduardo, as we discussed, it would be better to have this functionality integrated into the Node Role.

ansible.builtin.set_fact:
detected_image_version="{{ uname_pre_output.stdout_lines | first }}"

- name: Define if the kernel image should be installed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This task checks if the image_version is the one that is currently used OR if upgrade_latest_kernel is true.

The above is definitely not what the "name" claims because the package with the image_version may be installed but this Linux kernel may not be used.

kernel_related_packages:
- linux-gcp
- linux-image-gcp
- linux-headers-gcp
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about linux-image-generic, linux-image-aws and corresponding headers packages?

become: true
when:
- kernel_image_required
- not upgrade_latest_kernel
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you use upgrade_latest_kernel calculating kernel_image_required value if you are using condition like above?


- name: Define if the kernel image should be installed
ansible.builtin.set_fact:
kernel_image_required="{{ image_version is version(detected_image_version, 'ne') or upgrade_latest_kernel }}"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you have a "or upgrade_latest_kernel" part in the kernel_image_required value expression?

autoremove: true
force_apt_get: true
become: true
when: upgrade_all_packages
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you add this to this playbook which allegedly takes care of Linux kernel (only)?

selection: hold
loop: "{{ kernel_related_packages }}"
become: true
when: pin_kernel_version
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dpkg hold and pinning are not the same. You should align names of parameters (pin_kernel_version) to correspond.

selection: install
loop: "{{ kernel_related_packages }}"
become: true
when: reconfiguration_required
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to unhold kernel related packages?

ansible.builtin.include_tasks: stop_reboot_start.yml
when: reconfiguration_required

- name: Set final kernel image version if '{{ image_version }}' was installed
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why all tasks below this one are required to begin with.


- name: Enforce kernel version '{{ final_image_version }}' usage
ansible.builtin.include_tasks: kernel_enforce_cleanup.yml
when: reconfiguration_required
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using apt autoremove?

- name: Mask scylla-server service
ansible.builtin.systemd:
name: scylla-server
masked: true
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"masked"? Why?

ansible.builtin.set_fact:
scylla_installation="{{ true if ansible_facts.services['scylla-server.service'] is defined else false }}"

- name: Stop Scylla
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should stop duplicating "Scylla Rolling Restart" versions and have one that is used everywhere.
The best version of Scylla shutdown can be found in https://github.com/scylladb/scylla-ansible-roles/blob/master/example-playbooks/rolling_ops/rolling_custom_build_install.yml.

We should add the missing "rescue" block to https://github.com/scylladb/scylla-ansible-roles/blob/master/example-playbooks/rolling_ops/rolling_restart.yml and then it would be good to be used everywhere.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stop and start functionality, is indeed preferably implemented only once (DRY principle).
Perhaps as a task in ansible-scylla-node role, and perhaps it even makes sense to group such tasks in a folder. There is already such a task present start_one_node.yml.

@@ -0,0 +1,30 @@
---
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Procedure below doesn't seem to be standard.
Here is an example of a procedure I'm aware of: https://askubuntu.com/questions/100232/how-do-i-change-the-grub-boot-order.

In gist you need to change GRUB_DEFAULT value in /etc/default/grub and then run sudo update-grub.

ansible.builtin.set_fact:
scylla_installation="{{ true if ansible_facts.services['scylla-server.service'] is defined else false }}"

- name: Stop Scylla
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This stop and start functionality, is indeed preferably implemented only once (DRY principle).
Perhaps as a task in ansible-scylla-node role, and perhaps it even makes sense to group such tasks in a folder. There is already such a task present start_one_node.yml.

@@ -0,0 +1,118 @@
---
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this functionality should be inside a playbook. We will want to re-use this functionality when setting up a ScyllaDB node.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants