[RFC] overlay.d/05core: Writeable root fs for Live ISOs booted from RAM #2645

Open
wants to merge 1 commit into
base: testing-devel

@JM1 JM1 commented Oct 3, 2023

Previously, karg coreos.liveiso.fromram would cause live-generator to copy rootfs.img to a tmpfs and then mount it to /sysroot. Because rootfs.img contains a squashfs, /sysroot was mounted read-only, preventing rpm-ostree operations such as install and rebase, which are required by OKD/FCOS.

Now, with karg coreos.liveiso.fromram (Live ISO) or coreos.live.fromram (PXE boot), the rootfs.img will be mounted to /isoroot. The contents of /isoroot will be copied to /run/ephemeral and the latter will be bind-mounted to /sysroot. Because /run/ephemeral is a writeable xfs, neither sysroot-etc.mount nor sysroot-var.mount is required in this case.
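As a rough illustration of that sequence (not the actual live-generator change, which wires this up through generated systemd units), the three steps could be sketched by hand as below. `ROOTFS_IMG`, the `run` dry-run wrapper and the `fromram_writable_root` function are hypothetical names introduced for this sketch, and it assumes the image can be loop-mounted directly and that /run/ephemeral already exists as a writable filesystem.

```shell
#!/bin/bash
# Sketch only: hypothetical manual equivalent of the mount/copy/bind-mount flow.
# ROOTFS_IMG is a placeholder path, not the real location in the initramfs.
ROOTFS_IMG="${ROOTFS_IMG:-/path/to/rootfs.img}"
DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

fromram_writable_root() {
    # 1. Mount the read-only image at /isoroot.
    run mkdir -p /isoroot
    run mount -o ro,loop "$ROOTFS_IMG" /isoroot
    # 2. Copy its contents into the writable filesystem at /run/ephemeral.
    run cp -a /isoroot/. /run/ephemeral/
    # 3. Bind-mount the writable copy onto /sysroot.
    run mount --bind /run/ephemeral /sysroot
}

fromram_writable_root
```

With the default `DRY_RUN=1` this only prints the commands; setting `DRY_RUN=0` would execute them and needs root.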

For example, to rebase an FCOS/OKD bootimage, first boot a Fedora 39 Live ISO from RAM, then rebase and soft-reboot it (soft-reboot requires systemd v254) with:

    rpm-ostree rebase fedora:fedora/x86_64/coreos/next
    rpm-ostree apply-live --allow-replacement
    systemctl soft-reboot

Wdyt?

Previously, karg coreos.liveiso.fromram would cause live-generator to
copy rootfs.img to a tmpfs and then mount it to /sysroot. Because
rootfs.img contains a squashfs, /sysroot was mounted read-only,
preventing rpm-ostree operations such as install and rebase, which are
required by OKD/FCOS [0].

Now, with karg coreos.liveiso.fromram (Live ISO) or
coreos.live.fromram (PXE boot), the rootfs.img will be mounted to
/isoroot. The contents of /isoroot will be copied to /run/ephemeral
and the latter will be bind-mounted to /sysroot. Because
/run/ephemeral is a writeable xfs, neither sysroot-etc.mount nor
sysroot-var.mount is required in this case.

For example, to rebase an FCOS/OKD bootimage, first boot a Fedora 39
Live ISO from RAM, then rebase and soft-reboot [1] it (soft-reboot
requires systemd v254) with:

    rpm-ostree rebase fedora:fedora/x86_64/coreos/next
    rpm-ostree apply-live --allow-replacement
    systemctl soft-reboot

[0] coreos/rpm-ostree#4547
[1] https://www.freedesktop.org/software/systemd/man/systemd-soft-reboot.service.html
@JM1
Author

JM1 commented Oct 3, 2023

Butane example which shows how a rebase of an OKD/FCOS bootimage from FCOS 39 to OKD Machine OS could be implemented:

variant: fcos
version: 1.4.0

storage:
  files:
    - path: /etc/systemd/system/demo.service
      mode: 0644
      contents:
        inline: |
          [Unit]
          Requires=ostree-prepare-root.service
          After=ostree-prepare-root.service
          ConditionPathExists=!/etc/.demo

          [Service]
          Type=oneshot
          ExecStart=/bin/sh -c 'date >> /etc/.demo'
          RemainAfterExit=yes

          [Install]
          WantedBy=multi-user.target

    - path: /usr/local/bin/rebase.sh
      mode: 0755
      contents:
        inline: |
          #!/bin/bash
          set -eux
          systemctl daemon-reload
          systemctl enable demo.service
          #rpm-ostree rebase fedora:fedora/x86_64/coreos/stable # Fedora 38
          rpm-ostree rebase fedora:fedora/x86_64/coreos/next # Fedora 39
          rpm-ostree apply-live --allow-replacement
          date >> /etc/.rebased
          systemctl soft-reboot # since systemd 254 / Fedora 39

systemd:
  units:
    - name: rebase.service
      enabled: true
      contents: |
        [Unit]
        Wants=network-online.target
        Requires=ostree-prepare-root.service
        After=ostree-prepare-root.service network-online.target
        ConditionPathExists=!/etc/.rebased
        [email protected] bootkube.service kubelet.service

        [Service]
        Type=oneshot
        ExecStart=/usr/local/bin/rebase.sh
        RemainAfterExit=yes

        [Install]
        WantedBy=multi-user.target

Launch with:

cosa run --qemu-iso builds/latest/x86_64/fedora-coreos-39.20231003.dev.0-live.x86_64.iso -m 16384 -c --qemu-firmware uefi --kargs coreos.liveiso.fromram --kargs rd.shell=1 --butane /srv/src/config/test.bu

@jlebon
Member

jlebon commented Oct 4, 2023

Thanks for hacking on this stuff! It's a neat POC.

> For example, to rebase a FCOS/OKD bootimage first boot a Live ISO with Fedora 39 from RAM and then rebase and soft-reboot (requires systemd v254) it with:
>
>     rpm-ostree rebase fedora:fedora/x86_64/coreos/next
>     rpm-ostree apply-live --allow-replacement
>     systemctl soft-reboot
>
> Wdyt?

Hmm, though systemctl soft-reboot doesn't update the kernel. That puts the node in a possibly untested/broken kernel+userspace combination. I think if we were to support live updates, we should scope in kernel updates too (via kexec), which would likely mean a completely different approach.

Looking at coreos/rpm-ostree#4547, I think indeed it'd be cleaner to focus on getting rpm-ostree install foobar --apply-live to work in the live environment for your use case (which as mentioned is also useful for non-live cases).

@JM1
Author

JM1 commented Oct 5, 2023

Thanks for your feedback ☺️

The use case I am trying to tackle here is Agent-based Installer (ABI) and SNO Installer (the non-Agent-based code path in OpenShift Installer) for OKD/FCOS. Both installers launch a bootimage with an Ignition config, which then provisions a cluster. OCP uses RHCOS as its bootimage and thus both installers happily use tools such as oc and crio.service which are part of RHCOS.

But OKD/FCOS uses plain FCOS as its bootimage which is missing those tools. OKD/FCOS has to change its bootimage contents from plain FCOS to OKD Machine OS before running the cluster installation services for ABI and SNO.

OKD Machine OS is FCOS plus kubelet, crio, oc and some config, but still the same kernel, so systemctl soft-reboot should be safe. If we had to use rpm-ostree install ..., we would have to maintain the same config (list of extra packages) in two different places: first in OKD Machine OS and second in OpenShift Installer. rpm-ostree rebase avoids this duplication. (Atm we are using a poor man's version of rpm-ostree rebase to pull in the necessary tools.)

Not sure how we could do kernel updates with a Live ISO. kexec assumes that userspace is shut down and all filesystems are unmounted. We would need a Live ISO for OKD Machine OS to boot into after calling kexec. But then we could have used it in the first place (OKD Machine OS as bootimage instead of plain FCOS).

@cgwalters
Member

cgwalters commented Oct 5, 2023 via email

@JM1
Author

JM1 commented Oct 24, 2023

@cgwalters Could you give us some advice/direction on how to continue with this, please?

@cgwalters
Member

> Hmm, though systemctl soft-reboot doesn't update the kernel. That puts the node in a possibly untested/broken kernel+userspace combination. I think if we were to support live updates, we should scope in kernel updates too (via kexec), which would likely mean a completely different approach.

Yes, actually in general w/ostree we'd need to be careful because we only ship one /usr/lib/modules in the userspace tree, so a soft reboot to a new commit/image that has a new kernel would drop out the modules.

Nothing logically stops us from a workflow that would union the kernel modules (much like how yum/dnf do it) but it gets messy.
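To make the concern concrete, a guard along the following lines could run before a soft reboot and refuse to proceed when the target tree does not ship modules for the running kernel. This is only a sketch: `has_modules_for_kernel` is a hypothetical helper, not part of ostree or rpm-ostree, and it checks nothing beyond the existence of the modules directory.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the tree rooted at $1 ships kernel
# modules for release $2 (default: the running kernel from `uname -r`).
has_modules_for_kernel() {
    local root="$1"
    local release="${2:-$(uname -r)}"
    [ -d "${root%/}/usr/lib/modules/${release}" ]
}

# Example: check the currently booted root.
if has_modules_for_kernel /; then
    echo "modules present for $(uname -r)"
else
    echo "no modules for $(uname -r); a soft reboot would lose kernel modules" >&2
fi
```

Pointing the first argument at the checkout of the new commit instead of `/` would test the tree that the soft reboot is about to switch to.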

@cgwalters
Member

> @cgwalters Could you give us some advice/direction on how to continue with this, please?

The simplest thing to do today is probably rpm-ostree usroverlay + rpm -Uvh. (Or alternatively, fetch a container image with the expected overlay and scrape its content out)
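A minimal sketch of the first option, assuming the needed RPMs have already been downloaded into the live environment: `rpm-ostree usroverlay` puts a transient writable overlay on /usr, after which plain `rpm -Uvh` can install packages into it. The `run` wrapper, the `overlay_install` function and the package file name are hypothetical and exist only for this example.

```shell
#!/bin/bash
# Sketch only: transient /usr overlay followed by a plain rpm install.
DRY_RUN="${DRY_RUN:-1}"   # default: print commands instead of running them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

overlay_install() {
    if [ "$#" -eq 0 ]; then
        echo "usage: overlay_install PACKAGE.rpm..." >&2
        return 2
    fi
    # Make /usr writable until the next reboot.
    run rpm-ostree usroverlay
    # Install or upgrade the given local RPM files into the overlay.
    run rpm -Uvh "$@"
}

# Placeholder file name; substitute the real package files.
overlay_install ./example-package.rpm
```

Changes made this way live only in the overlay and are not tracked by rpm-ostree.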
