doc: ubuntu-minimal VM images do not support cloud-init as claimed #14605

Open
rptb1 opened this issue Dec 7, 2024 · 10 comments
@rptb1

rptb1 commented Dec 7, 2024

ubuntu-minimal VM images do not support cloud-init.

I do not know whether this is a documentation issue, or a bug with the image builds.

If they're meant to support cloud-init, then it's a bug in the image builds, in which case please forward this or direct me to where I can report that. Otherwise the documentation is wrong.

ubuntu-minimal:

This server provides official Ubuntu Minimal images. All images are cloud images, which means that they include both cloud-init and the lxd-agent.

Reproduction:

  1. Create a LXD profile with a simple cloud-init test, such as https://documentation.ubuntu.com/lxd/en/latest/cloud-init/#run-commands
  2. Try lxc launch ubuntu-minimal: test-container --profile default --profile test-profile. Wait a bit. Try lxc exec test-container -- ls /run and note that "cloud.init.ran" exists. It works in containers.
  3. Try lxc launch --vm ubuntu-minimal: test-container2 --profile default --profile test-profile. Wait a bit. Try lxc exec test-container2 -- ls /run and note that "cloud.init.ran" does not exist. It doesn't work in minimal VM.
  4. Try lxc launch --vm ubuntu: test-container3 --profile default --profile test-profile. Wait a bit. Try lxc exec test-container3 -- ls /run and note that "cloud.init.ran" exists. It does work in non-minimal VM.
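The steps above can be scripted. The following is a minimal sketch, assuming the `lxc` client is on `PATH` and a profile named `test-profile` already exists. The helper names (`launch_cmd`, `exec_cmd`, `reproduce`) are hypothetical, and `cloud-init status --wait` stands in for the manual "wait a bit".

```python
import subprocess

PROFILES = ("default", "test-profile")
MARKER = "/run/cloud.init.ran"


def launch_cmd(image, name, vm=False, profiles=PROFILES):
    """Build the `lxc launch` command line used in the steps above."""
    cmd = ["lxc", "launch"]
    if vm:
        cmd.append("--vm")
    cmd += [image, name]
    for profile in profiles:
        cmd += ["--profile", profile]
    return cmd


def exec_cmd(name, *args):
    """Build an `lxc exec` command line that runs `args` inside the instance."""
    return ["lxc", "exec", name, "--", *args]


def reproduce(image, name, vm=False):
    """Launch an instance, wait for cloud-init, and report whether the marker exists."""
    subprocess.run(launch_cmd(image, name, vm), check=True)
    # On VMs this can fail at first while the LXD agent is still starting;
    # this sketch does not retry.
    subprocess.run(exec_cmd(name, "cloud-init", "status", "--wait"), check=False)
    result = subprocess.run(exec_cmd(name, "ls", MARKER), check=False)
    return result.returncode == 0
```

For example, `reproduce("ubuntu-minimal:", "test-container2", vm=True)` corresponds to step 3, and `reproduce("ubuntu:", "test-container3", vm=True)` to step 4.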

An example non-working image is 526b11bb926ebe8a1d05e5f69b02d4a311b311a9a9acfe760f210ef8d45c2bc6 .


Document: reference/remote_image_servers.md

@tomponline
Member

Which image OS version did you use?

@tomponline
Member

@holmanb @blackboxsw this does appear to be a problem with the ubuntu-minimal:24.04 VM image indeed.

I've confirmed the issue doesn't affect ubuntu-minimal:22.04 and ubuntu:24.04 VM images.

Any ideas what could be going on here? Is cloud-init broken in the 24.04 minimal image somehow?

@tomponline tomponline added the External Issue is about a bug/feature in another project label Dec 9, 2024
@holmanb
Member

holmanb commented Dec 9, 2024

> Any ideas what could be going on here? Is cloud-init broken in the 24.04 minimal image somehow?

I was able to reproduce the issue with the example failing image.

Using a modified version of cloud-init's python detection code:

import logging
import os
import stat

LOG = logging.getLogger(__name__)
LXD_SOCKET_PATH = "/dev/lxd/sock"  # path reported in the warning below


def is_platform_viable() -> bool:
    """Return True when this platform appears to have an LXD socket."""
    if not os.path.exists(LXD_SOCKET_PATH):
        LOG.warning(f"{LXD_SOCKET_PATH} does not exist")
        return False
    if not stat.S_ISSOCK(os.lstat(LXD_SOCKET_PATH).st_mode):
        LOG.warning(f"{LXD_SOCKET_PATH} is not a socket: {os.lstat(LXD_SOCKET_PATH).st_mode}")
        return False
    return True

I see the issue logged:

2024-12-09 17:25:09,172 - DataSourceLXD.py[WARNING]: /dev/lxd/sock does not exist

It looks like the LXD socket doesn't exist when cloud-init's Python code is running. ds-identify, which runs as a systemd generator, correctly identifies LXD as the datasource, but the later Python code doesn't see the socket.

In the lxd-agent logs I notice that PAM is logging an error:

-- Boot 0f67701eac764777a4c03daad8595aec --
Dec 09 17:24:28 minimal-vm systemd[1]: Starting lxd-agent.service - LXD - agent...
Dec 09 17:24:29 minimal-vm lxd-agent[346]: time="2024-12-09T17:24:29Z" level=info msg=Starting
Dec 09 17:24:29 minimal-vm lxd-agent[346]: time="2024-12-09T17:24:29Z" level=info msg="Loading vsock module"
Dec 09 17:24:29 minimal-vm lxd-agent[346]: time="2024-12-09T17:24:29Z" level=info msg="Started vsock listener"
Dec 09 17:24:29 minimal-vm systemd[1]: Started lxd-agent.service - LXD - agent.
Dec 09 17:24:29 minimal-vm su[408]: (to root) root on pts/0
Dec 09 17:24:29 minimal-vm su[408]: pam_unix(su-l:session): session opened for user root(uid=0) by (uid=0)
Dec 09 17:24:29 minimal-vm su[408]: pam_systemd(su-l:session): Failed to connect to system bus: No such file or directory
Dec 09 17:24:57 minimal-vm su[408]: pam_unix(su-l:session): session closed for user root

I'm not sure if that is related.

Does the lxd agent modify or remove the socket after generators run? Possibly this fails due to a missing dependency which causes the above error?
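One way to check whether the socket merely appears late, rather than never, is to poll for it from inside the guest and record how long it takes. This is a minimal sketch, not part of cloud-init: `wait_for_socket` is a hypothetical helper, and the default path is taken from the warning above.

```python
import os
import stat
import time


def wait_for_socket(path="/dev/lxd/sock", timeout=60.0, interval=0.5):
    """Poll until `path` exists and is a socket.

    Returns the number of seconds waited, or None if the timeout expires first.
    """
    start = time.monotonic()
    while True:
        try:
            if stat.S_ISSOCK(os.lstat(path).st_mode):
                return time.monotonic() - start
        except FileNotFoundError:
            pass  # socket (or its parent directory) not created yet
        if time.monotonic() - start >= timeout:
            return None
        time.sleep(interval)
```

Running this early in boot and logging the returned delay would show whether the socket shows up only after cloud-init's Python check has already run.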

@tomponline
Member

It doesn't explain why it works in ubuntu:24.04 though; they should be the same from LXD's perspective.

@tomponline
Member

Maybe @simondeziel knows of a difference in the Ubuntu 24.04 minimal image.

@holmanb
Member

holmanb commented Dec 9, 2024

@tomponline I also noticed that the agent doesn't even appear to run on non-minimal images:

○ lxd-agent.service - LXD - agent
     Loaded: loaded (/usr/lib/systemd/system/lxd-agent.service; static)
     Active: inactive (dead)
       Docs: https://documentation.ubuntu.com/lxd/en/latest/

vs a minimal vm:

root@minimal-vm:~# systemctl status lxd-agent
● lxd-agent.service - LXD - agent
     Loaded: loaded (/usr/lib/systemd/system/lxd-agent.service; static)
     Active: active (running) since Mon 2024-12-09 18:26:55 UTC; 2min 59s ago
       Docs: https://documentation.ubuntu.com/lxd/en/latest/
    Process: 280 ExecStartPre=/lib/systemd/lxd-agent-setup (code=exited, status=0/SUCCESS)
   Main PID: 357 (lxd-agent)
      Tasks: 8 (limit: 1124)
     Memory: 37.4M (peak: 39.3M)
        CPU: 400ms
     CGroup: /system.slice/lxd-agent.service
             └─357 /run/lxd_agent/lxd-agent

Dec 09 18:26:55 minimal-vm lxd-agent[357]: time="2024-12-09T18:26:55Z" level=info msg=Starting
Dec 09 18:26:55 minimal-vm lxd-agent[357]: time="2024-12-09T18:26:55Z" level=info msg="Loading vsock module"
Dec 09 18:26:55 minimal-vm lxd-agent[357]: time="2024-12-09T18:26:55Z" level=info msg="Started vsock listener"
Dec 09 18:26:55 minimal-vm systemd[1]: Started lxd-agent.service - LXD - agent.
Dec 09 18:26:57 minimal-vm su[438]: (to root) root on pts/0
Dec 09 18:26:57 minimal-vm su[438]: pam_unix(su-l:session): session opened for user root(uid=0) by (uid=0)
Dec 09 18:26:57 minimal-vm su[438]: pam_systemd(su-l:session): Failed to connect to system bus: No such file or directory
Dec 09 18:27:18 minimal-vm su[438]: pam_unix(su-l:session): session closed for user root
Dec 09 18:29:48 minimal-vm su[608]: (to root) root on pts/0
Dec 09 18:29:48 minimal-vm su[608]: pam_unix(su-l:session): session opened for user root(uid=0) by (uid=0)

@tomponline
Member

@simondeziel as you're familiar with the lxd-agent package, please can you take a look at this? It looks like its units are not firing correctly (although I can get in fine) and are potentially starting too late for LXD.

@tomponline tomponline added this to the lxd-6.3 milestone Dec 9, 2024
@tomponline tomponline added the Bug Confirmed to be a bug label Dec 9, 2024
@simondeziel
Member

Here with LXD 5.21/stable, ubuntu-minimal: works:

root@v3:~# lxc launch --vm ubuntu-minimal: minimal-vm1 --profile default --profile new-keys
Creating minimal-vm1
Starting minimal-vm1
root@v3:~# lxc exec minimal-vm1 -- cloud-init status --wait
Error: LXD VM agent isn't currently running
root@v3:~# lxc exec minimal-vm1 -- cloud-init status --wait
Error: LXD VM agent isn't currently running
root@v3:~# lxc exec minimal-vm1 -- cloud-init status --wait
...........................
status: done
root@v3:~# lxc exec minimal-vm1 -- ls /run/cloud.init.ran
/run/cloud.init.ran

That's using the current default image (Noble, eb6632b7bffc386a3dbf81570d813f4133009dc7ea1cb640215fafc481f7d670) and with the expanded config:

root@v3:~# lxc config show -e minimal-vm1
architecture: x86_64
config:
  cloud-init.user-data: |
    #cloud-config
    runcmd:
      - [touch, /run/cloud.init.ran]
  image.architecture: amd64
  image.description: ubuntu 24.04 LTS amd64 (minimal release) (20241211)
  image.label: minimal release
  image.os: ubuntu
  image.release: noble
  image.serial: "20241211"
  image.type: disk1.img
  image.version: "24.04"
  volatile.base_image: eb6632b7bffc386a3dbf81570d813f4133009dc7ea1cb640215fafc481f7d670
  volatile.cloud-init.instance-id: 79056b7c-27b2-40da-a58f-5e62dc465257
  volatile.eth0.host_name: tap5b5a0b1c
  volatile.eth0.hwaddr: 00:16:3e:33:c0:78
  volatile.last_state.power: RUNNING
  volatile.uuid: 6bbfc369-767c-4048-84a1-3cd134adfcee
  volatile.uuid.generation: 6bbfc369-767c-4048-84a1-3cd134adfcee
  volatile.vsock_id: "3132111516"
devices:
  eth0:
    name: eth0
    network: lxdbr0
    type: nic
  root:
    path: /
    pool: default
    type: disk
ephemeral: false
profiles:
- default
- new-keys
stateful: false
description: ""

Now trying with the exact image you tried (526b11bb926ebe8a1d05e5f69b02d4a311b311a9a9acfe760f210ef8d45c2bc6), it still works here:

root@v3:~# lxc launch --vm ubuntu-minimal:526b11bb926ebe8a1d05e5f69b02d4a311b311a9a9acfe760f210ef8d45c2bc6 minimal-vm2 --profile default --profile new-keys
Creating minimal-vm2
Starting minimal-vm2
root@v3:~# lxc exec minimal-vm2 -- cloud-init status --wait

status: done
root@v3:~# lxc exec minimal-vm2 -- ls /run/cloud.init.ran
/run/cloud.init.ran

@rptb1 I added a cloud-init status --wait to make sure we wait long enough for cloud-init to do its job. Maybe you just raced it and concluded it didn't work? Or maybe I'm just unable to reproduce it.
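To rule out the race in a scripted test, the wait can be retried until the VM agent answers. This is a minimal sketch, assuming the "agent isn't currently running" message shown in the first transcript above appears on stdout or stderr; `wait_for_cloud_init` and its parameters are hypothetical names, and `run` and `sleep` are injectable only so the logic can be exercised without LXD.

```python
import subprocess
import time

AGENT_NOT_READY = "LXD VM agent isn't currently running"


def wait_for_cloud_init(instance, run=subprocess.run, attempts=30,
                        delay=2.0, sleep=time.sleep):
    """Run `cloud-init status --wait` in `instance`, retrying while the agent is down.

    Returns the result of the first attempt that reached the guest,
    or None if the agent never came up within `attempts` tries.
    """
    cmd = ["lxc", "exec", instance, "--", "cloud-init", "status", "--wait"]
    for _ in range(attempts):
        result = run(cmd, capture_output=True, text=True)
        output = (result.stdout or "") + (result.stderr or "")
        if AGENT_NOT_READY not in output:
            return result
        sleep(delay)
    return None
```

Only after this returns a result is it meaningful to check for `/run/cloud.init.ran`.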

@rptb1
Author

rptb1 commented Dec 21, 2024

Here is an exact paste of me reproducing the issue just now. If there is anything more I can do locally to help debug this please let me know.

rb@kiwi:~$ lxc profile show test-profile
name: test-profile
description: ""
config:
  cloud-init.user-data: |
    #cloud-config
    runcmd:
      - [touch, /run/cloud.init.ran]
devices: {}
used_by: []
rb@kiwi:~$ lxc launch -s kiwi-tmp --vm ubuntu-minimal: test-container2 --profile default --profile test-profile
Launching test-container2
rb@kiwi:~$ lxc exec test-container2 -- cloud-init status --wait

status: done
rb@kiwi:~$ lxc exec test-container2 -- ls /run/cloud.init.ran
ls: cannot access '/run/cloud.init.ran': No such file or directory
rb@kiwi:~$ lxc config show test-container2
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 24.04 LTS amd64 (minimal release) (20241211)
  image.label: minimal release
  image.os: ubuntu
  image.release: noble
  image.serial: "20241211"
  image.type: disk1.img
  image.version: "24.04"
  volatile.base_image: eb6632b7bffc386a3dbf81570d813f4133009dc7ea1cb640215fafc481f7d670
  volatile.cloud-init.instance-id: 72a11da0-1e49-4762-ba8f-3f62f63c46fb
  volatile.eth0.host_name: tap14b7ba95
  volatile.eth0.hwaddr: 00:16:3e:2e:93:50
  volatile.last_state.power: RUNNING
  volatile.uuid: 26aa50bd-c502-4946-a351-af065eb15bfc
  volatile.uuid.generation: 26aa50bd-c502-4946-a351-af065eb15bfc
  volatile.vsock_id: "4201320121"
devices:
  root:
    path: /
    pool: kiwi-tmp
    type: disk
ephemeral: false
profiles:
- default
- test-profile
stateful: false
description: ""
rb@kiwi:~$ snap list lxd
Name  Version      Rev    Tracking       Publisher   Notes
lxd   6.2-afb00d0  31571  latest/stable  canonical✓  -
rb@kiwi:~$ uname -a
Linux kiwi 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
rb@kiwi:~$ lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.5 LTS
Release:	22.04
Codename:	jammy
rb@kiwi:~$ lxc launch -s kiwi-tmp --vm ubuntu: test-container3 --profile default --profile test-profile
Launching test-container3
rb@kiwi:~$ lxc exec test-container3 -- cloud-init status --wait
.........................
status: done
rb@kiwi:~$ lxc exec test-container3 -- ls /run/cloud.init.ran
/run/cloud.init.ran
rb@kiwi:~$ lxc exec test-container2 -- ls /run/cloud.init.ran
ls: cannot access '/run/cloud.init.ran': No such file or directory
rb@kiwi:~$ lxc stop test-container2
rb@kiwi:~$ lxc stop test-container3

@simondeziel
Member

@rptb1 I still cannot reproduce despite using the same base OS, 22.04 with the same kernel and same LXD rev:

root@v1:~# lxc profile show test-profile
name: test-profile
description: ""
config:
  cloud-init.user-data: |
    #cloud-config
    runcmd:
      - [touch, /run/cloud.init.ran]
devices: {}
used_by: []

root@v1:~# lxc launch --vm ubuntu-minimal:eb6632b7bffc386a3dbf81570d813f4133009dc7ea1cb640215fafc481f7d670 vm1 --profile default --profile test-profile
Launching vm1
root@v1:~# lxc exec vm1 -- cloud-init status --wait

status: done
root@v1:~# lxc exec vm1 -- ls /run/cloud.init.ran
/run/cloud.init.ran

root@v1:~# lxc config show vm1
architecture: x86_64
config:
  image.architecture: amd64
  image.description: ubuntu 24.04 LTS amd64 (minimal release) (20241211)
  image.label: minimal release
  image.os: ubuntu
  image.release: noble
  image.serial: "20241211"
  image.type: disk1.img
  image.version: "24.04"
  volatile.base_image: eb6632b7bffc386a3dbf81570d813f4133009dc7ea1cb640215fafc481f7d670
  volatile.cloud-init.instance-id: 1100ed94-a9f9-4cfc-abe2-e97138ec5580
  volatile.eth0.host_name: tap783c977c
  volatile.eth0.hwaddr: 00:16:3e:be:7b:24
  volatile.last_state.power: RUNNING
  volatile.uuid: 945abbdb-fcba-4dad-ae42-d29908cb9fe2
  volatile.uuid.generation: 945abbdb-fcba-4dad-ae42-d29908cb9fe2
  volatile.vsock_id: "3932951384"
devices: {}
ephemeral: false
profiles:
- default
- test-profile
stateful: false
description: ""
root@v1:~# snap list lxd
Name  Version      Rev    Tracking     Publisher   Notes
lxd   6.2-afb00d0  31571  latest/edge  canonical✓  -
root@v1:~# uname -a
Linux v1 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
root@v1:~# lsb_release -a
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.5 LTS
Release:	22.04
Codename:	jammy
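Since the two environments look alike, it may help to diff the `lxc config show` pastes mechanically instead of by eye. This is a rough sketch under stated assumptions: it compares stripped lines only, so YAML nesting is lost, and it skips `volatile.` keys because they differ per instance. `config_lines` and `compare_configs` are hypothetical helpers.

```python
def config_lines(text, ignore_prefixes=("volatile.",)):
    """Return the set of stripped, non-empty lines, skipping per-instance keys."""
    lines = set()
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith(ignore_prefixes):
            lines.add(line)
    return lines


def compare_configs(a, b):
    """Return (only_in_a, only_in_b) as sorted lists of lines."""
    lines_a, lines_b = config_lines(a), config_lines(b)
    return sorted(lines_a - lines_b), sorted(lines_b - lines_a)
```

Applied to the two pastes in this thread, this would surface differences such as the `devices` section. Whether any such difference is related to the failure is not established here.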
