[BUG] - Instance creation may complete before init scripts are finished #440

HartS opened this issue Dec 10, 2023 · 2 comments

HartS commented Dec 10, 2023

I'm actually using Pulumi, but https://github.com/dirien/pulumi-vultr appears to be largely generated from the Vultr Terraform provider.

I noticed with Ubuntu 22.04 that when I run pulumi up with a user-data script that installs docker, the command completes while the installation is still in progress: the server_status is installingbooting, and docker is unavailable.

To Reproduce

With Pulumi, set the vultr:apiKey and privateKeyFile config, then run pulumi up with the following Pulumi.yaml:

name: repro
description: repro the issue with not waiting for server_status=ok
runtime: yaml
template:
  description: Vultr API credentials
  config:
    vultr:apiKey:
      secret: true
resources:
  publicKey:
    type: command:local:Command
    properties:
      create: "ssh-keygen -yf ${privateKeyFile}"
  privateKey:
    type: command:local:Command
    properties:
      create: "cat ${privateKeyFile}"
    options:
      additionalSecretOutputs:
      - stdout
  sshkey:
    type: vultr:SSHKey
    properties:
      name: Main
      sshKey: ${publicKey.stdout}
    options:
      protect: true
  dev:
    type: vultr:Instance
    properties:
      # Ubuntu 22.04 x64
      osId: 1743
      plan: vhp-2c-4gb-amd
      region: sea
      sshKeyIds:
      - ${sshkey.id}
      backups: "disabled"
      enableIpv6: true
      hostname: jukejam-dev
      userData:
        fn::readFile: "./setup.sh"
  dockerPsOutput:
    type: command:remote:Command
    properties:
      connection:
        host: ${dev.mainIp}
        user: ubuntu
        privateKey: ${privateKey.stdout}
      create: "docker ps"

and setup.sh (which runs via cloud-init):

#!/usr/bin/env bash

cat << 'EOF' > /etc/sudoers.d/90-cloudimg-ubuntu
# ubuntu user is default user in cloud-images.
# It needs passwordless sudo functionality.
ubuntu ALL=(ALL) NOPASSWD:ALL
EOF

cat ~/.ssh/authorized_keys >> /home/ubuntu/.ssh/authorized_keys

# Install docker
sudo apt-get -y update
sudo apt-get -y install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get -y update
sudo apt-get -y install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

# Allow ubuntu user to run docker without sudo:
gpasswd -a ubuntu docker

Expected behavior
The dockerPsOutput output should contain "CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES" (the header printed by docker ps).
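
A minimal sketch of how that expectation could be checked in a script. The function name has_docker_ps_header is hypothetical; it only compares the first line of whatever it reads on stdin against the header quoted above, ignoring the column padding that docker ps adds.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the first line on stdin is the docker ps header.
has_docker_ps_header() {
  local -a words
  # Read the first line and split it on whitespace, so column padding is ignored.
  read -r -a words || true
  [[ "${words[*]}" == "CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES" ]]
}

# Example (assumes docker is on PATH): docker ps | has_docker_ps_header
```

With the behavior reported here, a check like this would fail until the user-data script has finished installing docker.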

Desktop (please complete the following information where applicable):

  • OS: Ubuntu 22.04
  • Language Version: N/A
  • Browser: N/A

Additional context

For reference, I added the following resource after dev:

  # Check the Vultr API until server_status is 'ok'
  devIsReady:
    options:
      dependsOn:
      - ${dev}
    type: command:local:Command
    properties:
      create: "while [[ $(curl -s https://api.vultr.com/v2/instances/${dev.id} -H 'Authorization: Bearer ${vultr:apiKey}' | jq -r .instance.server_status) != 'ok' ]]; do sleep 1; done"

and modified the dockerPsOutput resource:

    options:
      dependsOn:
      - ${devIsReady}

With the above changes, the update now waits for server_status to become ok, and the next step succeeds. However, it does introduce a ~7.5 minute delay, because the server status takes a while to transition out of installingbooting. This seems like a separate issue on Vultr's end.
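
The polling loop in devIsReady has no upper bound, so it would spin forever if the status never reached ok. Below is a rough sketch of the same idea with a timeout. The function names wait_for_ok and vultr_server_status and the VULTR_API_KEY variable are assumptions for illustration; the endpoint and the .instance.server_status field are the ones used in the command above.

```shell
#!/usr/bin/env bash
# Poll a status command until it prints "ok", giving up after a deadline.
# Usage: wait_for_ok <timeout_seconds> <interval_seconds> <command...>
wait_for_ok() {
  local timeout=$1 interval=$2
  shift 2
  local deadline=$(( SECONDS + timeout ))
  local status
  while (( SECONDS < deadline )); do
    # A failing status command is treated as "not ready yet".
    status=$("$@" 2>/dev/null) || status=""
    if [[ $status == "ok" ]]; then
      return 0
    fi
    sleep "$interval"
  done
  echo "timed out after ${timeout}s; last status: ${status:-unknown}" >&2
  return 1
}

# Prints server_status for one instance.
# Assumes VULTR_API_KEY is exported and that curl and jq are installed.
vultr_server_status() {
  curl -fsS "https://api.vultr.com/v2/instances/$1" \
    -H "Authorization: Bearer ${VULTR_API_KEY}" | jq -r '.instance.server_status'
}

# Example: wait_for_ok 900 5 vultr_server_status "$INSTANCE_ID"
```

Passing the status command as arguments keeps the loop independent of the API call, so it can be exercised with a stub before pointing it at a real instance.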

Ideally there would be a way to configure the Terraform provisioning to wait until the cloud-init user scripts are finished. As a workaround, a later step can be added that runs cloud-init status --wait.
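
A small sketch of that workaround, assuming the host is reachable over SSH. The function run_after_cloud_init and the REMOTE_RUNNER override are hypothetical names; the only real command relied on is cloud-init status --wait, which blocks until cloud-init has finished.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for cloud-init on a remote host, then run a command there.
# REMOTE_RUNNER defaults to ssh; set it to echo to print the command instead of running it.
run_after_cloud_init() {
  local target=$1
  shift
  # The remaining arguments are joined into one string, so quote complex commands yourself.
  "${REMOTE_RUNNER:-ssh}" "$target" "cloud-init status --wait >/dev/null && $*"
}

# Example: run_after_cloud_init ubuntu@<instance-ip> docker ps
```

Because the two commands are chained with &&, the follow-up command is skipped if cloud-init status --wait exits non-zero. The same string, cloud-init status --wait && docker ps, could presumably be used as the create command of the dockerPsOutput resource, though that has not been verified here.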

@HartS HartS added the bug label Dec 10, 2023

HartS commented Dec 10, 2023

Note: the reason I highlight waiting on server_status=ok is that it can be trivially waited on in resource_vultr_instance.go using the waitForServerAvailable function defined there. See master...HartS:terraform-provider-vultr:master

Given the extremely long wait time for server_status to transition to ok (compared to cloud-init status --wait, which introduces a much more reasonable delay), I suspect this isn't currently the right approach.

optik-aper (Member) commented

@HartS Are you able to reproduce the issue when using the Terraform provider directly? I just ran a quick test and docker was installed after using your script as the user data, like so:

resource "vultr_instance" "inst" {
  region = "mel"
  plan = "vc2-2c-4gb"
  label = "tf-ud-test"
  os_id = 1743
  tags = ["tf"]
  user_data = file("~/dump/setup.sh")
}

Here ~/dump/setup.sh is your script. After that, SSHing in and checking that docker is installed shows:
[image]

Can you check the user data from my.vultr.com to verify that the script is there in plaintext?

[image]

@optik-aper optik-aper self-assigned this Jan 2, 2024