Timeout on remote exec #424

Closed
LesnyRumcajs opened this issue Mar 12, 2024 · 1 comment
Labels
Priority: P1 Added to issues and PRs relating to a high severity bugs.

Comments

@LesnyRumcajs
Member

Issue summary

While deploying node or snapshot services, Terraform intermittently fails to connect to the new droplet over SSH during the remote-exec provisioning step. For example, this deployment succeeded only on the third attempt; the two earlier attempts reported a timeout.

digitalocean_droplet.forest (remote-exec): Connecting to remote host via SSH...
digitalocean_droplet.forest (remote-exec):   Host: 209.38.234.101
digitalocean_droplet.forest (remote-exec):   User: root
digitalocean_droplet.forest (remote-exec):   Password: false
digitalocean_droplet.forest (remote-exec):   Private key: false
digitalocean_droplet.forest (remote-exec):   Certificate: false
digitalocean_droplet.forest (remote-exec):   SSH Agent: true
digitalocean_droplet.forest (remote-exec):   Checking Host Key: false
digitalocean_droplet.forest (remote-exec):   Target Platform: unix
digitalocean_droplet.forest: Still creating... [5m0s elapsed]
digitalocean_droplet.forest: Still creating... [5m10s elapsed]
digitalocean_droplet.forest: Still creating... [5m20s elapsed]
digitalocean_droplet.forest (remote-exec): Connecting to remote host via SSH...
digitalocean_droplet.forest (remote-exec):   Host: 209.38.234.101
digitalocean_droplet.forest (remote-exec):   User: root
digitalocean_droplet.forest (remote-exec):   Password: false
digitalocean_droplet.forest (remote-exec):   Private key: false
digitalocean_droplet.forest (remote-exec):   Certificate: false
digitalocean_droplet.forest (remote-exec):   SSH Agent: true
digitalocean_droplet.forest (remote-exec):   Checking Host Key: false
digitalocean_droplet.forest (remote-exec):   Target Platform: unix
digitalocean_droplet.forest: Still creating... [5m30s elapsed]
digitalocean_droplet.forest: Still creating... [5m40s elapsed]
digitalocean_droplet.forest: Still creating... [5m50s elapsed]
digitalocean_droplet.forest: Still creating... [6m0s elapsed]
digitalocean_droplet.forest (remote-exec): Connecting to remote host via SSH...
digitalocean_droplet.forest (remote-exec):   Host: 209.38.234.101
digitalocean_droplet.forest (remote-exec):   User: root
digitalocean_droplet.forest (remote-exec):   Password: false
digitalocean_droplet.forest (remote-exec):   Private key: false
digitalocean_droplet.forest (remote-exec):   Certificate: false
digitalocean_droplet.forest (remote-exec):   SSH Agent: true
digitalocean_droplet.forest (remote-exec):   Checking Host Key: false
digitalocean_droplet.forest (remote-exec):   Target Platform: unix
digitalocean_droplet.forest: Still creating... [6m10s elapsed]
╷
│ Error: remote-exec provisioner error
│ 
│   with digitalocean_droplet.forest,
│   on main.tf line 50, in resource "digitalocean_droplet" "forest":
│   50:   provisioner "remote-exec" {
│ 
│ timeout - last error: dial tcp 209.38.234.101:22: i/o timeout
╵
time=2024-03-12T08:05:19Z level=error msg=terraform invocation failed in /home/runner/work/forest-iac/forest-iac/tf-managed/live/environments/prod/applications/forest-butterflynet/.terragrunt-cache/NHFD3q0GdGpJF-apYUdKkrp8WkU/bKi-1jljNp0vP3Ch1WPabqRMasU prefix=[/home/runner/work/forest-iac/forest-iac/tf-managed/live/environments/prod/applications/forest-butterflynet] 
time=2024-03-12T08:05:19Z level=error msg=1 error occurred:
	* [/home/runner/work/forest-iac/forest-iac/tf-managed/live/environments/prod/applications/forest-butterflynet/.terragrunt-cache/NHFD3q0GdGpJF-apYUdKkrp8WkU/bKi-1jljNp0vP3Ch1WPabqRMasU] exit status 1

It also happened a week earlier in the snapshot service deployment.

There are a few possible culprits:

  • DigitalOcean issues,
  • a provisioning timeout that is too low (can it be extended?),
  • logic error somewhere,
  • something else.
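
On the timeout question: Terraform's provisioner connection block accepts a timeout argument, which defaults to 5m. A minimal sketch of raising it follows, assuming the provisioner connects as root through the SSH agent, as the log above suggests. The 10m value, the host expression, and the inline command are illustrative placeholders, not taken from the actual main.tf.

```hcl
resource "digitalocean_droplet" "forest" {
  # ... existing droplet arguments stay unchanged ...

  # Connection settings shared by the provisioners in this resource.
  connection {
    type    = "ssh"
    host    = self.ipv4_address
    user    = "root"
    agent   = true
    timeout = "10m" # assumed value; Terraform's default is 5m
  }

  provisioner "remote-exec" {
    # Placeholder; the real initialization commands are not shown in this issue.
    inline = ["echo 'initialization script goes here'"]
  }
}
```

A longer timeout only helps if the droplet eventually becomes reachable on port 22; it would not fix a networking or firewall problem.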

This may create zombie instances where the initialization script was not run.

It'd be great to resolve the root issue, but automatically retrying a few times is also acceptable as a workaround.
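
Since the log shows the deployment running through Terragrunt, the retry workaround could be sketched with Terragrunt's retryable_errors setting in terragrunt.hcl. This assumes a Terragrunt version that still supports these attributes; the pattern, the attempt count, and the sleep interval are illustrative choices.

```hcl
# terragrunt.hcl (sketch, not the repository's actual configuration)

# Regular expressions matched against Terraform's output; a match triggers a re-run.
# Note: setting this list replaces Terragrunt's built-in default list.
retryable_errors = [
  "(?s).*dial tcp .*:22: i/o timeout.*",
]

# Assumed values: up to 3 attempts, waiting 30 seconds between them.
retry_max_attempts       = 3
retry_sleep_interval_sec = 30
```

When a creation-time provisioner fails, Terraform marks the resource as tainted, so a retried apply should destroy and recreate the droplet rather than keep one where the initialization script never ran. That behavior is worth confirming against this configuration before relying on it.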

Other information and links

@LesnyRumcajs LesnyRumcajs added the Priority: P1 Added to issues and PRs relating to a high severity bugs. label Mar 27, 2024
@samuelarogbonlo
Contributor

samuelarogbonlo commented May 3, 2024

We can definitely increase the timeout.
