Uploading snapshot always gets stuck with no info #2667
Comments
Hey @grosser - thanks for reporting this. I'm unable to reproduce the failure locally, so I think we'll have to work out what might be different in your environment that could be causing this error.
Based on the output, it looks like this is sitting somewhere in https://github.com/awslabs/coldsnap/blob/develop/src/upload.rs#L171-L188 That's looping through each block of the snapshot and uploading it to EBS. It hasn't outright failed (yet), so I think it's in that retry backoff. If I'm reading that right, if it fails uploading a block it will retry up to
It may be interesting to let it sit in this state for some time and see if it ever does completely time out and give any kind of useful error message. Another option would be to run the coldsnap command directly. Unfortunately it doesn't look like there is anything like a
You could also try to manually upload the img directly with EBS and see if that provides any additional information. It's possible that could fail too and, if so, provide a better message about what is causing this problem.
I'm not sure what the total time math comes out to, but some kind of feedback after 2-5 min, such as "Warning: retrying failed connection", would be nice :)
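For a rough idea of the "total time math", here is a minimal sketch in Rust that sums the worst-case waits of a capped exponential backoff. The function name and all parameter values are hypothetical; coldsnap's actual retry count and delays are not stated in this thread, so plug in the real numbers before drawing conclusions.

```rust
use std::time::Duration;

/// Worst-case total time spent waiting between attempts, assuming the delay
/// starts at `base`, doubles after every failure, and never exceeds `max_delay`.
/// These parameters are assumptions for illustration, not coldsnap's settings.
fn total_backoff(base: Duration, max_delay: Duration, retries: u32) -> Duration {
    let mut total = Duration::ZERO;
    let mut delay = base;
    for _ in 0..retries {
        // Each wait is capped at max_delay.
        total += delay.min(max_delay);
        // Double the delay for the next retry, without overflowing.
        delay = delay.saturating_mul(2).min(max_delay);
    }
    total
}
```

With a 1 s base, a 60 s cap and 5 retries, the waits are 1 + 2 + 4 + 8 + 16 = 31 s per block. Since the waits apply per failing block, a large retry count multiplied across blocks can plausibly add up to hours.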
OK, found it: a permission error. So the bug is that it should not retry on these.
... and it should also not retry for 2 hours 🤦
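The behaviour asked for here can be sketched as a retry loop that gives up immediately on permanent errors, caps the number of attempts, and prints a warning on every retry. This is only an illustration using the standard library: `UploadError`, `retry_upload` and the parameters are hypothetical names, not coldsnap's actual types or API.

```rust
use std::time::Duration;

/// Hypothetical error classification; not coldsnap's real error type.
#[derive(Debug, PartialEq)]
enum UploadError {
    /// Permanent failure such as a missing permission; retrying cannot help.
    AccessDenied(String),
    /// Transient failure such as throttling or a dropped connection.
    Transient(String),
}

/// Runs `op` until it succeeds, hits a permanent error, or `max_attempts`
/// attempts have been made. At least one attempt is always made.
fn retry_upload<F>(max_attempts: u32, base_delay: Duration, mut op: F) -> Result<(), UploadError>
where
    F: FnMut() -> Result<(), UploadError>,
{
    let mut delay = base_delay;
    let mut attempt = 1;
    loop {
        match op() {
            Ok(()) => return Ok(()),
            // Permission errors are returned right away instead of being retried.
            Err(UploadError::AccessDenied(msg)) => return Err(UploadError::AccessDenied(msg)),
            // Out of attempts: surface the last transient error.
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(err) => {
                // Give the user feedback rather than waiting silently.
                eprintln!(
                    "Warning: attempt {} failed ({:?}), retrying in {:?}",
                    attempt, err, delay
                );
                std::thread::sleep(delay);
                delay = delay.saturating_mul(2);
                attempt += 1;
            }
        }
    }
}
```

The key design choice is deciding which errors are permanent before entering the backoff path, so a missing permission fails on the first block instead of after the full retry budget.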
Some documentation along the lines of "here are all the AWS permissions you need" would be nice too.
I've filed an issue in coldsnap with these details: awslabs/coldsnap#216. I'm only familiar with that code as much as it took to track down where it was failing, but I'll see if I can dig in there and make some of those changes. I'm going to close this issue, as the fix will need to be in the coldsnap tool and there's nothing in Bottlerocket we can change to work around it.
and it just sits there ...
On multiple tries the number is always different, e.g. 90/4096 or 62/4096.
Platform I'm building on:
What I expected to happen:
Upload to succeed or fail with an error message.
What actually happened:
Upload gets stuck.
How to reproduce the problem:
make ami