Open
Labels
- area/robustness: Robustness, reliability, resilience related
- kind/bug: Bug
- lifecycle/rotten: Nobody worked on this for 12 months (final aging stage)
- needs/planning: Needs (more) planning with other MCM maintainers
- priority/2: Priority (lower number equals higher priority)
Description
How to categorize this issue?
/area robustness
/kind bug
/priority 2
What happened:
Currently, MCM doesn't turn CrashLoopBackoff (CLBF) machines to Failed as soon as creationTimeout expires. Delays have been observed, ranging from 2 min upward with no clear upper bound.
There are 2 parts to the problem:
- The timeout check for a CLBF machine is done after making the `CreateMachine()` driver call. `CreateMachine()` itself could take any amount of time, as it provisions a VM on the cloud provider (on Azure, delays of 30 min or more have been seen before).
- Even if the first part didn't contribute to the delay, the retry/re-push to the queue can also introduce a delay of `ShortRetry` (3 min). This happens because the retry period is not calculated from the time left before the timeout but is just the constant value `ShortRetry` (3 min).
What you expected to happen:
The machine should turn to Failed as soon as the timeout expires.
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
We could take inspiration from the machineDeployment logic, which turns the Progressing condition to False with reason ProgressDeadlineExceeded as soon as the deadline is exceeded.
Environment:
- Kubernetes version (use `kubectl version`):
- Cloud provider or hardware configuration:
- Others: