-
Notifications
You must be signed in to change notification settings - Fork 1.6k
remove use of istioctl x wait #16515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
remove use of istioctl x wait #16515
Conversation
Hi @AritraDey-Dev. Thanks for your PR. I'm waiting for a istio member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
#istioctl ps | ||
#echo "TEST: wait for failed, but continuing." | ||
#fi | ||
echo "Duration: $(($(date +%s)-start)) seconds" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep this? Useful when we need to determine what tests are taking the longest and why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can keep this. Logging the duration helps identify which tests or waits are taking the longest. I’ll leave the commented-out istioctl experimental wait part for historical context, and keep the timing log to assist with test diagnostics.
Co-authored-by: Daniel Hawton <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/ok-to-test
/retest |
/test doc.test.profile-default |
/retest |
Does this PR cause a meaningful change to the time the tests take to run? |
tests/util/helpers.sh
Outdated
|
||
while (( attempt <= max_attempts )); do | ||
# Check if the resource exists | ||
if kubectl get "$kind" "$name" -n "$namespace" >/dev/null 2>&1; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aren't you basically doing
kubectl wait --for=create -n $namespace $kind/$name --timeout=30s
with this whole function?
Perhaps that would be cleaner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what some of the other functions do.
The thing here is, that the goal of this wasn't so much to ensure that the configuration was created in Kubernetes, which either method checks for, but to ensure all the sidecars were programmed with it. They generally are, but in the instance that the object is created but is not translated to XDS and sent to the other proxy servers, there's no obvious way to tell.
So kubectl wait
is probably necessary but not sufficient.
Congratulations, your testing has found a bug! In #13208 the doc was changed to not require I'll change the test to only check for the services that are used by this example. The cleanup will remove all of them, whether they exist or not. |
That one feels a little more like a flake I've seen before. /retest |
Thank you for catching that and updating the test accordingly! |
np. Before we merge, please change this function to use |
I have updated the implementation to use kubectl wait, and also renamed the function to _wait_for_resource throughout all the tests. |
This reverts commit c10ec0c.
@craigbox Renaming the function from |
That one is the known flake again. /retest |
/test doc.test.profile-default |
Every failure is about `httpbin, and smells like it isn't being given enough time to start. Our options here are (a) always sleep 1s, and hope that is enough (b) sleep something like 5s in every test where we install Also - you've changed all the files in the |
/retest |
Now you're truly starting to understand what it is to be an open source maintainer. Endless /retests of something which seemed at first like a one line PR 😁 |
tests/util/helpers.sh
Outdated
echo "Duration: $(( $(date +%s) - start_time )) seconds" | ||
return 1 | ||
fi | ||
echo "$kind $name in namespace $namespace is present." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need this output (or duration) if we succeed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally disagree here.. it was in the original, and I've used it before when I've needed to troubleshoot why a test was taking too long. It's a clue into why a specific step took long, was it waiting a long time for a certain resource to be ready? I don't think it'd hurt the output to leave it, but might help leave a breadcrumb that might help someone again later on (we do a really good job at masking all hints/clues into test failures, I'd rather not remove another hint)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, leaving for @dhawton thoughts.
This may flake a little more often due to losing the sleep 1s
, but I saw a flake today with that in place anyway. It is probably worth fixing those separately and we have another LFX student starting soon to help with that.
Interesting error
Either this is a flake, or it's confused by (a) there being two different types of Gateway (b) waiting on (Again, this shows why |
/test doc.test.profile-default |
/retest |
As we're encountering edge cases where config propagation is still flaky, I guess we could consider adding an optional advanced wait mechanism in the future (e.g., |
/test doc.test.profile-default |
@AritraDey-Dev: The following test failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Description
fixes #16429
Reviewers