Skip to content

remove use of istioctl x wait #16515

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

AritraDey-Dev
Copy link

@AritraDey-Dev AritraDey-Dev commented May 23, 2025

Description

fixes #16429

Reviewers

  • Ambient
  • Docs
  • Installation
  • Networking
  • Performance and Scalability
  • Extensions and Telemetry
  • Security
  • Test and Release
  • User Experience
  • Developer Infrastructure
  • Localization/Translation

@AritraDey-Dev AritraDey-Dev requested review from a team as code owners May 23, 2025 14:21
@istio-testing istio-testing added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-ok-to-test labels May 23, 2025
@istio-testing
Copy link
Contributor

Hi @AritraDey-Dev. Thanks for your PR.

I'm waiting for a istio member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

#istioctl ps
#echo "TEST: wait for failed, but continuing."
#fi
echo "Duration: $(($(date +%s)-start)) seconds"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we keep this? Useful when we need to determine what tests are taking the longest and why

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can keep this. Logging the duration helps identify which tests or waits are taking the longest. I’ll leave the commented-out istioctl experimental wait part for historical context, and keep the timing log to assist with test diagnostics.

@AritraDey-Dev AritraDey-Dev requested a review from dhawton May 23, 2025 14:59
Copy link
Member

@Arhell Arhell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/ok-to-test

@istio-testing istio-testing added ok-to-test Set this label allow normal testing to take place for a PR not submitted by an Istio org member. and removed needs-ok-to-test labels May 24, 2025
@AritraDey-Dev
Copy link
Author

/retest

@AritraDey-Dev
Copy link
Author

/test doc.test.profile-default

@craigbox
Copy link
Contributor

/retest

@craigbox
Copy link
Contributor

Does this PR cause a meaningful change to the time the tests take to run?


while (( attempt <= max_attempts )); do
# Check if the resource exists
if kubectl get "$kind" "$name" -n "$namespace" >/dev/null 2>&1; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't you basically doing

kubectl wait --for=create -n $namespace $kind/$name --timeout=30s

with this whole function?

Perhaps that would be cleaner?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what some of the other functions do.

The thing here is, that the goal of this wasn't so much to ensure that the configuration was created in Kubernetes, which either method checks for, but to ensure all the sidecars were programmed with it. They generally are, but in the instance that the object is created but is not translated to XDS and sent to the other proxy servers, there's no obvious way to tell.

So kubectl wait is probably necessary but not sufficient.

@craigbox
Copy link
Contributor

Congratulations, your testing has found a bug!

In #13208 the doc was changed to not require samples/bookinfo/networking/virtual-service-all-v1.yaml, so checking for productpage and details will fail as those are not actually created.

I'll change the test to only check for the services that are used by this example. The cleanup will remove all of them, whether they exist or not.

@craigbox craigbox requested a review from a team as a code owner May 27, 2025 00:05
@craigbox
Copy link
Contributor

That one feels a little more like a flake I've seen before.

/retest

@AritraDey-Dev
Copy link
Author

I'll change the test to only check for the services that are used by this example. The cleanup will remove all of them, whether they exist or not.

Thank you for catching that and updating the test accordingly!

@craigbox
Copy link
Contributor

np. Before we merge, please change this function to use kubectl wait and maybe even rename the function to _wait_for_resource throughout all the tests?

@istio-testing istio-testing added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 27, 2025
@AritraDey-Dev
Copy link
Author

AritraDey-Dev commented May 27, 2025

please change this function to use kubectl wait and maybe even rename the function to _wait_for_resource throughout all the tests?

I have updated the implementation to use kubectl wait, and also renamed the function to _wait_for_resource throughout all the tests.

@istio-testing istio-testing added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 27, 2025
@AritraDey-Dev
Copy link
Author

@craigbox Renaming the function from _wait_for_istio to _wait_for_resource appears to be causing some test failures. I'm currently unsure about the root cause. Would you recommend reverting the function name back to _wait_for_istio?

@craigbox
Copy link
Contributor

craigbox commented May 28, 2025

That one is the known flake again.
#16534

/retest

@craigbox
Copy link
Contributor

/test doc.test.profile-default

@craigbox
Copy link
Contributor

Every failure is about `httpbin, and smells like it isn't being given enough time to start.
Previously we always slept for at least one second; this PR removed that. That might be the div we are not doing that. I assume that is the difference.

Our options here are (a) always sleep 1s, and hope that is enough (b) sleep something like 5s in every test where we install httpbin. I'd like feedback from other docs maintainers here.

Also - you've changed all the files in the archive directory, which you probably shouldn't have. For now, I recommend you revert all the changes to the function name, and we can address that in a different PR.

@istio-testing istio-testing added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 28, 2025
@AritraDey-Dev
Copy link
Author

/retest

@craigbox
Copy link
Contributor

Now you're truly starting to understand what it is to be an open source maintainer.

Endless /retests of something which seemed at first like a one line PR 😁

@AritraDey-Dev AritraDey-Dev requested a review from craigbox May 28, 2025 12:09
echo "Duration: $(( $(date +%s) - start_time )) seconds"
return 1
fi
echo "$kind $name in namespace $namespace is present."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need this output (or duration) if we succeed

Copy link
Member

@dhawton dhawton May 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally disagree here.. it was in the original, and I've used it before when I've needed to troubleshoot why a test was taking too long. It's a clue into why a specific step took long, was it waiting a long time for a certain resource to be ready? I don't think it'd hurt the output to leave it, but might help leave a breadcrumb that might help someone again later on (we do a really good job at masking all hints/clues into test failures, I'd rather not remove another hint)

@istio-testing istio-testing added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels May 29, 2025
@AritraDey-Dev AritraDey-Dev requested a review from craigbox May 29, 2025 09:15
Copy link
Contributor

@craigbox craigbox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, leaving for @dhawton thoughts.

This may flake a little more often due to losing the sleep 1s, but I saw a flake today with that in place anyway. It is probably worth fixing those separately and we have another LFX student starting soon to help with that.

@craigbox
Copy link
Contributor

craigbox commented May 29, 2025

https://prow.istio.io/view/gs/istio-prow/pr-logs/pull/istio_istio.io/16515/doc.test.profile-default_istio.io/1928015336469172224

https://storage.googleapis.com/istio-prow/pr-logs/pull/istio_istio.io/16515/doc.test.profile-default_istio.io/1928015336469172224/artifacts/tests-setup-profile-default-a89/TestDocs/tasks/traffic-management/ingress/secure-ingress/test.sh/test.sh/_test_context/test.sh_debug.txt

Interesting error

+_wait_for_istio gateway default mygateway
+_wait_for_istio(): local kind=gateway
+_wait_for_istio(): local namespace=default
+_wait_for_istio(): local name=mygateway
+_wait_for_istio(): kubectl wait --for=create -n default gateway/mygateway --timeout 30s
+_wait_for_istio virtualservice default helloworld-v1
+_wait_for_istio(): local kind=virtualservice
+_wait_for_istio(): local namespace=default
+_wait_for_istio(): local name=helloworld-v1
+_wait_for_istio(): kubectl wait --for=create -n default virtualservice/helloworld-v1 --timeout 30s
+_wait_for_istio(): echo 'Timed out waiting for virtualservice helloworld-v1 in namespace default to be created.'
+_wait_for_istio(): exit 1

Either this is a flake, or it's confused by (a) there being two different types of Gateway (b) waiting on create where a CRD should be condition=established ?

(Again, this shows why _wait_for_istio and kubectl wait are not the same)

@craigbox
Copy link
Contributor

/test doc.test.profile-default

@AritraDey-Dev
Copy link
Author

/retest

@AritraDey-Dev
Copy link
Author

AritraDey-Dev commented May 30, 2025

Our options here are (a) always sleep 1s, and hope that is enough (b) sleep something like 5s in every test where we install httpbin. I'd like feedback from other docs maintainers here.

As we're encountering edge cases where config propagation is still flaky, I guess we could consider adding an optional advanced wait mechanism in the future (e.g., polling Envoy config or using istioctl proxy-config).
For now, I'm adding a sleep 2s just to check the logs of this test. A simple sleep 1s might be sufficient, but this is for initial validation.

@AritraDey-Dev
Copy link
Author

/test doc.test.profile-default

@istio-testing
Copy link
Contributor

@AritraDey-Dev: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
doc.test.profile-default_istio.io 0cd3471 link true /test doc.test.profile-default

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ok-to-test Set this label allow normal testing to take place for a PR not submitted by an Istio org member. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

remove use of istioctl x wait
5 participants