Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the minimun connection time out for curl call #292

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

marcemq
Copy link
Contributor

@marcemq marcemq commented Dec 11, 2019

The time to pod network test harness was hanging when the net server was
not ready due to the curl execution because curl has its own connection
time out and it can be up to two minutes. Now, such connection time out is set
to one second.

The time to pod network test harness was hanging when the net server was
not ready due to the `curl` execution because `curl` has its own connection
time out and it can be up to two minutes.

Signed-off-by: Morales Quispe, Marcela <[email protected]>
Copy link

@grahamwhaley grahamwhaley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
I'm expecting 1s is enough for most network setups we will test on - but, let's watch out for that ;-)

@askervin
Copy link
Contributor

nc measurements, which do pretty much the same without http, have recorded > 1 s timeouts when running > 400 pods with Canal CNI. After 700 pods, already more than 5 % of round trip times (including tcp/ip connect-send-recv-close like here as well) have exceeded 1 s.

In case timeout leads to a test failure, I'd vote for a longer timeout. On the other hand, if it will not fail the test, we'll just get faster test execution and prettier graph than in nc tests because of not having to scale y axis to 8000 ms, for instance. :)

@askervin
Copy link
Contributor

askervin commented Dec 16, 2019

@marcemq, I tested this by causing timeouts to the curl request: I rebooted the node where net-serve pod was running, and alternatively suspended the agnhost process (kill -STOP $(pidof agnhost)).

I'd suggest adding --max-time 1 after --connect-timeout 1, because otherwise curl was still able to hang.

I'd also suggest adding net-serve* service cleanups in the cleanup() routine. Currently services can be left behind after failed retries due to curl timeouts, because set -e forces exit when waitForProcess fails. Services that are left behind will fail the next test run.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants