Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms #1617

Closed
uvarajk opened this issue Apr 29, 2021 · 22 comments
Assignees
Labels
for/user-attention This issue needs user attention (feedback, rework, etc...) type/bug A general bug

Comments

@uvarajk
Copy link

uvarajk commented Apr 29, 2021

Versions :
Reactor Netty : reactor-netty-http-1.0.5.jar, Java : 11.0.8 (N/A) , Linux : 4.15.0-133-generic

We're receiving the SSL Handshake Timout Exception in our production environment.

Captured few details for the reference those are mentioned below :

In the Debug Log below are the default connection pool configurations
r.n.resources.PooledConnectionProvider : Creating a new [http] client pool [PoolFactory{evictionInterval=PT0S, leasingStrategy=fifo, maxConnections=500, maxIdleTime=-1, maxLifeTime=-1, metricsEnabled=false, pendingAcquireMaxCount=1000, pendingAcquireTimeout=45000}] for [/172.25.0.12:443]

We are able to see that the application is working fine for some time(4 to 5 hours) and slowly the responses are received late
Taken <<24442>> milliseconds to retrieve data from endpoint
Taken <<32875>> milliseconds to retrieve data from endpoint
Taken <<2532046>> milliseconds to retrieve data from endpoint

During the delayed responses following are the errors/exceptions
io.netty.handler.timeout.ReadTimeoutException: null
io.netty.handler.ssl.SslHandshakeTimeoutException: handshake timed out after 10000ms

We have configured the following timeout's for the HTTPClient
web-client.connect-timeout=5
web-client.read-timeout=5
web-client.use-connection-pooling=true
reactor.netty.ioWorkerCount=128
web-client.response-timeout=5

It would be a great help if we could know what's going wrong.

@uvarajk uvarajk added status/need-triage A new issue that still need to be evaluated as a whole type/bug A general bug labels Apr 29, 2021
@uvarajk
Copy link
Author

uvarajk commented Apr 29, 2021

@violetagg for your question #907 (comment)

The below code provide the preperties
HttpClient.create()
.wiretap("reactor.netty.http.client.HttpClient", logLevel, AdvancedByteBufFormat.TEXTUAL)
.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, connectionTimeout)
.responseTimeout(responseTimeout)
.doOnConnected(connection -> connection.addHandlerLast(new ReadTimeoutHandler(readTimeout, TimeUnit.MILLISECONDS)))
.followRedirect(true);

@violetagg
Copy link
Member

violetagg commented Apr 29, 2021

@uvarajk What's the idea of having ReadTimeoutHandler?
You have SSL handshake timeout and also response timeout.

@uvarajk
Copy link
Author

uvarajk commented Apr 29, 2021

We initially included ReadTimeoutHandler to check if it the time taken to read the response. As it did not work we have added ResponseTimeout.
Yes got Response Timeout and default SSL handshake timeout. Still why the request is waiting too long as I have mentioned the timetaken is milliseconds.

@violetagg
Copy link
Member

@uvarajk Without knowing the scenario and how the client is used is very hard to say what causes this.

@uvarajk
Copy link
Author

uvarajk commented Apr 29, 2021

@violetagg If I understand your question right, below is my explanation.
the flow is
Client Request --> Out Microservice Application --> Network Layer(NAT) --> Downstream System (we are using the netty client to establish connection to the downstream system through the NATed IP).
It's purely a Rest Service call using Spring Boot web client.

@violetagg
Copy link
Member

@uvarajk OK so you are using WebClient. Can you give some coding, how you use it?

@uvarajk
Copy link
Author

uvarajk commented Apr 29, 2021

@violetagg please find the attached code for reference and starts with the below get method

public Mono get(String endpoint, Map<String, String> headers, Class responseClass)

WebClientConnector.txt

@violetagg
Copy link
Member

@uvarajk You are creating WebClient always, why don't you share it? Is it possible that you overload the downstream system. Do you use blocking or non-blocking on the downstream system? Did you check the load on the downstream system?

@uvarajk
Copy link
Author

uvarajk commented Apr 29, 2021

@violetagg I think spring internally uses singleton for the webclient.

However, the concern is why the timeout's are not considered, although we have connection timeout, read timeout and response timeout configured. Whether the code needs correction for the timeout to be effective.
Really scratching the head :o)

@violetagg
Copy link
Member

@uvarajk can you add logging.level.reactor.netty=debug to application.properties and then provide here logs for some request where the responseTimeout is not considered. You can correlated using connection id i.e. id:68812135-1, L:/0:0:0:0:0:0:0:1:52348 - R:/0:0:0:0:0:0:0:1:52346

18:33:26.906 [reactor-http-nio-3] DEBUG reactor.netty.http.client.HttpClient - [id:68812135-1, L:/0:0:0:0:0:0:0:1:52348 - R:/0:0:0:0:0:0:0:1:52346] WRITE: 145B

@violetagg violetagg added for/user-attention This issue needs user attention (feedback, rework, etc...) and removed status/need-triage A new issue that still need to be evaluated as a whole labels Apr 30, 2021
@uvarajk
Copy link
Author

uvarajk commented Apr 30, 2021

@violetagg Sorry, it took a bit more time to fetch those logs since it is from production. This occurs only after 4 to 5 hours after deployment.
Please find the attached logs that I have captured enabling the debug mode for netty.
I could see that one transaction Taken <<83275>> milliseconds to retrieve the data. Provided the Start and End below for reference.
nettyDebugLogs.txt

2021-04-30 09:32:18.840 : [id:18009796, L:/10.200.61.65:42856 - R:172.25.0.12/172.25.0.12:443] Registering pool release on close event for channel
2021-04-30 09:33:40.189 : [id:18009796, L:/10.200.61.65:42856 - R:172.25.0.12/172.25.0.12:443] READ COMPLETE
2021-04-30 09:33:42.737 : [id:18009796, L:/10.200.61.65:42856 ! R:172.25.0.12/172.25.0.12:443] UNREGISTERED

@uvarajk
Copy link
Author

uvarajk commented May 3, 2021

@violetagg any clue why the timeout's are being ignored. Have you checked the Debug logs.

@violetagg violetagg removed the for/user-attention This issue needs user attention (feedback, rework, etc...) label May 4, 2021
@violetagg violetagg self-assigned this May 5, 2021
@violetagg
Copy link
Member

@uvarajk If you do not increase the I/O workers count reactor.netty.ioWorkerCount=128, do you observe issue with the timeout?

@violetagg violetagg added the for/user-attention This issue needs user attention (feedback, rework, etc...) label May 12, 2021
@uvarajk
Copy link
Author

uvarajk commented May 16, 2021

@violetagg Yes we see the issue, the timeout's are not considered even without the ioWorkerCount configuration.

@violetagg
Copy link
Member

@uvarajk Is it possible that you run blocking/log running operations in the event loop? For blocking operations you can use BlockHound in order to detect them.

@uvarajk
Copy link
Author

uvarajk commented May 21, 2021

@violetagg Thanks, will check on that as I've never come across the BlockHound. Will try to implement.
Is it that will work just by adding BlockHound.install(); in the SpringBoot main method.

@violetagg
Copy link
Member

Is it that will work just by adding BlockHound.install(); in the SpringBoot main method.

Yes this should be enough

@violetagg
Copy link
Member

@uvarajk Were you able to install BlockHound?

@violetagg
Copy link
Member

@uvarajk Let me close this, if you are able to provide the requested information we can reopen the issue.

@uvarajk
Copy link
Author

uvarajk commented Jun 7, 2021

@violetagg please keep it open. I've just got the approval to include the BlockHound

@violetagg violetagg reopened this Jun 7, 2021
@violetagg
Copy link
Member

@uvarajk Were you able to install BlockHound?

@violetagg
Copy link
Member

@uvarajk Let me close this, if you are able to provide the requested information we can reopen the issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
for/user-attention This issue needs user attention (feedback, rework, etc...) type/bug A general bug
Projects
None yet
Development

No branches or pull requests

2 participants