refactor: migrate Druid HTTP client from Netty 3 to 4#19567
Open
clintropolis wants to merge 4 commits into
Open
refactor: migrate Druid HTTP client from Netty 3 to 4#19567clintropolis wants to merge 4 commits into
clintropolis wants to merge 4 commits into
Conversation
changes:
* `NettyHttpClient` now uses netty 4 (along with all associated configs, handlers, etc)
* allocator: uses Netty 4.2's default `AdaptiveByteBufAllocator` (pooled internally, direct by default), replacing Netty 3's unpooled heap allocator. Operators running close to `-XX:MaxDirectMemorySize` may need to bump it. Allocator type is swappable at runtime via `-Dio.netty.allocator.type={adaptive,pooled,unpooled}` (operators might wish to experiment on upgrade).
* fix a leak in `SequenceInputStreamResponseHandler`, copy chunk bytes to `byte[]` before queueing (was wrapping a pooled `ByteBuf` that `SimpleChannelInboundHandler` releases after `channelRead0` returns)
* lots of import changes to swap to netty 4
* cleanup pom, licenses.yaml, OWASP suppressions
* added 'paranoid' leak detection to embedded-tests
| Assert.assertEquals( | ||
| json, | ||
| StringUtils.fromUtf8(ByteStreams.toByteArray(new ChannelBufferInputStream(request.getContent()))) | ||
| StringUtils.fromUtf8(ByteStreams.toByteArray(new ByteBufInputStream(request.getContent()))) |
| Assert.assertEquals( | ||
| "{\"foo\":3}", | ||
| StringUtils.fromUtf8(ByteStreams.toByteArray(new ChannelBufferInputStream(request.getContent()))) | ||
| StringUtils.fromUtf8(ByteStreams.toByteArray(new ByteBufInputStream(request.getContent()))) |
FrankChen021
reviewed
Jun 9, 2026
FrankChen021
left a comment
Member
There was a problem hiding this comment.
| Severity | Findings |
|---|---|
| P0 | 0 |
| P1 | 1 |
| P2 | 0 |
| P3 | 0 |
| Total | 1 |
Reviewed 164 of 164 changed files.
This is an automated review by Codex GPT-5.5
| method, | ||
| urlFile.isEmpty() ? "/" : urlFile | ||
| urlFile.isEmpty() ? "/" : urlFile, | ||
| request.hasContent() ? request.getContent() : Unpooled.EMPTY_BUFFER |
Member
There was a problem hiding this comment.
[P1] Do not hand the request-owned ByteBuf to Netty
This passes the Request's stored ByteBuf directly into a Netty 4 FullHttpRequest. Netty's HTTP encoder releases FullHttpMessages after encoding, and may also advance the content reader index, so after the first send request.getContent() can be released or consumed. Kerberos' 401 retry path calls request.copy() after the first response, which can then fail or resend an empty body for authenticated POSTs. Please build the outbound request from a retained duplicate/copy, or make Request store bytes and create a fresh ByteBuf per send.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Third time's a charm? Previous attempts #12032 and #14479 afaict both got stuck trying to resolve failures in the integration tests (the old docker based integration tests). I started fresh for this attempt, but the embedded-tests that have replaced our old frameworks made it a lot easier to dig in and determine the problem (and credit to prior attempts which were used as reference for comparison)
The root problem with prior attempts that resulted in the observed hanging is that Netty 4's
setAutoRead(true)is just a config flag. Netty 3'ssetReadable(true)also kicked an immediate read. For Druid's chunked responses where the server flushes the status line in one TCP write and the body in subsequent writes (e.g.QueryResource), the body bytes sit on the wire for ~5 minutes until connection close. Fix is a one-linectx.read()afterhandleResponseinNettyHttpClientwith a comment explaining the semantic difference.Allocator and direct memory
The HTTP client previously used Netty 3's
HeapChannelBufferFactorywhich is unpooled heap buffers. Netty 4.2's default isAdaptiveByteBufAllocator(pooled internally viaAdaptivePoolingAllocator, direct by default). This is a substantive change in memory characteristics:running close to their
-XX:MaxDirectMemorySizebudget may need to bump it.higher than under Netty 3 even when idle, normal pooled-allocator behavior.
This PR is using Netty 4.2's default allocator (
adaptive) rather than pinning to thePooledByteBufAllocatorthat was the Netty 4 default until 4.2 (or the unpooled heap behavior of Netty 3). Since Druid operators are coming from heap-unpooled (Netty 3), neither Netty 4 choice represents a "known good baseline"; both are new characteristics for this workload. However, using pooled off-heap buffers of some kind should offer a performance improvement so it seems worth the risk.The allocator can be swapped without a code change:
-Dio.netty.allocator.type=…adaptive(default in Netty 4.2)pooledPooledByteBufAllocatorunpooled-Dio.netty.noPreferDirect=trueto also stay on-heapOther useful knobs:
io.netty.noPreferDirectfalsetrue⇒ pooled but heap rather than directio.netty.allocator.maxOrderio.netty.maxDirectMemory-XX:MaxDirectMemorySizeio.netty.leakDetection.levelsimpledisabled/simple/advanced/paranoidOther stuff
InputStreamHolder.fromChannelBuffer(ByteBuf, long)now copies bytes synchronously instead of wrapping aByteBufInputStreamover a pooled buffer thatSimpleChannelInboundHandlerreleases afterchannelRead0returns. The prior PRs hit CodeQL leak warnings onSequenceInputStreamResponseHandler.java; that handler is now leak-safe too, and so areDirectDruidClientandDataServerResponseHandlerwhich all funnel through the same helper.