fix(tracing) [1/2]: re-enable HTTP keepalive in SGPAsyncTracingProcessor via lazy per-loop client#341
Closed
mohammadatallah-scale wants to merge 1 commit into
Closed
Conversation
…a lazy per-loop client The previous implementation built AsyncSGPClient with `httpx.Limits(max_keepalive_connections=0)` to dodge "bound to a different event loop" errors that surfaced under sync-ACP / per-request event loops. Disabling keepalive paid a fresh TCP+TLS handshake on every span event. This change moves client construction out of __init__ and lazy-inits it on the running event loop on first use, with automatic recreation when the loop changes (tracked via `_client_owned_at_loop`). With the client always bound to the loop using it, httpx defaults can be left alone, so connection pooling and keepalive work normally. Behavior preserved: - Per-event upsert_batch(items=[one]) is unchanged in this PR. Batching is intentionally left for a follow-up so it can be reviewed independently. - Externally injected clients (`processor.sgp_async_client = X`) are always respected and never replaced; `_client_owned_at_loop` stays None for those. - Tests for memory-leak / span-lifecycle behavior pass unchanged. New tests: - test_client_constructed_without_disabling_keepalive — verifies AsyncSGPClient is no longer constructed with a custom http_client that disables keepalive. - test_owned_client_recreated_after_loop_swap — verifies that an owned client is replaced when the running loop changes. - test_injected_client_preserved — verifies that an externally assigned client is never auto-replaced. This is part 1 of the OVE-2 split. Part 2 (batching) builds on this.
This was referenced May 4, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Part 1 of 2 for OVE-2 (https://linear.app/scale-epd/issue/OVE-2). Followed by a separate PR adding batched flushing on top of this branch.
Bug
SGPAsyncTracingProcessorpreviously builtAsyncSGPClientin__init__withhttpx.Limits(max_keepalive_connections=0)to dodge "bound to a different event loop" errors that surface under sync-ACP / per-request event loops. Disabling keepalive paid a fresh TCP + TLS handshake on every span event.Fix
Move client construction out of
__init__and lazy-init it on the running loop on first use. Track the loop the processor-owned client was constructed on (_client_owned_at_loop); recreate the client when that loop is no longer current. With the client always bound to the loop using it, httpx defaults can stay, so connection pooling and keepalive work normally.Behavior preserved
upsert_batch(items=[one])is unchanged. Batching is intentionally not part of this PR so it can be reviewed independently in part 2.processor.sgp_async_client = X) are never replaced;_client_owned_at_loopstays None for those.New tests
test_client_constructed_without_disabling_keepalive—AsyncSGPClientis no longer constructed with a customhttp_clientthat disables keepalive.test_owned_client_recreated_after_loop_swap— owned client is replaced when the running loop changes.test_injected_client_preserved— externally assigned client is never auto-replaced.Test plan
pytest tests/lib/core/tracing/processors/test_sgp_tracing_processor.py(9/9 pass)pytest tests/lib/core/tracing/(16/16 pass)ruff checkcleanruff format --checkcleanpython -c "import agentex"succeedsRelated