Skip to content

feat: unified chat handler#92

Merged
bzp2010 merged 1 commit into
mainfrom
bzp/feat-unified-chat-handler
May 6, 2026
Merged

feat: unified chat handler#92
bzp2010 merged 1 commit into
mainfrom
bzp/feat-unified-chat-handler

Conversation

@bzp2010
Copy link
Copy Markdown
Collaborator

@bzp2010 bzp2010 commented May 6, 2026

Summary by CodeRabbit

  • Refactor
    • Restructured proxy handler architecture to use an adapter-based pattern for improved code organization and maintainability.
    • Consolidated handler implementations (chat completions, messages, responses) under a unified framework.
    • Enhanced internal visibility controls for better encapsulation across handler modules.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 6, 2026

📝 Walkthrough

Walkthrough

This PR refactors the proxy handlers from direct Axum functions to a trait-based adapter pattern. A new FormatHandlerAdapter trait is introduced alongside a generic format_handler function that orchestrates request processing, streaming, observability, and rate-limiting. Three adapters (ChatCompletionsAdapter, MessagesAdapter, ResponsesAdapter) implement the trait for different chat formats. Supporting infrastructure is updated—span properties are re-exported from a cleaner telemetry module, and StreamOutputCollector visibility is broadened to pub(crate) across modules. The router is rewired to dispatch through the generic handler with the appropriate adapter.


Changes

Adapter-Based Handler Refactoring

Layer / File(s) Summary
Trait Definition
src/proxy/handlers/format_handler.rs
FormatHandlerAdapter trait introduced with associated types (Format, Request, Response, StreamChunk, Error, Collector) and methods for span handling, chunk processing, serialization, and streaming behavior. Core format_handler function implements observability, authorization, rate-limiting, model resolution, and dispatch to regular or streaming response handlers. Helper functions for streaming usage observation, finalization, and SSE-based streaming response construction are added.
Adapter Implementations
src/proxy/handlers/chat_completions/mod.rs, src/proxy/handlers/messages/mod.rs, src/proxy/handlers/responses/mod.rs
Three adapters implementing FormatHandlerAdapter: ChatCompletionsAdapter for OpenAI chat format, MessagesAdapter for Anthropic messages, ResponsesAdapter for responses API format. Each adapter encapsulates model resolution, span properties, chunk span property application (first item applies all properties; subsequent items filter for finish/reason keys), stream item recording, and SSE serialization. Previous direct handlers and monolithic SSE helpers are removed.
Stream Output & Span Properties
src/proxy/handlers/chat_completions/span_attributes/*, src/proxy/handlers/messages/span_attributes/*, src/proxy/handlers/responses/span_attributes/*
StreamOutputCollector visibility elevated from module-scoped to pub(crate) across all three format modules. A new inherent method output_message_span_properties() is added to StreamOutputCollector to compute span properties from collected views. Re-exports updated to use telemetry module exports (chunk_span_properties, request_span_properties, response_span_properties) instead of deep crate paths. event_starts_output added to responses re-exports.
Module Declaration & Router Wiring
src/proxy/handlers/mod.rs, src/proxy/mod.rs
format_handler module is declared in handlers. Router endpoints for chat completions, messages, and responses are refactored to dispatch through format_handler with their respective adapters. The 32 MB body limit layer is moved to the messages route. Existing embeddings, authentication, and trace layers remain intact.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Possibly related PRs

  • api7/aisix#35: Replaces HookContext/hook-manager lifecycle with RequestContext + direct hook function calls, directly related to the request context flow introduced in the format_handler.
  • api7/aisix#39: Refactors Anthropic messages handler into a FormatHandlerAdapter, directly modifying the same MessagesAdapter code introduced in this PR.
  • api7/aisix#65: Modifies chat_completions and messages span-attributes surface (StreamOutputCollector, telemetry span-property helpers), directly touching the same utilities reorganized in this PR.

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)

Check name Status Explanation Resolution
Security Check ❌ Error Messages and Responses adapters expose raw upstream provider error responses to clients via SSE. GatewayError.to_string() includes full provider response body with potentially sensitive data. Sanitize error messages before SSE serialization. Extract status/provider from GatewayError::Provider, redact the body field. Implement custom error handling that doesn't expose raw upstream responses.
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: unified chat handler' accurately summarizes the primary change: refactoring the chat handlers (chat_completions, messages, responses) to use a unified adapter-based pattern (FormatHandlerAdapter).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
E2e Test Quality Review ✅ Passed E2E test coverage complete: 67 integration tests covering all endpoints. Error handling correct. No unsafe patterns or concurrency issues.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bzp/feat-unified-chat-handler

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
src/proxy/handlers/responses/mod.rs (1)

89-91: 💤 Low value

The serialize_stream_item override is redundant.

The implementation is identical to the default provided by FormatHandlerAdapter. Since Self::Format = ResponsesApiFormat, the default would produce the same result.

♻️ Remove redundant override
-    fn serialize_stream_item(chunk: &Self::StreamChunk) -> SseEvent {
-        serialize_stream_event(chunk)
-    }
-
     fn stream_error_event(error: &GatewayError) -> Option<SseEvent> {

The serialize_stream_event helper can also be removed if only used here, or kept if used elsewhere.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/proxy/handlers/responses/mod.rs` around lines 89 - 91, Remove the
redundant override of serialize_stream_item in the impl for FormatHandlerAdapter
(where Self::Format = ResponsesApiFormat) — delete the fn
serialize_stream_item(chunk: &Self::StreamChunk) -> SseEvent {
serialize_stream_event(chunk) } so the default implementation from
FormatHandlerAdapter is used; if the helper function serialize_stream_event is
only referenced by that removed method, remove serialize_stream_event as well,
otherwise leave it in place for other callers.
src/proxy/handlers/format_handler.rs (1)

255-265: 💤 Low value

Consider extracting stream state into a named struct for readability.

The 8-element tuple state in unfold is functional but makes the code harder to follow. A named struct would improve clarity, though this is optional given the localized scope.

♻️ Optional refactor to improve readability
struct StreamState<A: FormatHandlerAdapter> {
    stream: ChatResponseStream<AdapterFormat<A>>,
    span: Span,
    request_ctx: RequestContext,
    should_terminate: bool,
    saw_item: bool,
    usage_rx: Option<oneshot::Receiver<Usage>>,
    output_collector: AdapterCollector<A>,
    first_output_arrived: bool,
}

This would make the unfold closure parameters and return values more self-documenting.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/proxy/handlers/format_handler.rs` around lines 255 - 265, The 8-tuple
used as the state parameter to futures::stream::unfold is hard to read;
introduce a small named struct (e.g., StreamState<A: FormatHandlerAdapter>) that
contains the fields currently in the tuple (stream:
ChatResponseStream<AdapterFormat<A>>, span: Span, request_ctx: RequestContext,
should_terminate: bool, saw_item: bool, usage_rx:
Option<oneshot::Receiver<Usage>>, output_collector: AdapterCollector<A>,
first_output_arrived: bool) and replace the tuple in the unfold call and closure
signature with that struct to make the code self-documenting and improve
maintainability.
src/proxy/handlers/messages/mod.rs (1)

105-107: 💤 Low value

The serialize_stream_item override is redundant.

Same as in ResponsesAdapter, this implementation duplicates the default behavior since Self::Format = AnthropicMessagesFormat.

♻️ Remove redundant override
-    fn serialize_stream_item(chunk: &Self::StreamChunk) -> SseEvent {
-        serialize_stream_event(chunk)
-    }
-
     fn stream_error_event(error: &GatewayError) -> Option<SseEvent> {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/proxy/handlers/messages/mod.rs` around lines 105 - 107, The method
override serialize_stream_item in the impl for MessagesAdapter is redundant
because it simply calls serialize_stream_event and Self::Format is
AnthropicMessagesFormat (same default behavior as ResponsesAdapter); remove the
serialize_stream_item function from the impl so the default implementation is
used, leaving serialize_stream_event and the impl unchanged otherwise to avoid
duplication.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@src/proxy/handlers/format_handler.rs`:
- Around line 255-265: The 8-tuple used as the state parameter to
futures::stream::unfold is hard to read; introduce a small named struct (e.g.,
StreamState<A: FormatHandlerAdapter>) that contains the fields currently in the
tuple (stream: ChatResponseStream<AdapterFormat<A>>, span: Span, request_ctx:
RequestContext, should_terminate: bool, saw_item: bool, usage_rx:
Option<oneshot::Receiver<Usage>>, output_collector: AdapterCollector<A>,
first_output_arrived: bool) and replace the tuple in the unfold call and closure
signature with that struct to make the code self-documenting and improve
maintainability.

In `@src/proxy/handlers/messages/mod.rs`:
- Around line 105-107: The method override serialize_stream_item in the impl for
MessagesAdapter is redundant because it simply calls serialize_stream_event and
Self::Format is AnthropicMessagesFormat (same default behavior as
ResponsesAdapter); remove the serialize_stream_item function from the impl so
the default implementation is used, leaving serialize_stream_event and the impl
unchanged otherwise to avoid duplication.

In `@src/proxy/handlers/responses/mod.rs`:
- Around line 89-91: Remove the redundant override of serialize_stream_item in
the impl for FormatHandlerAdapter (where Self::Format = ResponsesApiFormat) —
delete the fn serialize_stream_item(chunk: &Self::StreamChunk) -> SseEvent {
serialize_stream_event(chunk) } so the default implementation from
FormatHandlerAdapter is used; if the helper function serialize_stream_event is
only referenced by that removed method, remove serialize_stream_event as well,
otherwise leave it in place for other callers.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2ba95f1f-e200-4d36-914b-e3c7af5a9036

📥 Commits

Reviewing files that changed from the base of the PR and between eaaed3e and c7b7a8e.

📒 Files selected for processing (12)
  • src/proxy/handlers/chat_completions/mod.rs
  • src/proxy/handlers/chat_completions/span_attributes/mod.rs
  • src/proxy/handlers/chat_completions/span_attributes/stream_output.rs
  • src/proxy/handlers/format_handler.rs
  • src/proxy/handlers/messages/mod.rs
  • src/proxy/handlers/messages/span_attributes/mod.rs
  • src/proxy/handlers/messages/span_attributes/stream_output.rs
  • src/proxy/handlers/mod.rs
  • src/proxy/handlers/responses/mod.rs
  • src/proxy/handlers/responses/span_attributes/mod.rs
  • src/proxy/handlers/responses/span_attributes/stream_output.rs
  • src/proxy/mod.rs
💤 Files with no reviewable changes (3)
  • src/proxy/handlers/messages/span_attributes/mod.rs
  • src/proxy/handlers/responses/span_attributes/mod.rs
  • src/proxy/handlers/chat_completions/span_attributes/mod.rs

@bzp2010 bzp2010 merged commit db7317b into main May 6, 2026
3 checks passed
@bzp2010 bzp2010 deleted the bzp/feat-unified-chat-handler branch May 6, 2026 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant