feat: replace hand-rolled LLM clients with pgedge-go-llm-lib (PR 1 of 4)#172
feat: replace hand-rolled LLM clients with pgedge-go-llm-lib (PR 1 of 4)#172dpage wants to merge 13 commits into
Conversation
First step of LLM library migration (PR 1 of 4). Pinned to a commit on main pending v0.1.0 release upstream. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds toLibMessages, toLibTools, and fromLibContent for translating
between our interface{}-content chat.Message form and the library's
typed llm.Message form. Pure functions with full unit coverage;
no consumers yet — wrapper wires them in next.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the inline embedding.LogLLMRequestTrace / LogLLMResponseTrace calls that the old hand-rolled clients made; the new wrapper plugs this into llm.Options.HTTPClient when debug is on. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Don't discard a successful response when reading the response body for tracing fails. Log the body-read error and return the response with an empty body, preserving the caller's success path. - Add a test verifying nothing is logged when the log level is below Debug (the hot-path fast-bail behaviour). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces libClient, a library-backed implementation of LLMClient, and rewires NewAnthropicClient/NewOpenAIClient/NewOllamaClient to return it. The old hand-rolled clients are renamed *OLD and remain in place; they will be deleted along with their tests in subsequent tasks of this PR. NewOllamaClient signature changes from (LLMClient) to (LLMClient, error) because the library validates baseURL at construction. The four call sites (2 in client.go, 2 in llmproxy/proxy.go) are updated. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The named import of the anthropic provider package (for WithToolCaching) already triggers init(). The adjacent blank import is redundant. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All three providers now go through pgedge-go-llm-lib via the libClient wrapper added in the previous commit. ~1500 lines removed. internal/chat/llm.go shrinks from ~1810 lines to ~278 lines; the LLMClient interface, types, system prompt, and constructors remain. Test cleanup follows in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Deletes tests that referenced the deleted hand-rolled clients (anthropicClient, openaiClient, ollamaClient) and the private extract*ErrorMessage helpers. Adds wrapper-level OpenAI and Ollama integration tests against httptest servers via Options.BaseURL. The remaining wrapper tests (TestLibClient_*, TestToLib*, TestFromLibContent_*, TestTracingRoundTripper_*) exercise the translation, system-prompt injection, read-only mode, debug token logging, and round-tripper tracing — the parts of the wrapper we own. Wire-level behaviour is the library's responsibility. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: ⛔ Files ignored due to path filters (1)
📒 Files selected for processing (10)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (3)
WalkthroughThis PR migrates the LLM provider implementations from hand-rolled HTTP clients to a shared library ( ChangesLLM Library Migration
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Up to standards ✅🟢 Issues
|
| Category | Results |
|---|---|
| Complexity | 4 medium |
🟢 Metrics 109 complexity · -28 duplication
Metric Results Complexity 109 Duplication -28
NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.
The pgedge-go-llm-lib dependency requires Go 1.26; go mod tidy
bumped our go directive accordingly, so CI workflows and Dockerfiles
need to match.
Also replace strings.Title (deprecated since Go 1.18, caught by the
CLI Client lint check) with a small provider-display map. The new
output is also more accurate ("OpenAI" vs "Openai").
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
.github/workflows/ci-cli-client.yml (1)
30-32:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate cache key to match Go 1.26.
The Go toolchain version was updated to 1.26 (Line 21), but the cache key still references
go-1.25. This mismatch will prevent cache hits and slow down CI builds.🔧 Proposed fix
- key: ${{ runner.os }}-go-1.25-${{ hashFiles('**/go.sum') }} + key: ${{ runner.os }}-go-1.26-${{ hashFiles('**/go.sum') }} restore-keys: | - ${{ runner.os }}-go-1.25- + ${{ runner.os }}-go-1.26-🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/ci-cli-client.yml around lines 30 - 32, The cache key strings in the GitHub Actions workflow (the "key:" and "restore-keys:" values currently containing "go-1.25") must be updated to match the Go toolchain version bump to 1.26; change occurrences of "go-1.25" to "go-1.26" so the cache key (key: ${{ runner.os }}-go-1.26-${{ hashFiles('**/go.sum') }}) and restore-keys ( ${{ runner.os }}-go-1.26-) align with the updated Go version..github/workflows/ci-server.yml (1)
48-50:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate cache key to match Go 1.26.
The Go toolchain version was updated to 1.26 (Line 39), but the cache key still references
go-1.25. This mismatch will prevent cache hits and slow down CI builds.🔧 Proposed fix
- key: ${{ runner.os }}-go-1.25-${{ hashFiles('**/go.sum') }} + key: ${{ runner.os }}-go-1.26-${{ hashFiles('**/go.sum') }} restore-keys: | - ${{ runner.os }}-go-1.25- + ${{ runner.os }}-go-1.26-🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/ci-server.yml around lines 48 - 50, The cache key strings still reference go-1.25 causing cache misses after the toolchain bump; update the 'key:' and matching 'restore-keys:' entries that contain "${{ runner.os }}-go-1.25-${{ hashFiles('**/go.sum') }}" and "${{ runner.os }}-go-1.25-" to use go-1.26 instead so the cache key matches the updated Go version.
🧹 Nitpick comments (2)
internal/chat/llm_test.go (2)
51-53: 💤 Low valueConsider using
len(gotReqBody) == 0for byte slice empty check.Converting to string works but is less idiomatic than checking length directly.
♻️ More idiomatic empty check
- if string(gotReqBody) == "" { - t.Fatal("server received no request body") - } + if len(gotReqBody) == 0 { + t.Fatal("server received no request body") + }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/chat/llm_test.go` around lines 51 - 53, Replace the string conversion empty check with a length check: instead of comparing string(gotReqBody) == "" in the test (in internal/chat/llm_test.go where gotReqBody is a []byte), use len(gotReqBody) == 0 to more idiomatically detect an empty byte slice; update the t.Fatal condition accordingly.
274-336: ⚖️ Poor tradeoffConsider comprehensive test coverage for OpenAI and Ollama.
The Anthropic tests validate system prompt injection, read-only mode, tool use translation, token usage behavior, and error handling. OpenAI and Ollama currently have only basic round-trip tests. If the shared
libClientimplementation is sufficiently validated via Anthropic tests and translation is covered inllm_translate_test.go, this may be acceptable. Otherwise, consider adding parallel tests for OpenAI and Ollama to ensure provider-specific wire formats work correctly across all scenarios.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/chat/llm_test.go` around lines 274 - 336, Tests TestLibClient_OpenAI_Chat_RoundTrip and TestLibClient_Ollama_Chat_RoundTrip only exercise basic round-trip behavior; add parallel provider-specific tests (mirroring the Anthropic cases) to validate system prompt injection, read-only mode, tool-use translation, token usage reporting and error handling for OpenAI and Ollama. Locate the test helpers and core logic in libClient (and translation behavior in llm_translate_test.go) and extend or duplicate the Anthropic test scenarios using the NewOpenAIClient and NewOllamaClient constructors and their Chat(...) calls (e.g., simulate provider wire formats, streaming vs non-streaming responses, errors and token/usage fields) so each provider’s request/response shaping is covered.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/chat/llm_tracing.go`:
- Around line 85-94: The tracing HTTP client created by newTracingHTTPClient
lacks a Timeout, so debug LLM requests can hang if the request context has no
deadline; update newTracingHTTPClient to set a sensible Timeout (e.g.,
30*time.Second) on the returned *http.Client, and ensure the time package is
imported; keep the existing Transport (tracingRoundTripper with provider, model,
inner, out) and only add the Timeout field to the client.
---
Outside diff comments:
In @.github/workflows/ci-cli-client.yml:
- Around line 30-32: The cache key strings in the GitHub Actions workflow (the
"key:" and "restore-keys:" values currently containing "go-1.25") must be
updated to match the Go toolchain version bump to 1.26; change occurrences of
"go-1.25" to "go-1.26" so the cache key (key: ${{ runner.os }}-go-1.26-${{
hashFiles('**/go.sum') }}) and restore-keys ( ${{ runner.os }}-go-1.26-) align
with the updated Go version.
In @.github/workflows/ci-server.yml:
- Around line 48-50: The cache key strings still reference go-1.25 causing cache
misses after the toolchain bump; update the 'key:' and matching 'restore-keys:'
entries that contain "${{ runner.os }}-go-1.25-${{ hashFiles('**/go.sum') }}"
and "${{ runner.os }}-go-1.25-" to use go-1.26 instead so the cache key matches
the updated Go version.
---
Nitpick comments:
In `@internal/chat/llm_test.go`:
- Around line 51-53: Replace the string conversion empty check with a length
check: instead of comparing string(gotReqBody) == "" in the test (in
internal/chat/llm_test.go where gotReqBody is a []byte), use len(gotReqBody) ==
0 to more idiomatically detect an empty byte slice; update the t.Fatal condition
accordingly.
- Around line 274-336: Tests TestLibClient_OpenAI_Chat_RoundTrip and
TestLibClient_Ollama_Chat_RoundTrip only exercise basic round-trip behavior; add
parallel provider-specific tests (mirroring the Anthropic cases) to validate
system prompt injection, read-only mode, tool-use translation, token usage
reporting and error handling for OpenAI and Ollama. Locate the test helpers and
core logic in libClient (and translation behavior in llm_translate_test.go) and
extend or duplicate the Anthropic test scenarios using the NewOpenAIClient and
NewOllamaClient constructors and their Chat(...) calls (e.g., simulate provider
wire formats, streaming vs non-streaming responses, errors and token/usage
fields) so each provider’s request/response shaping is covered.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5dc67c60-8d5d-4955-a2e8-3df4ec8223f4
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (15)
.github/workflows/ci-cli-client.yml.github/workflows/ci-server.yml.github/workflows/release.ymlDockerfile.cliDockerfile.serverdocs/changelog.mdgo.modinternal/chat/client.gointernal/chat/llm.gointernal/chat/llm_test.gointernal/chat/llm_tracing.gointernal/chat/llm_tracing_test.gointernal/chat/llm_translate.gointernal/chat/llm_translate_test.gointernal/llmproxy/proxy.go
- Update GitHub Actions cache keys from go-1.25 to go-1.26 to match the toolchain bump, restoring cache hits on CI. - Add a 120-second Timeout to the tracing http.Client so debug-mode LLM requests can't hang indefinitely if the context has no deadline. - Switch a test's empty-body check from string(b)=="" to len(b)==0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Stacked PRs in the LLM library migration (PR 2/3/4 of 4) target the previous PR's branch rather than main. Without this, GitHub Actions workflows don't fire on those PRs, leaving them effectively un-CI'd until they're rebased onto main. Widening the trigger list lets the stacked PRs see real CI from the start. This trigger pattern only matches branches that start with 'feat/llm-lib-' — narrow enough that it doesn't broaden CI for unrelated work. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream resolved the two known issues filed during PR 1's setup: - pgedge-go-llm-lib#13: anthropic.WithSystemCaching is now available. The libClient wrapper applies it on every Anthropic request, so long system prompts no longer pay full input-token cost each turn. (Tool caching is still applied separately when tools are present.) - pgedge-go-llm-lib#12: OpenAI's Responses API is now auto-selected for gpt-5-*, o1-*, and o3-* models. No wrapper changes are needed — the library routes them to /v1/responses transparently. Library SHA bumped from 23f6ef29ba95 to 17b5f11c6044. Changelog "Known Issues" block removed (both issues are no longer issues) and the two new behaviours called out under Changed. Adds a wrapper-level test confirming the cache_control marker reaches the last system block on the Anthropic wire. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Summary
internal/chat/llm.gowith thin wrappers around pgedge-go-llm-lib.chat.LLMClientinterface preserved; call sites ininternal/chat/client.goandinternal/llmproxy/proxy.goare essentially unchanged (onlyNewOllamaClientgained anerrorreturn).This is PR 1 of 4 in the LLM library migration. Spec lives in
.claude/specs/2026-05-27-llm-library-migration-design.md(gitignored / local-only).Follow-up PRs:
internal/embedding/).Known issues (filed upstream)
Test plan
make testpasses locally with no skips.claude-sonnet-4-20250514) returns expected response with correct token usage.gpt-4o-mini) returns expected response.🤖 Generated with Claude Code
Summary by CodeRabbit
Chores
Refactor
Bug Fixes
Documentation
Tests