Skip to content

feat: remove chat.LLMClient wrapper and use pgedge-go-llm-lib directly (PR 5 of 5)#176

Draft
dpage wants to merge 2 commits into
feat/llm-lib-pr4-streaming-geminifrom
feat/llm-lib-pr5-cleanup
Draft

feat: remove chat.LLMClient wrapper and use pgedge-go-llm-lib directly (PR 5 of 5)#176
dpage wants to merge 2 commits into
feat/llm-lib-pr4-streaming-geminifrom
feat/llm-lib-pr5-cleanup

Conversation

@dpage

@dpage dpage commented May 28, 2026

Copy link
Copy Markdown
Member

Summary

Final clean-up PR for the LLM library migration. Removes the temporary chat.LLMClient wrapper introduced in PR 1 (#172) and rewires the CLI chat client to use pgedge-go-llm-lib's llm.Client directly.

  • internal/chat/llm.go (the LLMClient interface, libClient wrapper, NewAnthropicClient / NewOpenAIClient / NewOllamaClient constructors, and helper functions) — deleted.
  • internal/chat/llm_translate.go (the toLibMessages / toLibTools / fromLibContent translation helpers from PR 1) — deleted.
  • internal/chat/llm_test.go and llm_translate_test.go — deleted.
  • New internal/chat/prompts.go hosts the system prompt constants and a buildSystemPrompt(readOnly bool) helper.
  • internal/chat/client.go and commands.go now use llm.Message, llm.ContentBlock, llm.ChatRequest, etc. directly. Anthropic prompt caching (system + tools) is applied inline via anthropic.WithSystemCaching / WithToolCaching.
  • internal/chat/conversations.go reads/writes []llm.Message. A Conversation.UnmarshalJSON shim migrates pre-PR-5 saved conversations (string content becomes a [{type: "text", text: "..."}] block; legacy user-role tool_result messages are promoted to RoleTool).

Net: -706 lines across the chat package.

This is PR 5 of 5. Based on PR 4 (#175); will need rebase onto main as PR 1-4 land.

Behaviour preserved

  • Read-only mode still appends the read-only safety prompt (via buildSystemPrompt(true)).
  • Anthropic prompt caching for system + tools still applied.
  • Debug-mode request/response tracing still routes through llm_tracing.go's round-tripper.
  • Model auto-selection, family-match preference logic, conversation save/load still work.
  • The CLI's external surface (chat.NewClient, chat.LoadConfig, chat.ConfigOverrides, chat.ClientVersion) is unchanged.

Test plan

  • make test passes (all packages green, including the chat package's rewritten tests + two new migration tests).
  • make lint reports 0 issues.
  • go run ./cmd/pgedge-pg-mcp-cli -help runs cleanly.
  • Browser/CLI manual smoke deferred to verification before merging.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Refactor

    • Chat client internals rewritten to use a unified LLM integration, improving model selection, token handling, and tool invocation.
  • New Features

    • Automatic migration of saved conversations to a structured format; legacy plain-text messages and tool results are normalized and roles corrected.
    • Debug HTTP tracing (when enabled) is preserved for troubleshooting.

Review Change Stack

Delete the chat.LLMClient interface, libClient wrapper, and the
translation helpers (toLibMessages, toLibTools, fromLibContent) that
were introduced as a temporary shim during the LLM-library migration.

The chat package now consumes llm.Client, llm.Message, llm.ContentBlock,
llm.ChatRequest, llm.ChatResponse, llm.Tool, and llm.TokenUsage
directly throughout. The client.go Client struct holds the LLM
behind a small package-local chatLLM interface (Chat + ListModels)
which the real library client satisfies structurally — this keeps
test mocks straightforward without re-introducing a wrapper layer
around the rest of the library surface (Embed, Rerank, ChatStream,
etc.). The Anthropic prompt-cache markers (system + tools) are now
applied directly in processQuery via anthropic.WithSystemCaching and
anthropic.WithToolCaching.

The chat system prompt constants move to a new prompts.go file
with a buildSystemPrompt(readOnly) helper.

Conversations saved by earlier versions are migrated transparently
on load: messages with a plain-string content field are wrapped as
a single text block, and tool-result messages saved with role
"user" are promoted to role "tool" to match the library shape.
The on-disk format written going forward is the typed content-block
form. A Conversation.UnmarshalJSON method and two new unit tests
cover the legacy-format paths.

llm_loglevel.go and llm_tracing.go remain — the CLI's debug mode
still wires the tracingRoundTripper into llm.Options.HTTPClient.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dpage

dpage commented May 28, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b4ef8455-b090-4a1b-92c2-66fc8533d079

📥 Commits

Reviewing files that changed from the base of the PR and between f234a90 and 879d1ea.

📒 Files selected for processing (4)
  • docs/changelog.md
  • internal/chat/client.go
  • internal/chat/client_test.go
  • internal/chat/commands.go
✅ Files skipped from review due to trivial changes (1)
  • docs/changelog.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • internal/chat/client_test.go
  • internal/chat/commands.go
  • internal/chat/client.go

Walkthrough

The PR removes the internal llm.go abstraction and migrates the chat client to use pgedge-go-llm-lib types directly. Conversations gain JSON migration for legacy saved formats, system prompts move to a new helper, and the client’s compaction, initialization, and agentic tool loop are refactored around llmlib message/content-block types.

Changes

LLM type migration and client refactoring

Layer / File(s) Summary
Integration tests with mock LLM client
internal/chat/client_integration_test.go
Replace mock with mockLLMClient returning *llmlib.ChatResponse; tests updated to use llmlib content blocks, stop reasons, and role/text extraction.
Conversation data model and JSON migration
internal/chat/conversations.go
Conversation stores []llmlib.Message and adds Username, Provider, Model; UnmarshalJSON migrates legacy plain-string content into typed text blocks and promotes legacy tool-result messages to RoleTool.
Conversation API and migration tests
internal/chat/conversations_test.go
Tests updated to use llmlib.Message; new regression tests validate legacy string-to-block migration and legacy tool-result promotion.
Changelog and prompts helper
docs/changelog.md, internal/chat/prompts.go
Add changelog bullets describing llmlib client switch and conversation-load migration; add buildSystemPrompt and read-only safety prompt constants.
Client struct, LLM initialization, and tool caching
internal/chat/client.go (lines 29–496)
Client fields change to chatLLM, []llmlib.Message, cached []llmlib.Tool; initializeLLM validates provider, probes models with a temp client, selects model, and constructs production llmlib.Client; add setTools/mcpToolsToLibTools.
Message compaction and token estimation
internal/chat/client.go (lines 617–929)
Compaction and public compaction structs updated to []llmlib.Message; token estimation iterates llmlib content blocks and preserves tool-use/tool-result adjacency and the first user message.
Tool execution and agentic message loop
internal/chat/client.go (lines 949–1167)
Tool loop rewritten around llmlib stop reasons and content blocks; assistant messages appended with raw response content; ToolUse blocks decoded and executed via mcp.CallTool; tool results appended as RoleTool ToolResultBlocks; certain successful tool outcomes refresh cached tools.
Chat commands, history load, and replay
internal/chat/commands.go
/prompt appends llmlib.UserText/AssistantText; refreshCapabilities uses setTools; /history load copies migrated messages into c.messages, recomputes DB writeability, and replays using extractTextFromContent and llmlib role constants.
Unit tests for token/compaction/message handling
internal/chat/client_test.go
Token estimation, has-tool-results, adjust-start-for-tool-pairs, local compaction, and tool-content token estimation tests refactored to llmlib message/content-blocks.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • pgEdge/pgedge-postgres-mcp#38: Tool-execution refactoring in the agentic loop is closely coupled; PR #38 introduces select_database_connection/list_database_connections tools which complement the tool refresh behavior.

Suggested reviewers

  • tsivaprasad
  • mmols
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the primary change: removal of the internal chat.LLMClient wrapper and direct use of pgedge-go-llm-lib types. This is the central focus of the changeset.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/llm-lib-pr5-cleanup

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@codacy-production

codacy-production Bot commented May 28, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 8 medium

Results:
8 new issues

Category Results
Complexity 8 medium

View in Codacy

🟢 Metrics 70 complexity · -5 duplication

Metric Results
Complexity 70
Duplication -5

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/chat/client_test.go (1)

520-539: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Tighten token assertions so they validate tool payload accounting.

These checks can pass from message overhead alone (+10), so they don’t prove tool_result/tool_use content is being counted.

Suggested test fix
-	if tokens < 10 {
+	if tokens <= 10 {
 		t.Errorf("Expected at least 10 tokens, got %d", tokens)
 	}
@@
-	if estimateTotalTokens(withToolUse) <= 0 {
+	if estimateTotalTokens(withToolUse) <= 10 {
 		t.Errorf("Expected non-zero tokens for tool_use input")
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/chat/client_test.go` around lines 520 - 539, The current test uses a
fixed threshold that can be satisfied by message overhead alone; instead compute
a baseline token count (using estimateTotalTokens on the original messages
variable or a minimal message) and then assert that
estimateTotalTokens(withToolUse) is strictly greater than that baseline to prove
the tool input is counted; use the same approach for any tool_result cases and
reference the estimateTotalTokens function and the withToolUse variable (and
tokens baseline) so the test validates an increase (optionally at least the JSON
input length) rather than a fixed minimum.
internal/chat/commands.go (1)

1070-1106: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Refresh currentDBWritable when loading a saved connection.

This path can switch the active database, but it leaves c.currentDBWritable at the previous session's value. The next request then uses stale safety state: Line 956 in internal/chat/client.go builds the wrong system prompt, and Line 1019 can skip write confirmation on a writable database.

Suggested fix
 	if conv.Connection != "" {
 		if _, current, err := c.mcp.ListDatabases(ctx); err == nil {
 			if current != conv.Connection {
 				if err := c.mcp.SelectDatabase(ctx, conv.Connection); err != nil {
 					c.ui.PrintError(fmt.Sprintf("Warning: Failed to restore database connection: %v", err))
 				} else {
 					// Refresh capabilities after database change
 					if err := c.refreshCapabilities(ctx); err != nil {
 						c.ui.PrintError(fmt.Sprintf("Warning: Failed to refresh capabilities: %v", err))
 					}
 				}
 			}
 		}
+		c.currentDBWritable = false
+		if databases, current, err := c.mcp.ListDatabases(ctx); err == nil {
+			for _, db := range databases {
+				if db.Name == current && db.AllowWrites {
+					c.currentDBWritable = true
+					break
+				}
+			}
+		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/chat/commands.go` around lines 1070 - 1106, When restoring a saved
connection in the conversation load path, the code calls c.mcp.SelectDatabase
and c.refreshCapabilities but does not update c.currentDBWritable, leaving stale
writable state; after a successful SelectDatabase (and preferably after
refreshCapabilities) set c.currentDBWritable to the correct writable state by
querying the MCP or using the refreshed capability result so that the client’s
writable flag (c.currentDBWritable) reflects the newly selected database and
prevents incorrect prompt construction in client.go (c.currentDBWritable used at
lines building system prompt and write-confirm logic). Ensure the update occurs
in the same branch where SelectDatabase succeeds and handle any errors
consistently with existing ui.PrintError behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/changelog.md`:
- Around line 82-89: Update the Unreleased notes so the two bullets consistently
state that the internal chat.LLMClient wrapper (internal/chat/llm.go and
internal/chat/llm_translate.go) was removed and the code now uses llm.Client
directly, while clarifying that the external llm.Client interface (not the
deleted chat wrapper) remains unchanged; replace the ambiguous “the LLMClient
interface is unchanged” sentence with wording that explicitly distinguishes the
removed chat.LLMClient wrapper from the unchanged llm.Client API and keep the
reference to llm.Options.HTTPClient tracing.

In `@internal/chat/client.go`:
- Around line 1007-1012: The code currently swallows json.Unmarshal errors and
proceeds to call the tool with an empty input; change the flow so when
json.Unmarshal(toolUse.Input, &input) returns an error you do NOT proceed to
invoke the tool. Specifically, in the block around json.Unmarshal and
c.ui.PrintToolExecution: check the error from json.Unmarshal, surface it to the
UI (e.g., via c.ui.PrintToolExecution or an error-specific UI method) and
return/abort the tool execution instead of calling the MCP tool; additionally,
for tools like query_database validate that the parsed input contains the
required "query" field before invoking the tool and abort with a clear UI error
if missing or if JSON was malformed.

---

Outside diff comments:
In `@internal/chat/client_test.go`:
- Around line 520-539: The current test uses a fixed threshold that can be
satisfied by message overhead alone; instead compute a baseline token count
(using estimateTotalTokens on the original messages variable or a minimal
message) and then assert that estimateTotalTokens(withToolUse) is strictly
greater than that baseline to prove the tool input is counted; use the same
approach for any tool_result cases and reference the estimateTotalTokens
function and the withToolUse variable (and tokens baseline) so the test
validates an increase (optionally at least the JSON input length) rather than a
fixed minimum.

In `@internal/chat/commands.go`:
- Around line 1070-1106: When restoring a saved connection in the conversation
load path, the code calls c.mcp.SelectDatabase and c.refreshCapabilities but
does not update c.currentDBWritable, leaving stale writable state; after a
successful SelectDatabase (and preferably after refreshCapabilities) set
c.currentDBWritable to the correct writable state by querying the MCP or using
the refreshed capability result so that the client’s writable flag
(c.currentDBWritable) reflects the newly selected database and prevents
incorrect prompt construction in client.go (c.currentDBWritable used at lines
building system prompt and write-confirm logic). Ensure the update occurs in the
same branch where SelectDatabase succeeds and handle any errors consistently
with existing ui.PrintError behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 4015f4f3-74be-4d48-a366-86b5f077453d

📥 Commits

Reviewing files that changed from the base of the PR and between 3f6070f and f234a90.

📒 Files selected for processing (12)
  • docs/changelog.md
  • internal/chat/client.go
  • internal/chat/client_integration_test.go
  • internal/chat/client_test.go
  • internal/chat/commands.go
  • internal/chat/conversations.go
  • internal/chat/conversations_test.go
  • internal/chat/llm.go
  • internal/chat/llm_test.go
  • internal/chat/llm_translate.go
  • internal/chat/llm_translate_test.go
  • internal/chat/prompts.go
💤 Files with no reviewable changes (4)
  • internal/chat/llm_translate_test.go
  • internal/chat/llm_test.go
  • internal/chat/llm.go
  • internal/chat/llm_translate.go

Comment thread docs/changelog.md Outdated
Comment thread internal/chat/client.go Outdated
- commands.go: refresh c.currentDBWritable after loading a saved
  conversation that switches database. Previously the writable flag
  was left at the previous session's value, causing the next
  request to build the wrong system prompt and potentially skip
  write confirmation on a writable database.
- client.go: fail the tool call when json.Unmarshal of toolUse.Input
  errors out, rather than silently invoking the tool with an empty
  input map. Returns the parse error to the model as a tool_result
  with is_error=true so the loop continues sensibly.
- client_test.go: tighten TestEstimateTotalTokensWithToolContent to
  compare against a same-shape empty-content baseline, so the test
  proves tool_use/tool_result payloads are being counted rather
  than passing on structural overhead alone.
- docs/changelog.md: clarify that the removed wrapper is the
  internal chat.LLMClient interface, distinct from the library's
  llm.Client API which is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dpage

dpage commented May 28, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant