Skip to content

feat: streaming chat, Gemini support, and embedding wrapper cleanup (PR 4 of 4)#175

Open
dpage wants to merge 9 commits into
feat/llm-lib-pr3-proxy-adoptionfrom
feat/llm-lib-pr4-streaming-gemini
Open

feat: streaming chat, Gemini support, and embedding wrapper cleanup (PR 4 of 4)#175
dpage wants to merge 9 commits into
feat/llm-lib-pr3-proxy-adoptionfrom
feat/llm-lib-pr4-streaming-gemini

Conversation

@dpage

@dpage dpage commented May 27, 2026

Copy link
Copy Markdown
Member

Summary

  • Streaming chat in the web UI. ChatInterface.jsx now POSTs to /api/llm/v1/chat/stream and renders the response incrementally as SSE chunks arrive. A new sseChat helper in web/src/utils/ parses the wire format and assembles a final response shape identical to the non-streaming endpoint, so the agentic tool-call loop downstream of the fetch site is unchanged.
  • Gemini provider support. New gemini_api_key / gemini_api_key_file config fields with the same priority chain as the other providers (env > file > config). Registered in the proxy's provider map when set.
  • Tool consumers bypass the embedding wrapper. search_knowledgebase, generate_embedding, and similarity_search now construct an llm.Client directly via a small internal/tools/embed_client.go helper. The internal/embedding/ package is deleted entirely (~510 lines removed).

This is PR 4 of 4 in the LLM library migration. Based on PR 3 (#174). Will need rebase onto main after PR 1, PR 2, PR 3 land.

Deferred to a follow-up PR

The original PR 4 plan also included:

  • Rewriting internal/chat/client.go (the CLI) to use llm.Client directly instead of the chat.LLMClient wrapper.
  • Deleting chat.LLMClient, libClient, and the translation helpers added in PR 1.

These remain in the codebase because they only affect the CLI and don't change user-visible behaviour. A separate clean-up PR will land them.

Test plan

  • make test passes (all 10 server packages, plus the unchanged chat package).
  • Web client npm run build succeeds; 93/93 unit tests pass (7 new for sseChat).
  • Browser smoke deferred to manual verification before merging — streaming behaviour is only meaningful in a real browser session.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Google Gemini as an available LLM provider option
    • Web chat interface now streams assistant responses incrementally for real-time feedback
    • Assistant thinking state displays partial streamed content as responses are generated
  • Changed

    • Improved embedding client initialization across search and similarity features

Review Change Stack

@coderabbitai

coderabbitai Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 95cd6d4e-500e-4e27-b149-291992b3d86c

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

This PR integrates three major changes: Google Gemini support as an LLM provider, Server-Sent Events streaming for the web chat interface, and a refactoring that migrates embedding tools from a dedicated package to direct llm.Client usage.

Changes

Gemini LLM Provider Support

Layer / File(s) Summary
Gemini configuration fields and environment loading
internal/config/config.go
LLMConfig adds GeminiAPIKey and GeminiAPIKeyFile struct fields with YAML tags; applyEnvironmentVariables populates GeminiAPIKey from PGEDGE_GEMINI_API_KEY or GEMINI_API_KEY and falls back to readAPIKeyFromFile when the file is set.
HTTP handler registration and error messaging
cmd/pgedge-pg-mcp-svr/main.go
HTTP LLM proxy handler conditionally registers a gemini provider when cfg.LLM.GeminiAPIKey is set; error message for missing provider now mentions gemini_api_key.
Changelog documentation
docs/changelog.md
Unreleased section documents Google Gemini support with configuration keys and environment variable names.

Web Chat SSE Streaming

Layer / File(s) Summary
SSE streaming utility with comprehensive tests
web/src/utils/sseChat.js, web/src/utils/sseChat.test.js
New sseChat helper POSTs to /api/llm/v1/chat/stream, parses SSE frames, and assembles { content, stop_reason, usage } by accumulating text blocks and tool-use chunks; optional onTextChunk callback fires on each text fragment. Tests validate text concatenation, tool assembly, error events, HTTP error responses, token forwarding, frame reassembly, and done.content precedence.
Chat interface streaming integration and message rendering
web/src/components/ChatInterface.jsx, web/src/components/Message.jsx
ChatInterface.jsx imports sseChat and replaces non-streaming /api/llm/v1/chat calls in handleSend and handlePromptExecute with streaming requests; partial text chunks update the assistant "thinking" message in real time; error handling routes AbortError, session expiration (401), and rate limits through existing retry logic. Message.jsx now renders partial streamed content alongside ThinkingIndicator when thinking is true.
Streaming changelog entry
docs/changelog.md
Unreleased section documents web chat adoption of SSE streaming endpoint and embedding tool client changes.

Embedding Tool Refactoring

Layer / File(s) Summary
New embedding client construction helper
internal/tools/embed_client.go
New embedClientConfig struct and newEmbedClient helper validate provider API keys, apply defaults for model names and Ollama base URL, construct an llm.Client via llm.NewClient, and return the client with resolved model name.
Embedding tools migrate to newEmbedClient
internal/tools/generate_embedding.go, internal/tools/search_knowledgebase.go, internal/tools/similarity_search.go
Three embedding tools remove internal/embedding imports and replace embedding.NewProvider(...) calls with newEmbedClient(...), obtaining embedding clients and resolved model names; provider/model metadata now sourced from config values instead of provider instance methods.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the three main changes in the PR: streaming chat support, Gemini provider support, and embedding wrapper cleanup.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/llm-lib-pr4-streaming-gemini

Comment @coderabbitai help to get the list of available commands and usage tips.

@dpage

dpage commented May 27, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@codacy-production

codacy-production Bot commented May 27, 2026

Copy link
Copy Markdown

Up to standards ✅

🟢 Issues 4 medium

Results:
4 new issues

Category Results
Complexity 4 medium

View in Codacy

🟢 Metrics 113 complexity · 3 duplication

Metric Results
Complexity 113
Duplication 3

View in Codacy

NEW Get contextual insights on your PRs based on Codacy's metrics, along with PR and Jira context, without leaving GitHub. Enable AI reviewer
TIP This summary will be updated as you push new changes.

@dpage

dpage commented May 27, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@dpage dpage force-pushed the feat/llm-lib-pr3-proxy-adoption branch from 0dc2948 to 05b84bf Compare May 28, 2026 08:18
@dpage dpage force-pushed the feat/llm-lib-pr4-streaming-gemini branch from a0feece to cf3e0fd Compare May 28, 2026 08:20
@dpage

dpage commented May 28, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
internal/tools/embed_client.go (1)

40-43: ⚡ Quick win

Normalize provider input before dispatch/client creation.

At Line 42 and Line 85, provider handling is strict (cfg.Provider exact match). Values like "OpenAI" or " openai " will fail even though they’re logically valid config inputs. Normalize once and use the normalized value in switch and llm.NewClient.

♻️ Suggested patch
 import (
 	"fmt"
+	"strings"
@@
 func newEmbedClient(cfg embedClientConfig) (llm.Client, string, error) {
 	var opts llm.Options
-	switch cfg.Provider {
+	provider := strings.ToLower(strings.TrimSpace(cfg.Provider))
+	switch provider {
@@
 	default:
 		return nil, "", fmt.Errorf(
 			"unsupported embedding provider: %s (supported: voyage, openai, ollama)",
-			cfg.Provider,
+			provider,
 		)
 	}
 
-	client, err := llm.NewClient(cfg.Provider, opts)
+	client, err := llm.NewClient(provider, opts)
 	if err != nil {
-		return nil, "", fmt.Errorf("create %s embedding client: %w", cfg.Provider, err)
+		return nil, "", fmt.Errorf("create %s embedding client: %w", provider, err)
 	}

Also applies to: 85-87

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/tools/embed_client.go` around lines 40 - 43, The provider string in
newEmbedClient is not normalized before switching and creating the client, so
inputs like "OpenAI" or " openai " are rejected; fix by trimming whitespace and
lowercasing cfg.Provider at the start of newEmbedClient (e.g., set a local
normalized variable from cfg.Provider via strings.TrimSpace and strings.ToLower)
and use that normalized value in the switch statement and when calling
llm.NewClient so dispatch and client creation accept case/whitespace-insensitive
provider names.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/changelog.md`:
- Around line 68-75: The Unreleased changelog currently contains conflicting
bullets about web chat streaming; update the entries so they are consistent by
either (a) stating that the web chat now consumes the streaming endpoint
`/api/llm/v1/chat/stream` while still retaining the non-streaming endpoint for
compatibility, and referencing the new helper `web/src/utils/sseChat.js` that
assembles SSE chunks, or (b) revert the new bullet if streaming is not yet
enabled; ensure both bullets convey the same status
(streaming-enabled-with-fallback or not enabled) and remove or rewrite the
earlier non-streaming sentence so the Unreleased section is unambiguous.

In `@internal/config/config.go`:
- Around line 458-459: mergeConfig() currently omits copying Gemini API fields
from src.LLM so yaml-provided gemini_api_key / gemini_api_key_file are ignored;
update mergeConfig() to explicitly copy src.LLM.GeminiAPIKey and
src.LLM.GeminiAPIKeyFile into the runtime config (the same way other LLM fields
are merged), e.g., when merging LLM config fields ensure GeminiAPIKey and
GeminiAPIKeyFile on the destination config are set from src.LLM if present, and
add analogous handling in the same merge block that handles the other LLM fields
referenced in mergeConfig().

In `@web/src/components/ChatInterface.jsx`:
- Around line 1302-1331: handlePromptExecute lacks abort handling so streaming
sseChat calls can't be cancelled; create an AbortController in
handlePromptExecute (after setLoading(true)), save its signal and pass signal to
the sseChat invocation (the same call that supplies
messages/tools/provider/model and onTextChunk), and wire cancellation so user
cancel triggers controller.abort(); also update the catch in handlePromptExecute
to treat AbortError specially (ignore or set appropriate state) and ensure any
outer try/catch that awaits handlePromptExecute recognizes aborts—refer to
symbols handlePromptExecute, AbortController, sseChat, onTextChunk, and
setLoading to locate where to add the controller, pass signal, and handle
AbortError.

---

Nitpick comments:
In `@internal/tools/embed_client.go`:
- Around line 40-43: The provider string in newEmbedClient is not normalized
before switching and creating the client, so inputs like "OpenAI" or " openai "
are rejected; fix by trimming whitespace and lowercasing cfg.Provider at the
start of newEmbedClient (e.g., set a local normalized variable from cfg.Provider
via strings.TrimSpace and strings.ToLower) and use that normalized value in the
switch statement and when calling llm.NewClient so dispatch and client creation
accept case/whitespace-insensitive provider names.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 45165b18-a6a5-472e-8e9e-75b22fe5937e

📥 Commits

Reviewing files that changed from the base of the PR and between 05b84bf and cf3e0fd.

📒 Files selected for processing (15)
  • cmd/pgedge-pg-mcp-svr/main.go
  • docs/changelog.md
  • internal/config/config.go
  • internal/embedding/libprovider.go
  • internal/embedding/libprovider_test.go
  • internal/embedding/provider.go
  • internal/embedding/provider_test.go
  • internal/tools/embed_client.go
  • internal/tools/generate_embedding.go
  • internal/tools/search_knowledgebase.go
  • internal/tools/similarity_search.go
  • web/src/components/ChatInterface.jsx
  • web/src/components/Message.jsx
  • web/src/utils/sseChat.js
  • web/src/utils/sseChat.test.js
💤 Files with no reviewable changes (4)
  • internal/embedding/provider.go
  • internal/embedding/provider_test.go
  • internal/embedding/libprovider_test.go
  • internal/embedding/libprovider.go

Comment thread docs/changelog.md
Comment thread internal/config/config.go
Comment thread web/src/components/ChatInterface.jsx
dpage and others added 9 commits May 28, 2026 09:33
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds GeminiAPIKey / GeminiAPIKeyFile fields to LLMConfig with the
same priority resolution as the other providers (env > file >
config). The server now registers a "gemini" entry in the proxy's
provider map when the key is set.

The pgedge-go-llm-lib registers the gemini provider via the
existing _ "llm/all" / _ "llm/provider/gemini" imports.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The three embedding-consuming tools (search_knowledgebase,
generate_embedding, similarity_search) no longer go through the
internal/embedding wrapper layer. A shared internal/tools/embed_client.go
helper builds an llm.Client from the per-provider config fields,
applying the same defaults the wrapper used to apply.

newEmbedClient returns (llm.Client, string, error) where the string is
the resolved model name (after defaults such as voyage-3-lite and
nomic-embed-text are applied). generate_embedding.go uses this resolved
model for metadata display instead of provider.ModelName(); it uses
len(vec) for dimensions (populated after the Embed call) and
cfg.Embedding.Provider for the provider name.

The embedding package now has no consumers outside itself; it is
ready to be deleted in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After PR 4's tool refactor (previous commit) nothing in the
codebase imports internal/embedding. The package is removed
entirely — its responsibilities are split between
pgedge-go-llm-lib (provider wire code) and
internal/tools/embed_client.go (config translation + defaults).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an sseChat helper that POSTs to the library proxy's streaming
endpoint, parses Server-Sent Events, and assembles the final response
into the same shape the non-streaming /v1/chat endpoint returns.
Text chunks update the assistant message bubble incrementally as
they arrive.

The agentic loop (chat -> tool_use -> tool_result -> chat ...) keeps
working unchanged: sseChat returns once event: done arrives, the
caller inspects stop_reason and dispatches the next iteration.

Message.jsx now renders streamed partial text while keeping a
thinking indicator visible, so the user sees progress as chunks
arrive instead of just a spinner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…(PR 4)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
staticcheck (ST1005) rejects error strings that begin with a capital
letter. The two missing-API-key messages copied across from the old
embedding factory both started with the provider name. Switch to a
"missing <key> for embedding provider '<name>'" phrasing that starts
with a lowercase verb.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
handlePromptExecute previously had no abort handling, so the
streaming sseChat call could not be cancelled by the user. Mirror
the AbortController pattern already used in handleSend: create a
controller, pass its signal to sseChat, treat AbortError as a
benign cancellation in the catch block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- mergeConfig() now copies the Gemini API key and key-file fields
  alongside the other LLM provider fields. Without this, a
  gemini_api_key in the YAML config would be silently dropped at
  load time.
- newEmbedClient() normalises cfg.Provider via TrimSpace+ToLower
  before the switch and llm.NewClient call. Inputs like "OpenAI"
  or " openai " are now accepted; the error message and
  llm.NewClient receive the normalised value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dpage dpage force-pushed the feat/llm-lib-pr4-streaming-gemini branch from cf3e0fd to 3f6070f Compare May 28, 2026 08:37
@dpage

dpage commented May 28, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@dpage dpage marked this pull request as ready for review May 28, 2026 08:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant