Skip to content

Improvements to the MCP tools service and enhances LLM API parameter handling#291

Merged
locnguyen1986 merged 2 commits intoreleasefrom
main
Dec 9, 2025
Merged

Improvements to the MCP tools service and enhances LLM API parameter handling#291
locnguyen1986 merged 2 commits intoreleasefrom
main

Conversation

@locnguyen1986
Copy link
Collaborator

  • Adds retry, circuit breaker, and validation infrastructure to handle external API failures gracefully
  • Extends LLM completion request handling to support provider-specific parameters (TopK, RepetitionPenalty)
  • Removes/disables MCP features (DomainAllowList parameter, file search tools)

Adds retry, circuit breaker, and validation infrastructure to handle external API failures gracefully
Extends LLM completion request handling to support provider-specific parameters (TopK, RepetitionPenalty)
Removes/disables MCP features (DomainAllowList parameter, file search tools)
Copilot AI review requested due to automatic review settings December 9, 2025 11:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR enhances the reliability of external API calls in the MCP tools service and extends LLM API parameter handling to support provider-specific options. The changes add production-ready resilience patterns (retry, circuit breaker, validation) to handle transient failures gracefully while enabling richer LLM inference configuration through TopK and RepetitionPenalty parameters.

Key Changes:

  • Implements exponential backoff retry, circuit breaker, and response validation infrastructure for Serper and SearXNG search APIs
  • Extends LLM completion requests to support TopK and RepetitionPenalty parameters with model catalog defaults
  • Removes DomainAllowList parameter and disables file search tools (file_search_index, file_search_query) from MCP service

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
services/mcp-tools/internal/interfaces/httpserver/routes/mcp/serper_mcp.go Removes DomainAllowList field and comments out file search tools (index/query); removes unused fmt import
services/mcp-tools/internal/infrastructure/search/validation.go Adds validation logic for search/fetch responses with enrichment for empty results; includes helpful error context
services/mcp-tools/internal/infrastructure/search/retry.go Implements exponential backoff retry with configurable retryable errors and jitter to prevent thundering herd
services/mcp-tools/internal/infrastructure/search/circuit_breaker.go Implements circuit breaker pattern with closed/open/half-open states for external API fault tolerance
services/mcp-tools/internal/infrastructure/search/client.go Integrates retry and circuit breaker into search client for Serper and SearXNG calls; adds response validation
services/llm-api/internal/utils/httpclients/chat/chat_completion_client.go Extends request type with TopK/RepetitionPenalty; adds detailed logging for request parameters
services/llm-api/internal/interfaces/httpserver/requests/chat/chat.go Adds TopK and RepetitionPenalty fields to incoming chat completion request
services/llm-api/internal/interfaces/httpserver/handlers/modelhandler/provider_handler.go Exposes GetModelCatalogByID method to retrieve catalog details for default parameter application
services/llm-api/internal/interfaces/httpserver/handlers/chathandler/chat_handler.go Applies model catalog defaults to LLM requests; converts decimal catalog values to appropriate types
services/llm-api/internal/domain/model/provider_model_service.go Adds FindCatalogByID service method to support retrieving catalog for defaults
services/llm-api/internal/domain/model/model_catalog_service.go Adds top_k and repetition_penalty to supported parameters list for model catalog

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Float32("repetition_penalty", repetitionPenalty).
Msg("[ChatCompletion][Stream] Sending request to inference server")

log.Info().Interface("request", request).Msg("[ChatCompletion][Stream] Request body")
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging the entire request object at Info level (line 557) could expose sensitive data like API keys, user messages, or PII in logs. The request may contain conversation history with personal information. Consider using Debug level or sanitizing/redacting sensitive fields before logging.

Suggested change
log.Info().Interface("request", request).Msg("[ChatCompletion][Stream] Request body")
log.Debug().Interface("request", request).Msg("[ChatCompletion][Stream] Request body")

Copilot uses AI. Check for mistakes.
req.MaxTokens = val
}
}
if req.TopK == nil || (req.TopK != nil && *req.TopK == 0) {
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition req.TopK == nil || (req.TopK != nil && *req.TopK == 0) is redundant. If req.TopK == nil is true, the second part of the OR will never be evaluated. Simplify to req.TopK == nil || *req.TopK == 0.

Suggested change
if req.TopK == nil || (req.TopK != nil && *req.TopK == 0) {
if req.TopK == nil || *req.TopK == 0 {

Copilot uses AI. Check for mistakes.
resp.SearchParameters = map[string]any{}
}
resp.SearchParameters["empty_result_reason"] = reason
resp.SearchParameters["has_results"] = len(resp.Organic) > 0
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is inconsistent: has_results is set based on len(resp.Organic) > 0 after potentially populating resp.Organic with a fallback entry. This means has_results will always be true even when no actual results were found (only the enriched fallback exists). Consider checking if results were enriched or setting this flag before adding the fallback entry.

Copilot uses AI. Check for mistakes.
}

// Add 10% jitter to prevent thundering herd
jitter := backoff * 0.1 * (2.0*float64(time.Now().UnixNano()%100)/100.0 - 1.0)
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The jitter calculation may produce biased results. Using time.Now().UnixNano() % 100 for randomness doesn't provide uniform distribution across retries and can exhibit patterns. Consider using a proper random number generator (e.g., math/rand or crypto/rand) instead.

Copilot uses AI. Check for mistakes.
req.TopK = &val
}
}
if req.RepetitionPenalty == nil || (req.RepetitionPenalty != nil && *req.RepetitionPenalty == 0) {
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition req.RepetitionPenalty == nil || (req.RepetitionPenalty != nil && *req.RepetitionPenalty == 0) is redundant. If req.RepetitionPenalty == nil is true, the second part of the OR will never be evaluated. Simplify to req.RepetitionPenalty == nil || *req.RepetitionPenalty == 0.

Suggested change
if req.RepetitionPenalty == nil || (req.RepetitionPenalty != nil && *req.RepetitionPenalty == 0) {
if req.RepetitionPenalty == nil || *req.RepetitionPenalty == 0 {

Copilot uses AI. Check for mistakes.
if val == nil {
return 0, false
}
f, _ := val.Float64()
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error returned from Float64() conversion is being silently discarded with _. If the decimal-to-float conversion fails, this function will return 0 without indication. Consider checking and handling the conversion error, or at least document why it's safe to ignore.

Copilot uses AI. Check for mistakes.
Comment on lines +101 to +102
if cb.halfOpenCalls < cb.cfg.MaxHalfOpenCalls {
cb.halfOpenCalls++
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The halfOpenCalls counter is incremented in allowRequest() but only decremented in recordResult() on state transitions. If a request is allowed in half-open state but the result is never recorded (e.g., panic, context cancellation), the counter will not be decremented, potentially leaving the circuit breaker stuck. Consider using a defer pattern or separate increment/decrement tracking to ensure the counter is properly managed.

Copilot uses AI. Check for mistakes.
resp.Organic = []map[string]any{
{
"title": fmt.Sprintf("No results found for: %s", query),
"link": fmt.Sprintf("https://google.com/search?q=%s", strings.ReplaceAll(query, " ", "+")),
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The query parameter in the Google search fallback URL is not properly URL-encoded. Using strings.ReplaceAll(query, " ", "+") only handles spaces but doesn't escape special characters like &, =, ?, etc. Use url.QueryEscape() from the net/url package for proper URL encoding.

Copilot uses AI. Check for mistakes.
resp.Metadata = map[string]any{}
}
resp.Metadata["empty_result_reason"] = reason
resp.Metadata["has_content"] = strings.TrimSpace(resp.Text) != ""
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is inconsistent: has_content is set based on strings.TrimSpace(resp.Text) != "" after potentially setting resp.Text to a failure message. This means has_content will always be true even when no actual content was fetched (only the enriched error message exists). Consider checking if content was enriched or setting this flag before adding the error message.

Copilot uses AI. Check for mistakes.
Float32("repetition_penalty", repetitionPenalty).
Msg("[ChatCompletion] Sending request to inference server")

log.Info().Interface("request", request).Msg("[ChatCompletion] Request body")
Copy link

Copilot AI Dec 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging the entire request object at Info level (line 169) could expose sensitive data like API keys, user messages, or PII in logs. The request may contain conversation history with personal information. Consider using Debug level or sanitizing/redacting sensitive fields before logging.

Suggested change
log.Info().Interface("request", request).Msg("[ChatCompletion] Request body")
log.Debug().Interface("request", request).Msg("[ChatCompletion] Request body")

Copilot uses AI. Check for mistakes.
@locnguyen1986 locnguyen1986 merged commit 85cd2b7 into release Dec 9, 2025
13 of 14 checks passed
@github-project-automation github-project-automation bot moved this to QA in Jan Dec 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: QA

Development

Successfully merging this pull request may close these issues.

3 participants