Skip to content

Conversation

Deepam02
Copy link
Contributor

@Deepam02 Deepam02 commented Oct 4, 2025

it is a simple solution
we added mcp.tool.error field to traces and tool_error status to metrics when MCP tools fail.

Closes #2084

Copy link

codecov bot commented Oct 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 48.37%. Comparing base (19a9f7b) to head (4fcc067).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2092      +/-   ##
==========================================
+ Coverage   48.31%   48.37%   +0.05%     
==========================================
  Files         238      238              
  Lines       30034    30067      +33     
==========================================
+ Hits        14512    14545      +33     
  Misses      14423    14423              
  Partials     1099     1099              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@JAORMX JAORMX requested a review from Copilot October 4, 2025 17:43
@JAORMX
Copy link
Collaborator

JAORMX commented Oct 4, 2025

@claude please review this

Copy link
Contributor

claude bot commented Oct 4, 2025

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds tool error detection to telemetry middleware to track MCP (Model Context Protocol) tool execution errors through observability signals. When MCP tools fail, the system now captures this information in both traces and metrics.

  • Adds mcp.tool.error attribute to trace spans when tool execution errors are detected
  • Introduces tool_error status category in metrics for failed tool calls
  • Implements lightweight error detection by scanning response payloads for "isError":true patterns

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
pkg/telemetry/middleware.go Core implementation of tool error detection logic and telemetry integration
pkg/telemetry/middleware_test.go Unit tests for error detection function and response writer behavior
pkg/telemetry/integration_test.go Integration test verifying end-to-end tool error detection in telemetry pipeline

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@Deepam02 Deepam02 requested a review from Copilot October 4, 2025 19:42
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.


Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +448 to +451
// Buffer response data for tool calls to enable proper error detection
if rw.isToolCall && !rw.hasToolError {
rw.responseBuffer = append(rw.responseBuffer, data...)
}
Copy link
Preview

Copilot AI Oct 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unbounded memory usage: response buffer grows without limits for tool calls. Consider adding a maximum buffer size to prevent potential memory exhaustion on large responses.

Copilot uses AI. Check for mistakes.

Comment on lines +459 to +463
if rw.isToolCall && !rw.hasToolError && len(rw.responseBuffer) > 0 {
rw.hasToolError = detectMCPToolError(rw.responseBuffer)
// Clear buffer to free memory
rw.responseBuffer = nil
}
Copy link
Preview

Copilot AI Oct 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inefficient error detection: detectMCPToolError is called on the complete buffered response, which requires parsing the entire JSON even for large responses. Consider streaming JSON parsing or early detection when isError field is encountered.

Copilot uses AI. Check for mistakes.

@ChrisJBurns
Copy link
Collaborator

@claude try again and review this

Copy link
Contributor

claude bot commented Oct 4, 2025

Claude encountered an error —— View job

Failed with exit code 128

I'll analyze this and get back to you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

FR: Capture tool execution errors in OTel traces and metrics
3 participants