-
-
Notifications
You must be signed in to change notification settings - Fork 51
Fix TestRunner duplicate executions, schema validation, and reduce token usage #295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Resolves two critical issues in the TestRunner_Run tool: 1. Duplicate test executions: Added global callback registration guard to prevent Unity's TestRunnerApi from accumulating duplicate callbacks across multiple tool invocations. Tests now execute exactly once per invocation. 2. MCP schema validation errors: Implemented explicit OutputSchema override in RunTool to suppress schema generation for methods returning ResponseCallTool directly, as it's the protocol wrapper itself, not user data. Additional improvements: - Switched to structured JSON response format using new TestRunResponse class - Enhanced logging for callback registration debugging Files modified: - TestRunner.cs: Added _callbacksRegistered flag and trace logging - TestResultCollector.cs: Switched from text to structured response format - TestRunResponse.cs: New structured response data model - RunTool.OutputSchema.cs: New partial class for schema suppression
…ialization Timeout Fix: - Changed from blocking MainThread.Instance.Run() to non-blocking EditorApplication.update - Processing response now returns immediately without waiting for test execution - Tests start automatically even when Unity is in background (no focus required) - Skip validation for unfiltered runs to avoid unnecessary delays Token Usage Optimization: - Added includePassingTests parameter (default: false) to only include failing test details - Dramatically reduces response size for successful test runs - Summary still shows all test counts regardless of this setting Log Fixes: - Fixed TestLogEntry serialization by converting fields to properties - Fixed duplicate log collection by removing double subscription to log events - Now subscribes only to Application.logMessageReceivedThreaded (captures all threads) Files modified: - TestRunner.Run.cs: Async execution fix, includePassingTests parameter, validation optimization - TestLogEntry.cs: Fields converted to properties for proper JSON serialization - TestResultCollector.cs: includePassingTests filtering, single log subscription
…n if no explicit test filter has been specified. We want to return a nice error if e.g. no tests exists for the default test mode (EditMode)
…the change I made there was necessary in the end)
…er-token-output Fix TestRunner duplicate executions, schema validation, and reduce token usage
| namespace com.IvanMurzak.Unity.MCP.Common | ||
| { | ||
| public partial class RunTool | ||
| { | ||
| private JsonNode? _cachedOutputSchema; | ||
| private bool _outputSchemaComputed = false; | ||
|
|
||
| /// <summary> | ||
| /// Implements OutputSchema to return null for methods that return ResponseCallTool. | ||
| /// ResponseCallTool is the MCP protocol wrapper itself, not user data, so it should not have an output schema. | ||
| /// </summary> | ||
| JsonNode? IRunTool.OutputSchema | ||
| { | ||
| get | ||
| { | ||
| if (_outputSchemaComputed) | ||
| return _cachedOutputSchema; | ||
|
|
||
| _outputSchemaComputed = true; | ||
|
|
||
| if (Method == null) | ||
| { | ||
| _cachedOutputSchema = null; | ||
| return null; | ||
| } | ||
|
|
||
| // Get the actual return type, unwrapping Task<T> if necessary | ||
| var returnType = Method.ReturnType; | ||
| if (returnType.IsGenericType && returnType.GetGenericTypeDefinition() == typeof(Task<>)) | ||
| { | ||
| returnType = returnType.GetGenericArguments()[0]; | ||
| } | ||
|
|
||
| // If the return type is ResponseCallTool, don't generate an output schema | ||
| if (returnType == typeof(ResponseCallTool) || returnType.IsSubclassOf(typeof(ResponseCallTool))) | ||
| { | ||
| _cachedOutputSchema = null; | ||
| return null; | ||
| } | ||
|
|
||
| // For all other types, delegate to MethodWrapper (access via reflection since it's in external DLL) | ||
| var methodWrapperType = typeof(RunTool).BaseType; | ||
| if (methodWrapperType != null) | ||
| { | ||
| var outputSchemaProperty = methodWrapperType.GetProperty("OutputSchema"); | ||
| if (outputSchemaProperty != null) | ||
| { | ||
| _cachedOutputSchema = outputSchemaProperty.GetValue(this) as JsonNode; | ||
| return _cachedOutputSchema; | ||
| } | ||
| } | ||
|
|
||
| _cachedOutputSchema = null; | ||
| return null; | ||
| } | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is already implemented in the ReflectorNet in the MethodWrapper class. RunTool is extended from MethodWrapper and it contains that code on board.
Please test without it. it looks redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! I tested without the RunTool.OutputSchema.cs file, but it is necessary to prevent MCP validation errors.
Test Results:
Without the override, MCP Inspector shows:
✗ Validation Error: data should have required property 'result'
Root Cause:
- Most MCP tools return data types that get automatically wrapped:
TestRunResponse → { result: TestRunResponse } - TestRunner is unique - it returns ResponseCallTool directly (the MCP protocol wrapper itself)
- Without the override, the base MethodWrapper schema generator creates:
{ "required": ["result"], "properties": { "result": { "$ref": "ResponseCallTool" } } } - But the actual response IS
ResponseCallToolat the top level (not nested in a result field)
Why the override works:
The OutputSchema override returns null for methods returning ResponseCallTool, telling the MCP schema system "don't generate/validate a schema for this - it's the protocol wrapper itself, not user data."
This suppression is necessary because ResponseCallTool is protocol infrastructure, not the actual tool output that should be validated. I'm not familiar with the ReflectorNet code, so maybe I'm misunderstanding - but if I remove the RunTool.OutputSchema.cs class, and I test the TestRunner_Run tool via MCP Inspector, then a response JSON schema that requires a top-level result field gets generated and response validation then fails. See the screenshot below (the validation error is at the bottom, out of view in the screenshot, but you can see the schema that gets generated and that the MCP tool's response doesn't match it):
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for detailed reply. You are right about ResponseCallTool is the wrapper type which should not be exposed to MCP at least not in this method.
I need to think a bit more about it. I will get back to you later with an update.
| static readonly object _lock = new(); | ||
| static volatile TestRunnerApi? _testRunnerApi = null!; | ||
| static volatile TestResultCollector? _resultCollector = null!; | ||
| static volatile bool _callbacksRegistered = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bool flag looks redundant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested removing the _callbacksRegistered flag, but unfortunately it is necessary (as far as I can tell).
Test Results:
Without the flag, I get duplicate test results that accumulate with each test run (as shown in the screenshot from MCP Inspector above):
- Run 1: 9 tests → 9 results ✓
- Run 2: 9 tests → 18 results ✗
- Run 3: 9 tests → 27 results ✗
- Run 8: 9 tests → 72 results ✗
In that MCP tool invocation test, I ran the tests from the ConnectionManagerTests class in the Unity-MCP-plugin 8 times back to back. There are 9 tests in that class, so after those 8 runs, the output shows that 72 tests had been run (only 9 tests actually run every time, but the test collector gets registered with Unity's TestRunnerApi class for callbacks an additional time every time the MCP tool executes.
Root Cause:
Unity's TestRunnerApi maintains a global static callback list. Each time CreateInstance() is called (which happens on each test run), RegisterCallbacks() adds another listener to this global list. This causes TestFinished() to fire multiple times for each test, creating duplicate entries.
Why the flag works:
The _callbacksRegistered flag ensures callbacks are registered only once per domain lifetime (until domain reload), preventing callback accumulation across multiple test runs within the same domain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please try to reproduce this test result duplication in the main branch. If you can do that, please describe steps to reproduce.
My main concern about the flag is because if the instance of TestRunnerApi got created already, and we have a second instance just created, we don't know which is a correct one any more. This is the problem. First of all it should not be more then one instance. As only this is true, we don't care about the flag anymore, and we shouldn't because locking the subscription by the flag that won't resolve the core problem.
But first lets double check that the problem exists in the main branch, because I wasn't able to reproduce a week ago with a bit older project version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just switched to main and tried using the TestRunner_Run tool again (using the MCP inspector).
I just entered the name of a class to have the tests for that class executed (ConnectionManagerTests) as the one change to the default parameter values, and then ran the tool. I then see both the validation error and immediately, on the first run, I see 18 tests having been run (the class contains only 9 tests). Running the tool again gives me back a result with 27 passing tests.
Issues Fixed
The test result collector was reporting an extra occurrence for every executed test (2x, 3x, 4x...) after every consecutive test run (leading to a lot of extra output tokens used for large test suites after regularly running tests in a session). Unity's TestRunnerApi maintains a global callback list, and RegisterCallbacks() was called on each CreateInstance(), adding duplicate callbacks.
Fixed by adding _callbacksRegistered flag to ensure callbacks register only once globally.
MCP inspector reported validation errors because the schema generator treated ResponseCallTool (protocol wrapper) as user data.
Fixed by implementing RunTool.OutputSchema.cs to suppress schema generation for methods returning ResponseCallTool.
Each log in test result responses appeared twice due to subscribing to both Application.logMessageReceived and Application.logMessageReceivedThreaded with the same handler.
Fixed by keeping only logMessageReceivedThreaded subscription (which receives log callbacks from all threads).
Switched from plain text to structured JSON using TestRunResponse class. Log entries then only showed LogLevel field because TestLogEntry used fields instead of properties. System.Text.Json requires properties. Fixed by converting fields to auto-properties.
Enhancements
Token Usage Optimization
Added includePassingTests parameter (default: false). When false, only failing test details are included in the response. Summary statistics always show complete counts.
Files Modified
TestRunner.cs - Callback registration guard
TestRunner.Run.cs - includePassingTests parameter
TestResultCollector.cs - Response filtering, log subscription
TestRunResponse.cs - New response model
TestLogEntry.cs - Properties conversion
RunTool.OutputSchema.cs - Schema suppression