Fix TestRunner duplicate executions, schema validation, and reduce token usage #295

ebbe-brandstrup · 2025-10-27T17:45:14Z

Issues Fixed

Duplicate Test Executions

The test result collector was reporting an extra occurrence for every executed test (2x, 3x, 4x...) after every consecutive test run (leading to a lot of extra output tokens used for large test suites after regularly running tests in a session). Unity's TestRunnerApi maintains a global callback list, and RegisterCallbacks() was called on each CreateInstance(), adding duplicate callbacks.

Fixed by adding _callbacksRegistered flag to ensure callbacks register only once globally.

MCP Output Schema Validation

MCP inspector reported validation errors because the schema generator treated ResponseCallTool (protocol wrapper) as user data.

Fixed by implementing RunTool.OutputSchema.cs to suppress schema generation for methods returning ResponseCallTool.

Duplicate Log Entries

Each log in test result responses appeared twice due to subscribing to both Application.logMessageReceived and Application.logMessageReceivedThreaded with the same handler.

Fixed by keeping only logMessageReceivedThreaded subscription (which receives log callbacks from all threads).

Structured Response Format For Test Results

Switched from plain text to structured JSON using TestRunResponse class. Log entries then only showed LogLevel field because TestLogEntry used fields instead of properties. System.Text.Json requires properties. Fixed by converting fields to auto-properties.

Enhancements

Token Usage Optimization

Added includePassingTests parameter (default: false). When false, only failing test details are included in the response. Summary statistics always show complete counts.

Files Modified

TestRunner.cs - Callback registration guard
TestRunner.Run.cs - includePassingTests parameter
TestResultCollector.cs - Response filtering, log subscription
TestRunResponse.cs - New response model
TestLogEntry.cs - Properties conversion
RunTool.OutputSchema.cs - Schema suppression

Resolves two critical issues in the TestRunner_Run tool: 1. Duplicate test executions: Added global callback registration guard to prevent Unity's TestRunnerApi from accumulating duplicate callbacks across multiple tool invocations. Tests now execute exactly once per invocation. 2. MCP schema validation errors: Implemented explicit OutputSchema override in RunTool to suppress schema generation for methods returning ResponseCallTool directly, as it's the protocol wrapper itself, not user data. Additional improvements: - Switched to structured JSON response format using new TestRunResponse class - Enhanced logging for callback registration debugging Files modified: - TestRunner.cs: Added _callbacksRegistered flag and trace logging - TestResultCollector.cs: Switched from text to structured response format - TestRunResponse.cs: New structured response data model - RunTool.OutputSchema.cs: New partial class for schema suppression

…ialization Timeout Fix: - Changed from blocking MainThread.Instance.Run() to non-blocking EditorApplication.update - Processing response now returns immediately without waiting for test execution - Tests start automatically even when Unity is in background (no focus required) - Skip validation for unfiltered runs to avoid unnecessary delays Token Usage Optimization: - Added includePassingTests parameter (default: false) to only include failing test details - Dramatically reduces response size for successful test runs - Summary still shows all test counts regardless of this setting Log Fixes: - Fixed TestLogEntry serialization by converting fields to properties - Fixed duplicate log collection by removing double subscription to log events - Now subscribes only to Application.logMessageReceivedThreaded (captures all threads) Files modified: - TestRunner.Run.cs: Async execution fix, includePassingTests parameter, validation optimization - TestLogEntry.cs: Fields converted to properties for proper JSON serialization - TestResultCollector.cs: includePassingTests filtering, single log subscription

…n if no explicit test filter has been specified. We want to return a nice error if e.g. no tests exists for the default test mode (EditMode)

…the change I made there was necessary in the end)

…output

…er-token-output Fix TestRunner duplicate executions, schema validation, and reduce token usage

Unity-MCP-Plugin/Assets/root/Editor/Scripts/API/Tool/TestRunner.Run.cs

IvanMurzak · 2025-10-28T03:25:00Z

...Plugin/Assets/root/Unity-MCP-Common/Unity-MCP-Common/src/Runner/Tool/RunTool.OutputSchema.cs

+namespace com.IvanMurzak.Unity.MCP.Common
+{
+    public partial class RunTool
+    {
+        private JsonNode? _cachedOutputSchema;
+        private bool _outputSchemaComputed = false;
+
+        /// <summary>
+        /// Implements OutputSchema to return null for methods that return ResponseCallTool.
+        /// ResponseCallTool is the MCP protocol wrapper itself, not user data, so it should not have an output schema.
+        /// </summary>
+        JsonNode? IRunTool.OutputSchema
+        {
+            get
+            {
+                if (_outputSchemaComputed)
+                    return _cachedOutputSchema;
+
+                _outputSchemaComputed = true;
+
+                if (Method == null)
+                {
+                    _cachedOutputSchema = null;
+                    return null;
+                }
+
+                // Get the actual return type, unwrapping Task<T> if necessary
+                var returnType = Method.ReturnType;
+                if (returnType.IsGenericType && returnType.GetGenericTypeDefinition() == typeof(Task<>))
+                {
+                    returnType = returnType.GetGenericArguments()[0];
+                }
+
+                // If the return type is ResponseCallTool, don't generate an output schema
+                if (returnType == typeof(ResponseCallTool) || returnType.IsSubclassOf(typeof(ResponseCallTool)))
+                {
+                    _cachedOutputSchema = null;
+                    return null;
+                }
+
+                // For all other types, delegate to MethodWrapper (access via reflection since it's in external DLL)
+                var methodWrapperType = typeof(RunTool).BaseType;
+                if (methodWrapperType != null)
+                {
+                    var outputSchemaProperty = methodWrapperType.GetProperty("OutputSchema");
+                    if (outputSchemaProperty != null)
+                    {
+                        _cachedOutputSchema = outputSchemaProperty.GetValue(this) as JsonNode;
+                        return _cachedOutputSchema;
+                    }
+                }
+
+                _cachedOutputSchema = null;
+                return null;
+            }
+        }
+    }
+}


It is already implemented in the ReflectorNet in the MethodWrapper class. RunTool is extended from MethodWrapper and it contains that code on board.

Please test without it. it looks redundant.

Thanks for the suggestion! I tested without the RunTool.OutputSchema.cs file, but it is necessary to prevent MCP validation errors.

Test Results:
Without the override, MCP Inspector shows:
✗ Validation Error: data should have required property 'result'

Root Cause:

Most MCP tools return data types that get automatically wrapped: TestRunResponse → { result: TestRunResponse }

TestRunner is unique - it returns ResponseCallTool directly (the MCP protocol wrapper itself)

Without the override, the base MethodWrapper schema generator creates: { "required": ["result"], "properties": { "result": { "$ref": "ResponseCallTool" } } }

But the actual response IS ResponseCallTool at the top level (not nested in a result field)

Why the override works:
The OutputSchema override returns null for methods returning ResponseCallTool, telling the MCP schema system "don't generate/validate a schema for this - it's the protocol wrapper itself, not user data."

This suppression is necessary because ResponseCallTool is protocol infrastructure, not the actual tool output that should be validated. I'm not familiar with the ReflectorNet code, so maybe I'm misunderstanding - but if I remove the RunTool.OutputSchema.cs class, and I test the TestRunner_Run tool via MCP Inspector, then a response JSON schema that requires a top-level result field gets generated and response validation then fails. See the screenshot below (the validation error is at the bottom, out of view in the screenshot, but you can see the schema that gets generated and that the MCP tool's response doesn't match it):

Thanks for detailed reply. You are right about ResponseCallTool is the wrapper type which should not be exposed to MCP at least not in this method.

I need to think a bit more about it. I will get back to you later with an update.

IvanMurzak · 2025-10-28T03:26:26Z

Unity-MCP-Plugin/Assets/root/Editor/Scripts/API/Tool/TestRunner.cs

        static readonly object _lock = new();
        static volatile TestRunnerApi? _testRunnerApi = null!;
        static volatile TestResultCollector? _resultCollector = null!;
+        static volatile bool _callbacksRegistered = false;


This bool flag looks redundant.

I tested removing the _callbacksRegistered flag, but unfortunately it is necessary (as far as I can tell).

Test Results:

Without the flag, I get duplicate test results that accumulate with each test run (as shown in the screenshot from MCP Inspector above):

Run 1: 9 tests → 9 results ✓

Run 2: 9 tests → 18 results ✗

Run 3: 9 tests → 27 results ✗

Run 8: 9 tests → 72 results ✗

In that MCP tool invocation test, I ran the tests from the ConnectionManagerTests class in the Unity-MCP-plugin 8 times back to back. There are 9 tests in that class, so after those 8 runs, the output shows that 72 tests had been run (only 9 tests actually run every time, but the test collector gets registered with Unity's TestRunnerApi class for callbacks an additional time every time the MCP tool executes.

Root Cause:

Unity's TestRunnerApi maintains a global static callback list. Each time CreateInstance() is called (which happens on each test run), RegisterCallbacks() adds another listener to this global list. This causes TestFinished() to fire multiple times for each test, creating duplicate entries.

Why the flag works:

The _callbacksRegistered flag ensures callbacks are registered only once per domain lifetime (until domain reload), preventing callback accumulation across multiple test runs within the same domain.

Please try to reproduce this test result duplication in the main branch. If you can do that, please describe steps to reproduce.

My main concern about the flag is because if the instance of TestRunnerApi got created already, and we have a second instance just created, we don't know which is a correct one any more. This is the problem. First of all it should not be more then one instance. As only this is true, we don't care about the flag anymore, and we shouldn't because locking the subscription by the flag that won't resolve the core problem.

But first lets double check that the problem exists in the main branch, because I wasn't able to reproduce a week ago with a bit older project version.

I just switched to main and tried using the TestRunner_Run tool again (using the MCP inspector).

I just entered the name of a class to have the tests for that class executed (ConnectionManagerTests) as the one change to the default parameter values, and then ran the tool. I then see both the validation error and immediately, on the first run, I see 18 tests having been run (the class contains only 9 tests). Running the tool again gives me back a result with 27 passing tests.

Unity-MCP-Plugin/Assets/root/Editor/Scripts/API/Tool/TestRunner/TestResultCollector.cs

…oose string)

ebbe-brandstrup added 7 commits October 26, 2025 12:38

I think we should always run the TestRunner_Run validation logic, eve…

c29064b

…n if no explicit test filter has been specified. We want to return a nice error if e.g. no tests exists for the default test mode (EditMode)

Back to the original way of deferring the TestRunnerApi (don't think …

64d02c5

…the change I made there was necessary in the end)

Trigger CI

9b88eb8

Merge branch 'IvanMurzak:main' into feature/reduce-test-runner-token-…

0fea4d7

…output

Merge pull request #294 from ebbe-brandstrup/feature/reduce-test-runn…

d23bf35

…er-token-output Fix TestRunner duplicate executions, schema validation, and reduce token usage

ebbe-brandstrup requested a review from IvanMurzak October 27, 2025 20:16

IvanMurzak reviewed Oct 28, 2025

View reviewed changes

Unity-MCP-Plugin/Assets/root/Editor/Scripts/API/Tool/TestRunner.Run.cs Show resolved Hide resolved

IvanMurzak requested changes Oct 28, 2025

View reviewed changes

IvanMurzak reviewed Oct 29, 2025

View reviewed changes

Unity-MCP-Plugin/Assets/root/Editor/Scripts/API/Tool/TestRunner/TestResultCollector.cs Outdated Show resolved Hide resolved

IvanMurzak assigned ebbe-brandstrup Oct 29, 2025

IvanMurzak added the bug Something isn't working label Oct 29, 2025

ebbe-brandstrup added 2 commits October 29, 2025 18:32

Usin an enum for the status on TestResultData objects (instead of a l…

a290d95

…oose string)

Merge branch 'main' into feature/reduce-test-runner-token-output

c1847d7

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Fix TestRunner duplicate executions, schema validation, and reduce token usage #295

Fix TestRunner duplicate executions, schema validation, and reduce token usage #295

Uh oh!

ebbe-brandstrup commented Oct 27, 2025

Uh oh!

Uh oh!

IvanMurzak Oct 28, 2025

Uh oh!

ebbe-brandstrup Oct 28, 2025 •

edited

Loading

Uh oh!

IvanMurzak Oct 29, 2025

Uh oh!

IvanMurzak Oct 28, 2025

Uh oh!

ebbe-brandstrup Oct 28, 2025

Uh oh!

IvanMurzak Oct 29, 2025

Uh oh!

ebbe-brandstrup Oct 29, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Fix TestRunner duplicate executions, schema validation, and reduce token usage #295

Are you sure you want to change the base?

Fix TestRunner duplicate executions, schema validation, and reduce token usage #295

Uh oh!

Conversation

ebbe-brandstrup commented Oct 27, 2025

Issues Fixed

Enhancements

Files Modified

Uh oh!

Uh oh!

IvanMurzak Oct 28, 2025

Choose a reason for hiding this comment

Uh oh!

ebbe-brandstrup Oct 28, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

IvanMurzak Oct 29, 2025

Choose a reason for hiding this comment

Uh oh!

IvanMurzak Oct 28, 2025

Choose a reason for hiding this comment

Uh oh!

ebbe-brandstrup Oct 28, 2025

Choose a reason for hiding this comment

Test Results:

Root Cause:

Why the flag works:

Uh oh!

IvanMurzak Oct 29, 2025

Choose a reason for hiding this comment

Uh oh!

ebbe-brandstrup Oct 29, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

ebbe-brandstrup Oct 28, 2025 •

edited

Loading