Add tier-aware MCP tool batching #79

Merged
tony merged 10 commits into main from improved-batching
Jun 14, 2026

tony commented Jun 14, 2026

Member

Summary

  • Add tier-aware batch wrappers for readonly, mutating, and destructive MCP tool calls.
  • Preserve nested FastMCP validation, middleware, safety checks, and structured per-operation results.
  • Document the new batch tool family and wire it into the FastMCP docs collector.
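
A minimal sketch of how a tier ceiling with per-operation results could work. It is not the code from this PR: the `TIER_RANK` table, the `run_batch` helper, the registry shape, and the `"continue"` value for `on_error` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Literal, Optional

# Hypothetical tier ordering; the real project may model tiers differently.
TIER_RANK = {"readonly": 0, "mutating": 1, "destructive": 2}


@dataclass
class OperationResult:
    """Outcome of one nested call inside a batch."""

    tool: str
    ok: bool
    value: Any = None
    error: Optional[str] = None


def run_batch(
    operations: list[tuple[str, dict[str, Any]]],
    registry: dict[str, tuple[str, Callable[..., Any]]],
    max_tier: str,
    on_error: Literal["stop", "continue"] = "stop",
) -> list[OperationResult]:
    """Run operations in order, rejecting tools above the batch's tier ceiling."""
    results: list[OperationResult] = []
    for name, args in operations:
        try:
            tier, func = registry[name]
            if TIER_RANK[tier] > TIER_RANK[max_tier]:
                raise PermissionError(f"{name} is {tier}; batch ceiling is {max_tier}")
            results.append(OperationResult(name, True, func(**args)))
        except Exception as exc:
            # Record the failure as a row instead of aborting the whole response.
            results.append(OperationResult(name, False, error=str(exc)))
            if on_error == "stop":
                break
    return results
```

With `max_tier="readonly"`, a destructive tool in the list yields a failed row rather than running, and `on_error` decides whether later operations are still attempted.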

Verification

  • uv run ruff format .
  • uv run pytest (602 passed)
  • uv run ruff check .
  • uv run mypy $(fd -e py -t f .)
  • just build-docs
  • uv run fastmcp inspect fastmcp.json

tony added 2 commits June 14, 2026 06:04
why: Enable ordered bulk calls while preserving each nested tool's schemas, middleware, and safety checks.
what:
- Add readonly, mutating, and destructive batch wrappers with per-operation results
- Preserve nested FastMCP content, structured_content, and meta
- Extend audit redaction for nested batch arguments
- Cover tier enforcement, continuation, recursion rejection, and audit redaction
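
As a rough illustration of redacting nested batch arguments before they reach an audit log, the sketch below walks the argument tree recursively. The `SENSITIVE_KEYS` set and the `redact` helper are hypothetical; the PR does not show which keys the real middleware redacts or how.

```python
from typing import Any

# Hypothetical set of argument names treated as sensitive.
SENSITIVE_KEYS = frozenset({"keys", "text"})


def redact(value: Any, sensitive: frozenset = SENSITIVE_KEYS) -> Any:
    """Return a copy of value with sensitive dict entries masked at any depth."""
    if isinstance(value, dict):
        return {
            key: "<redacted>" if key in sensitive else redact(item, sensitive)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, sensitive) for item in value]
    return value
```

Because the walk descends into lists and dicts, arguments nested under each batch operation are masked the same way as top-level arguments.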

why: Keep the published tool catalog and Sphinx FastMCP collector aligned with the new batch tools.
what:
- Add batch tool reference pages and overview navigation
- Register batch tools and models in docs configuration and API reference
- Update README, architecture, and safety summaries

codecov-commenter commented Jun 14, 2026


Codecov Report

❌ Patch coverage is 80.12821% with 31 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.73%. Comparing base (3463d8e) to head (98d2e82).

Files with missing lines                 Patch %   Lines
src/libtmux_mcp/tools/batch_tools.py     76.03%    19 Missing and 10 partials ⚠️
src/libtmux_mcp/middleware.py            84.61%    1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main      #79      +/-   ##
==========================================
- Coverage   84.91%   84.73%   -0.18%     
==========================================
  Files          42       43       +1     
  Lines        3042     3197     +155     
  Branches      412      438      +26     
==========================================
+ Hits         2583     2709     +126     
- Misses        340      359      +19     
- Partials      119      129      +10     

why: The unreleased notes did not yet announce the readonly,
mutating, and destructive batch wrappers, which are user-facing.
what:
- Add a What's new entry for the tier-aware tool batch family
- Describe the per-tier safety ceiling and per-operation results

tony commented Jun 14, 2026

Member Author

Code review

Found 1 issue:

  1. The three batch tools are added to SOCKET_NAME_EXEMPT, but the agent-facing socket_name prose in _BASE_INSTRUCTIONS was not updated in lockstep. server.py is untouched by this PR and still names only list_servers as the exception ("Targeted tmux tools accept socket_name ...; list_servers discovers sockets ..."), so the published instructions now disagree with the four-member exempt set — an agent could try to pass socket_name to call_mutating_tools_batch and hit a schema error. The exempt-set comment makes this a paired requirement (server_tools.py "append it here AND update the prose in _BASE_INSTRUCTIONS so the two stay in lockstep"). It slips through CI because test_registered_tools_accept_socket_name only reads the set to skip the param check — nothing asserts the prose actually mentions the exempt tools.

Lockstep contract:

#: agent-facing contract advertised in
#: :data:`libtmux_mcp.server._BASE_INSTRUCTIONS`. When you add a new
#: discovery-style tool, append it here AND update the prose in
#: ``_BASE_INSTRUCTIONS`` so the two stay in lockstep.
SOCKET_NAME_EXEMPT: frozenset[str] = frozenset(
    {
        "call_destructive_tools_batch",
        "call_mutating_tools_batch",
        "call_readonly_tools_batch",
        "list_servers",
    }
)

Stale prose (unchanged by this PR):

"tmux hierarchy: Server > Session > Window > Pane. "
"Prefer pane_id (e.g. '%1') for targeting. "
"Targeted tmux tools accept socket_name (defaults to LIBTMUX_SOCKET); "
"list_servers discovers sockets via TMUX_TMPDIR plus extra_socket_paths."
)

🤖 Generated with Claude Code


tony added 3 commits June 14, 2026 06:47
why: Batches of output-heavy read tools could multiply the normal
response backstop by embedding each nested result payload in one
outer batch envelope.
what:
- Add batch-level truncation metadata and payload elision
- Limit serialized batch envelopes before returning them
- Cover oversized readonly batches with a regression test
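
The capping idea can be sketched as follows: serialize the batch envelope, and while it exceeds the limit, elide the largest nested payloads and record that truncation happened. The `cap_batch` helper and the `payload`, `payload_elided`, `truncated`, and `elided` field names are assumptions for illustration, not the names used in this PR.

```python
import json
from typing import Any


def cap_batch(results: list[dict[str, Any]], max_bytes: int) -> dict[str, Any]:
    """Elide the largest payloads until the serialized envelope fits max_bytes."""
    rows = [dict(row) for row in results]  # shallow copies; input stays untouched
    envelope: dict[str, Any] = {"results": rows, "truncated": False, "elided": 0}

    def size() -> int:
        return len(json.dumps(envelope).encode("utf-8"))

    # Visit rows from largest payload to smallest.
    order = sorted(
        range(len(rows)),
        key=lambda i: len(json.dumps(rows[i].get("payload"))),
        reverse=True,
    )
    for i in order:
        if size() <= max_bytes:
            break
        rows[i]["payload"] = None
        rows[i]["payload_elided"] = True
        envelope["truncated"] = True
        envelope["elided"] += 1
    return envelope
```

Note that this approach alone cannot bound a batch whose row metadata is already too large once every payload is elided.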

why: MCP clients only see the outer batch tool annotations when
building approval UI, so wrapper hints must disclose the strongest
nested behavior each wrapper can invoke.
what:
- Mark side-effecting batch wrappers destructive and open-world
- Add registration tests for mutating and destructive batch hints
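
To make the "strongest nested behavior" rule concrete, here is a small sketch that folds the hints of every tool a wrapper may call into one set of wrapper hints. The hint names follow the MCP tool annotation fields (`readOnlyHint`, `destructiveHint`, `openWorldHint`) and their documented defaults; the `strongest_hints` helper and the plain-dict representation are assumptions, not how this PR registers its annotations.

```python
def strongest_hints(nested: list[dict[str, bool]]) -> dict[str, bool]:
    """Combine nested tool hints so the wrapper discloses the worst case."""
    return {
        # Read-only only if every nested tool is read-only (default: False).
        "readOnlyHint": all(h.get("readOnlyHint", False) for h in nested),
        # Destructive if any nested tool may be destructive (default: True).
        "destructiveHint": any(h.get("destructiveHint", True) for h in nested),
        # Open-world if any nested tool touches external state (default: True).
        "openWorldHint": any(h.get("openWorldHint", True) for h in nested),
    }
```

A client that only sees the wrapper can then base its approval prompt on the combined hints rather than on the least risky nested tool.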

why: The mutating batch docs passed window_name to split_window, whose
schema rejects that argument.
what:
- Show rename_window and split_window targeting the same known window_id
- Avoid implying batches feed a created window id into later operations

tony commented Jun 14, 2026

Member Author

Code review

Re-reviewed after the three new commits (response capping, annotation changes, docs example).

Found 1 issue:

  1. The three public batch tool functions use single-paragraph docstrings with no NumPy Parameters/Returns sections, unlike every other registered @mcp.tool in the package (e.g. create_window, snapshot_pane, send_keys_batch), which document operations/on_error/ctx and the return type. (AGENTS.md says "Follow NumPy docstring style for all functions and methods")

async def call_readonly_tools_batch(
    operations: list[ToolCallOperation],
    on_error: _OnError = "stop",
    ctx: Context | None = None,
) -> ToolCallBatchResult:
    """Call readonly MCP tools serially and return per-tool results.

    Use when several read-only observations should be made in one agent
    turn. Each nested call still goes through FastMCP validation,
    middleware, and safety checks. Mutating and destructive tools are
    rejected even if the server process itself is running at a higher
    safety tier.
    """
    return await _call_tools_batch(

Same for call_mutating_tools_batch (L297-L308) and call_destructive_tools_batch (L317-L328).
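
For reference, a NumPy-style version of that docstring might look like the sketch below. The parameter descriptions are inferred from the signature and summary quoted above, the type annotations are simplified stand-ins so the snippet runs on its own, and the body is a placeholder rather than the real implementation.

```python
from __future__ import annotations

from typing import Any


async def call_readonly_tools_batch(
    operations: list[dict[str, Any]],  # stand-in for list[ToolCallOperation]
    on_error: str = "stop",  # stand-in for _OnError
    ctx: Any | None = None,  # stand-in for Context | None
) -> dict[str, Any]:  # stand-in for ToolCallBatchResult
    """Call readonly MCP tools serially and return per-tool results.

    Use when several read-only observations should be made in one agent
    turn. Mutating and destructive tools are rejected even if the server
    process itself is running at a higher safety tier.

    Parameters
    ----------
    operations : list of ToolCallOperation
        Nested tool calls to run, in order.
    on_error : str, default "stop"
        How the batch reacts when a nested call fails.
    ctx : Context, optional
        FastMCP context for the outer call.

    Returns
    -------
    ToolCallBatchResult
        One result entry per attempted operation.
    """
    raise NotImplementedError("docstring sketch only")
```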

Still open from the earlier review, not addressed by the new commits: the three batch tools are in SOCKET_NAME_EXEMPT, but _BASE_INSTRUCTIONS still names only list_servers as the socket_name exception (server.py untouched by this branch), against the lockstep contract on that frozenset.

"tmux hierarchy: Server > Session > Window > Pane. "
"Prefer pane_id (e.g. '%1') for targeting. "
"Targeted tmux tools accept socket_name (defaults to LIBTMUX_SOCKET); "
"list_servers discovers sockets via TMUX_TMPDIR plus extra_socket_paths."
)

🤖 Generated with Claude Code


tony added 4 commits June 14, 2026 08:04
why: FastMCP serializes typed tool returns as both text content and
structuredContent, so measuring only the batch model could still leave
large batched responses above the server response cap.
what:
- Measure batch truncation against the FastMCP response envelope
- Extend the oversized batch regression to cover content plus structuredContent
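
The effect described here can be shown with a short sketch: when a result is carried both as text content and as `structuredContent`, the serialized response is more than twice the size of the model alone. The `envelope_size` helper is hypothetical, and the envelope shape is a simplified approximation of an MCP tool result, not FastMCP's exact output.

```python
import json
from typing import Any


def envelope_size(model: dict[str, Any]) -> int:
    """Approximate serialized size of a result sent as text plus structured data."""
    text = json.dumps(model)
    envelope = {
        # The model appears once as a JSON string inside a text content block...
        "content": [{"type": "text", "text": text}],
        # ...and once more as structured data.
        "structuredContent": model,
    }
    return len(json.dumps(envelope).encode("utf-8"))
```

Measuring only `json.dumps(model)` would therefore undercount the response by at least half.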

why: A batch with enough small row results can exceed the response cap even
after nested payload truncation, because row metadata alone still serializes
into the FastMCP response envelope.
what:
- Reject generic tool batches above a fixed operation-count cap
- Add a public FastMCP regression for row-only oversized batch responses
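
A fixed operation-count cap can be as simple as the sketch below. The `MAX_OPERATIONS` value and the `validate_operation_count` helper are hypothetical; the PR text does not state the actual limit or the error type raised.

```python
from typing import Any, Sequence

# Hypothetical limit; the real cap is not given in this conversation.
MAX_OPERATIONS = 32


def validate_operation_count(
    operations: Sequence[Any], limit: int = MAX_OPERATIONS
) -> int:
    """Reject oversized batches before any nested tool runs."""
    count = len(operations)
    if count > limit:
        raise ValueError(f"batch has {count} operations; limit is {limit}")
    return count
```

Rejecting up front keeps the row metadata itself bounded, which payload truncation alone cannot guarantee.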

why: The unreleased batch entry predated response capping, which keeps
large aggregate results within the server response limit and reports
the truncation to callers.
what:
- Note bounded batch responses in the tier-aware tool batching entry

why: The unreleased batch entry documented the silent response bound
but not the hard rejection callers hit when a batch carries too many
operations.
what:
- Note that oversized batch operation lists are rejected
tony merged commit 5d76aa5 into main on Jun 14, 2026
9 checks passed