StreamableHTTP client: _handle_reconnection resets attempt counter to 0, causing infinite retry loop #2393
Bug Description
_handle_reconnection() in streamable_http.py resets the attempt counter to 0 on line 494 when a reconnection succeeds but the stream ends without delivering a complete response. This makes MAX_RECONNECTION_ATTEMPTS ineffective — the counter only applies to consecutive exceptions, not total reconnection attempts. If the server accepts the connection but the stream drops repeatedly, the client retries forever.
Reproduction
MCP version: 1.26.0 (also confirmed unpatched in 1.27.0 main branch)
SSCCE: minimal reproducer
Server (server.py): A server that accepts SSE connections but closes them before sending a complete response. It first sends a priming event with an ID, so the client has a last event ID to resume from and takes the reconnection path.
"""Minimal MCP server that drops SSE streams to trigger infinite reconnect."""
import asyncio
from starlette.applications import Starlette
from starlette.routing import Route
from starlette.responses import Response
from sse_starlette.sse import EventSourceResponse
import uvicorn, json, uuid

session_id = str(uuid.uuid4())


async def handle_mcp(request):
    if request.method == "POST":
        body = await request.json()
        method = body.get("method")
        if method == "initialize":
            resp = {
                "jsonrpc": "2.0",
                "id": body["id"],
                "result": {
                    "protocolVersion": "2025-06-18",
                    "capabilities": {"tools": {}},
                    "serverInfo": {"name": "drop-server", "version": "0.1.0"},
                },
            }
            return Response(
                json.dumps(resp),
                media_type="application/json",
                headers={"mcp-session-id": session_id},
            )
        if method == "notifications/initialized":
            return Response(status_code=202, headers={"mcp-session-id": session_id})
        if method == "tools/call":
            # Return SSE that sends a priming event with an ID, then drops
            async def event_generator():
                yield {"event": "message", "id": "evt-1", "data": ""}
                # Close without sending the actual response
                return
            return EventSourceResponse(
                event_generator(),
                headers={"mcp-session-id": session_id},
            )
        return Response(status_code=404)
    if request.method == "GET":
        # GET stream for server-initiated messages; also drop immediately
        async def get_stream():
            yield {"event": "message", "id": "evt-get-1", "data": ""}
            return
        return EventSourceResponse(
            get_stream(),
            headers={"mcp-session-id": session_id},
        )
    if request.method == "DELETE":
        return Response(status_code=200)


app = Starlette(routes=[Route("/mcp", handle_mcp, methods=["POST", "GET", "DELETE"])])

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=9999)

Client (client.py): Connects and calls a tool, demonstrating the infinite loop.
"""Client that demonstrates infinite reconnection loop."""
import asyncio, logging
from mcp import ClientSession
from mcp.client.streamable_http import streamable_http_client

logging.basicConfig(level=logging.INFO)


async def main():
    async with streamable_http_client("http://localhost:9999/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            print("Initialized. Calling tool (will hang forever)...")
            # This call will trigger _handle_reconnection with attempt=0 reset
            try:
                result = await asyncio.wait_for(
                    session.call_tool("any_tool", {}),
                    timeout=30,
                )
            except asyncio.TimeoutError:
                print("CONFIRMED: call_tool hung for 30s (infinite reconnect loop)")


asyncio.run(main())

Steps
- Install dependencies: pip install mcp[cli] sse-starlette uvicorn
- Run server: python server.py
- Run client: python client.py
- Observe: client logs show repeated "GET stream disconnected, reconnecting in 1000ms..." messages. The call_tool call never returns. After 30s the wait_for timeout fires, confirming the hang.
Root Cause
In streamable_http.py, _handle_reconnection (line 437):
async def _handle_reconnection(self, ctx, last_event_id, retry_interval_ms, attempt=0):
    if attempt >= MAX_RECONNECTION_ATTEMPTS:  # Only 2
        return
    # ... reconnects, iterates SSE ...
    # Line 494: Stream ended without response; resets attempt to 0!
    await self._handle_reconnection(ctx, reconnect_last_event_id, reconnect_retry_ms, 0)

When the reconnection succeeds (HTTP 200) but the stream ends without a complete JSONRPCResponse, line 494 recurses with attempt=0, restarting the counter. Only the exception path (line 498) increments the counter. A server that accepts connections but drops streams therefore causes unbounded recursion at 1-second intervals.
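To make the effect of the reset concrete, here is a toy model of the recursion. It is not the SDK code: count_reconnects, next_attempt and SAFETY_BUDGET are hypothetical names, and the model assumes every reconnect succeeds at the HTTP level and then the stream drops. The only thing that varies is the attempt value passed to the recursive call.

```python
MAX_RECONNECTION_ATTEMPTS = 2  # the cap quoted in the issue
SAFETY_BUDGET = 25  # hypothetical guard so this demo terminates


def count_reconnects(next_attempt, attempt=0, made=0):
    """Toy model: each reconnect succeeds, then the stream ends with no response.

    `next_attempt` maps the current attempt value to the one passed to the
    recursive call, which is the only difference between the two policies.
    """
    if attempt >= MAX_RECONNECTION_ATTEMPTS or made >= SAFETY_BUDGET:
        return made
    # One reconnection happens here; the stream then drops, so we recurse.
    return count_reconnects(next_attempt, next_attempt(attempt), made + 1)


reset_policy = count_reconnects(lambda attempt: 0)                # reset to 0
increment_policy = count_reconnects(lambda attempt: attempt + 1)  # attempt + 1

print(f"reset to 0:  {reset_policy} reconnects (stopped only by the demo guard)")
print(f"attempt + 1: {increment_policy} reconnects")
```

With the reset, the cap check never fires and only the demo guard stops the recursion; with the increment, the model stops after MAX_RECONNECTION_ATTEMPTS reconnects.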
Expected Behavior
After MAX_RECONNECTION_ATTEMPTS total reconnection attempts (regardless of whether they succeeded at the HTTP level), the client should give up and propagate an error to the caller.
Suggested Fix
Track total attempts across the recursion rather than resetting on successful connect:
# Line 494: increment instead of reset
await self._handle_reconnection(ctx, reconnect_last_event_id, reconnect_retry_ms, attempt + 1)

Or add a separate max_total_reconnection_attempts counter that is never reset.
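The second option, a total counter that is never reset, might look roughly like the sketch below. It is written as a standalone loop rather than a patch to streamable_http.py. resume_stream, open_stream, ReconnectionExhausted and MAX_TOTAL_RECONNECTION_ATTEMPTS are all hypothetical names, and open_stream stands in for one reconnection that returns a response, returns None when the stream ends early, or raises on a connection error.

```python
import asyncio

MAX_TOTAL_RECONNECTION_ATTEMPTS = 2  # hypothetical cap that is never reset


class ReconnectionExhausted(Exception):
    """Hypothetical error raised once the total reconnection budget is spent."""


async def resume_stream(open_stream, retry_interval_s=0.0):
    """Retry `open_stream()` until it yields a response or the budget runs out."""
    last_error = None
    for _ in range(MAX_TOTAL_RECONNECTION_ATTEMPTS):
        await asyncio.sleep(retry_interval_s)
        try:
            response = await open_stream()
        except ConnectionError as exc:
            # Connection-level failure: counts against the same budget.
            last_error = exc
            continue
        if response is not None:
            return response
        # Connected, but the stream ended early: also counts against the budget.
    raise ReconnectionExhausted(
        f"gave up after {MAX_TOTAL_RECONNECTION_ATTEMPTS} reconnection attempts"
    ) from last_error


async def demo():
    calls = 0

    async def dropping_stream():
        # Stand-in for a server that accepts the connection and then drops it.
        nonlocal calls
        calls += 1
        return None

    try:
        await resume_stream(dropping_stream)
    except ReconnectionExhausted:
        return calls


attempts_before_error = asyncio.run(demo())
print(f"caller saw an error after {attempts_before_error} attempts")
```

Because both failure modes draw from one budget, the caller gets an exception instead of a hang, which matches the expected behavior described above.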
Impact
In production, this causes MCP client coroutines to hang forever when a server experiences transient stream drops. The calling application has no way to recover without wrapping every MCP call in asyncio.wait_for(). We discovered this when an agentquant research job hung for 5+ hours in a reconnection loop after an MCP server's SSE stream dropped.
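Until the counter is fixed, the asyncio.wait_for() workaround can be centralized in one small helper, as in this sketch. call_with_deadline is a hypothetical name, and the sleeping coroutine in the demo only stands in for a session.call_tool(...) call that never returns.

```python
import asyncio


async def call_with_deadline(awaitable, timeout_s):
    """Await `awaitable`, raising TimeoutError if it takes longer than `timeout_s`."""
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout_s)
    except asyncio.TimeoutError as exc:
        raise TimeoutError(f"MCP call exceeded {timeout_s}s") from exc


async def demo():
    async def hung_call():
        # Stand-in for a call stuck in the reconnection loop.
        await asyncio.sleep(3600)

    try:
        await call_with_deadline(hung_call(), timeout_s=0.05)
    except TimeoutError:
        return "timed out"
    return "completed"


outcome = asyncio.run(demo())
print(outcome)
```

This only bounds how long the caller waits; it does not stop the client from reconnecting while the call is pending.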