upstash / context7

Context7 Platform -- Up-to-date code documentation for LLMs and AI code editors


Fix/mcp stream tool responses (#2526)

* fix(mcp): stream tool-call responses so headers flush before 60s

The remote MCP server's StreamableHTTPServerTransport runs in JSON mode
(`enableJsonResponse: true`), which buffers the entire response and writes
status + headers + body together at the end of the tool call. Long-running
tools — notably `query-docs` with `researchMode: true` — routinely take
60–250s, during which no bytes are written to the wire.

MCP HTTP clients cap how long the underlying `fetch()` waits for response
headers (Claude Code: 60s, hardcoded in @modelcontextprotocol/sdk consumers),
independent of the higher-level per-tool timeout. Production curl with
`-w '%{time_starttransfer}'` shows `start_transfer ≈ total ≈ 125s` for a
research call: the connection sits silent for the full duration before
flushing in one burst, well past any reasonable client `fetch` timeout.

Switch to SSE responses for POST tool calls. The SDK then returns the
HTTP response synchronously after parsing the request, headers flush
within milliseconds, and the body streams while the tool runs. Total wall
time is unchanged, but clients see headers immediately and don't time out.
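
A minimal sketch of the ordering this relies on, assuming a hypothetical
`streamToolCall` handler and a `ResponseLike` stand-in for Node's response
object (neither is part of the SDK): headers are written before the tool is
awaited, and the result is framed as one SSE event afterwards.

```typescript
// Minimal response shape; a stand-in for Node's http.ServerResponse.
interface ResponseLike {
  writeHead(status: number, headers: Record<string, string>): void;
  write(chunk: string): void;
  end(): void;
}

// Frame one JSON-RPC message as a single SSE event.
function sseFrame(message: unknown): string {
  return `event: message\ndata: ${JSON.stringify(message)}\n\n`;
}

// Hypothetical handler: headers go out first, the result is streamed later.
async function streamToolCall(
  res: ResponseLike,
  runTool: () => Promise<unknown>,
): Promise<void> {
  // Written synchronously, before the tool has produced anything.
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
  });
  try {
    const result = await runTool(); // may take 60-250s in research mode
    // The id is hard-coded for illustration; a real server echoes the request id.
    res.write(sseFrame({ jsonrpc: "2.0", id: 1, result }));
  } finally {
    res.end(); // always close the stream, even if the tool throws
  }
}
```

In JSON mode the equivalent handler would await `runTool()` first and only
then write status, headers, and body, which is what kept the wire silent.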

The existing NGINX-timeout comment above (about rejecting GETs) concerns
the standalone GET SSE channel for server-initiated notifications and
still applies, so GETs remain rejected. Per-request POST SSE responses are
bounded by the tool call and work fine on the existing ingress
(`proxy-buffering: off`, `proxy-read-timeout: 3600`).

Streamable HTTP requires clients to accept both `application/json` and
`text/event-stream` (SDK enforces 406 otherwise), so this is transparent
to compliant clients including Claude Code.
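
The Accept-header rule can be illustrated with a small sketch. The helper
name `statusForAccept` is hypothetical, and this is not the SDK's actual
check; it ignores wildcards such as `*/*` and only models the
"both media types must be listed" requirement.

```typescript
// Hypothetical helper: pick the status for a POST based on its Accept header.
function statusForAccept(accept: string | undefined): 200 | 406 {
  const types = (accept ?? "")
    .split(",")
    // Drop parameters such as ";q=0.9" and normalise case and whitespace.
    .map((part) => (part.split(";")[0] ?? "").trim().toLowerCase());
  const acceptsBoth =
    types.includes("application/json") && types.includes("text/event-stream");
  return acceptsBoth ? 200 : 406;
}
```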

* remove comment

* chore: add changeset

* fix(mcp): emit progress notifications during researchMode query-docs

The SSE-streaming change in the previous commit only addresses clients
whose timeout fires when response *headers* don't arrive (e.g. Claude
Code's `wrapFetchWithTimeout`). Clients using the MCP SDK's default
`Protocol.request()` timer (`DEFAULT_REQUEST_TIMEOUT_MSEC = 60000`) hit
a wall-clock timeout that incoming bytes alone do not reset.

Emit `notifications/progress` every 20s while the upstream fetch is in
flight, gated on `researchMode: true` and the client supplying a
`progressToken` in `_meta`. Clients that pass an `onprogress` handler
have the SDK include `progressToken` automatically; on each notification
the SDK resets their JSON-RPC request timer (when they also opted into
`resetTimeoutOnProgress: true`), keeping 60–250s research runs alive.

Clients that don't include a `progressToken` see no notifications and no
behavior change. Fast `query-docs` calls are unaffected — the interval
only arms when `researchMode` is true.
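
A rough sketch of the gating described above, under stated assumptions:
`startHeartbeat`, `buildProgress`, and the `send` callback are hypothetical
names, not the code in `packages/mcp/src/index.ts`. Only the notification
shape (`notifications/progress` with `progressToken` and `progress`) and the
20s interval come from the description.

```typescript
type ProgressToken = string | number;

interface ProgressNotification {
  jsonrpc: "2.0";
  method: "notifications/progress";
  params: { progressToken: ProgressToken; progress: number };
}

function buildProgress(token: ProgressToken, progress: number): ProgressNotification {
  return {
    jsonrpc: "2.0",
    method: "notifications/progress",
    params: { progressToken: token, progress },
  };
}

interface HeartbeatOptions {
  researchMode: boolean;
  progressToken?: ProgressToken; // taken from the request's _meta, if present
  send: (notification: ProgressNotification) => void;
  intervalMs?: number;
}

// Arms the interval only for researchMode calls that supplied a progressToken.
function startHeartbeat(opts: HeartbeatOptions): { armed: boolean; stop: () => void } {
  const { researchMode, progressToken, send, intervalMs = 20_000 } = opts;
  if (!researchMode || progressToken === undefined) {
    return { armed: false, stop: () => {} }; // no notifications, no behavior change
  }
  let ticks = 0;
  const timer = setInterval(() => {
    ticks += 1;
    send(buildProgress(progressToken, ticks));
  }, intervalMs);
  return { armed: true, stop: () => clearInterval(timer) };
}
```

A caller would start the heartbeat before the upstream fetch and call
`stop()` in a `finally` block so the interval never outlives the request.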

* Merge branch 'master' of https://github.com/upstash/context7 into fix/mcp-stream-tool-responses

# Conflicts:
#	packages/mcp/src/index.ts

* fix(mcp): restore researchMode and progress notifications on query-docs

Re-add the researchMode parameter to the query-docs input schema and
re-emit periodic notifications/progress while the upstream call is in
flight. Clients that opt into resetTimeoutOnProgress (e.g. opencode)
reset their per-request timer on each notification, which keeps the
long-running researchMode call alive past the SDK's default 60s
wall-clock timeout.
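
To make the timer behavior concrete, here is a simplified model of a
per-request deadline. `RequestDeadline` is a hypothetical class written for
illustration, not the SDK's implementation, and timestamps are passed in
explicitly so the logic can be followed without real timers.

```typescript
// Simplified model of a per-request timeout that may be reset by progress.
class RequestDeadline {
  private readonly timeoutMs: number;
  private readonly resetOnProgress: boolean;
  private expiresAt: number;

  constructor(timeoutMs: number, resetOnProgress: boolean, startedAt: number) {
    this.timeoutMs = timeoutMs;
    this.resetOnProgress = resetOnProgress;
    this.expiresAt = startedAt + timeoutMs;
  }

  // Called whenever a progress notification for this request arrives.
  onProgress(now: number): void {
    if (this.resetOnProgress) {
      this.expiresAt = now + this.timeoutMs;
    }
  }

  isExpired(now: number): boolean {
    return now >= this.expiresAt;
  }
}

// Without the opt-in, progress at 20s changes nothing: expiry stays at 60s.
const fixedDeadline = new RequestDeadline(60_000, false, 0);
fixedDeadline.onProgress(20_000);

// With the opt-in, a notification every 20s keeps pushing expiry forward.
const resettingDeadline = new RequestDeadline(60_000, true, 0);
for (let t = 20_000; t <= 120_000; t += 20_000) {
  resettingDeadline.onProgress(t);
}
```

In this model a 125s research call survives only in the second case, since
the last reset at 120s moves the expiry to 180s.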

* fix(mcp): create a fresh McpServer per HTTP request

The HTTP transport is stateless (sessionIdGenerator: undefined), but the
handler shared one global McpServer across requests. McpServer extends
Protocol, which has a single _transport field. Each server.connect()
overwrites it, and any transport.close on any request fires the shared
Protocol's onclose, leaving _transport undefined for everyone else.

That meant a long-running researchMode call lost its transport every
time an unrelated short request (tool list refresh, init confirmation,
etc.) closed in the background, surfacing as "Not connected" on every
subsequent sendNotification and ultimately a -32001 timeout on the
client.

Switch to the per-request pattern from the SDK's
simpleStatelessStreamableHttp example: a createMcpServer factory builds
a fresh server, registers tools, and is closed alongside the transport
on res.on('close'). stdio mode keeps a single server, since stdio has
exactly one transport for the process lifetime.
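
The failure mode and the fix can be reproduced with a toy model. `MiniServer`
and `MiniTransport` are hypothetical stand-ins that keep only the
single-transport field and the onclose behavior described above; they are
not the SDK classes.

```typescript
class MiniTransport {
  onclose: (() => void) | undefined = undefined;
  close(): void {
    this.onclose?.();
  }
}

class MiniServer {
  // One slot only, like the single _transport field described above.
  private transport: MiniTransport | undefined = undefined;

  connect(transport: MiniTransport): void {
    this.transport = transport; // overwrites any earlier connection
    transport.onclose = () => {
      this.transport = undefined;
    };
  }

  notify(): void {
    if (!this.transport) throw new Error("Not connected");
  }
}

// Shared server: a short request closing breaks the long-running one.
function sharedServerBreaks(): boolean {
  const shared = new MiniServer();
  const longCall = new MiniTransport();
  const shortCall = new MiniTransport();
  shared.connect(longCall);
  shared.connect(shortCall);
  shortCall.close();
  try {
    shared.notify(); // the long call tries to send progress
    return false;
  } catch {
    return true; // "Not connected"
  }
}

// Per-request servers: closing one leaves the other untouched.
function perRequestSurvives(): boolean {
  const createServer = (): MiniServer => new MiniServer();
  const longServer = createServer();
  longServer.connect(new MiniTransport());
  const shortServer = createServer();
  const shortCall = new MiniTransport();
  shortServer.connect(shortCall);
  shortCall.close();
  try {
    longServer.notify();
    return true;
  } catch {
    return false;
  }
}
```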

* Merge remote-tracking branch 'origin/master' into fix/mcp-stream-tool-responses

# Conflicts:
#	packages/mcp/src/index.ts

* chore(mcp): minimize PR diff against master

Drop the now-redundant changeset that duplicated the SSE-streaming note
already released in 2.2.3, restore comments inadvertently dropped during
the createMcpServer refactor, and rename the changeset file to reflect
what this PR actually changes.

The net diff against master now consists of wrapping the existing setup
in a createMcpServer factory, calling it per HTTP request, and closing the
server alongside the transport (12 logical lines added, ignoring
indentation introduced by Prettier).

* fix(mcp): move client info capture to MCP server initialization for stdio mode
Fahreddin Özcan committed
d0e4a4834e76a91ecfa0c05332a7f7de01daa822
Parent: 8782725
Committed by GitHub <noreply@github.com> on 4/30/2026, 9:02:56 AM