fix(proxy): increase ResponseHeaderTimeout from 30s to 5m #23
LLM inference (especially with extended thinking models) can take well over 30s before the first response byte arrives. The 30s timeout caused read timeouts on slow-to-start completions through Bedrock and other LLM providers. 5 minutes covers extended thinking while still catching genuinely dead connections. ResponseHeaderTimeout only bounds the wait for the response headers: once they arrive, it no longer applies and streaming can continue indefinitely, matching the behavior of LLM proxies like Portkey.
Claude finished @andybons's task in 1m 15s
PR Review: fix(proxy): increase ResponseHeaderTimeout from 30s to 5m
Overview
Correct, well-reasoned fix for a real problem. The change bumps […]

Correctness
All four transport paths are covered and consistent: […]
No transport path is left behind. Good.
MCP / SSE is safe. The […]
The per-request transport in […]

Design consideration (non-blocking)
The timeout is hardcoded in four separate places. Since this is a library module meant to be imported (per CLAUDE.md), downstream callers might need different values for different service profiles. A single exported constant or a […]

    // proxy/proxy.go
    const defaultResponseHeaderTimeout = 5 * time.Minute

Then referenced as […]

No issues found
Verdict: LGTM. The fix is complete, safe, and well-motivated.
Summary
Increases ResponseHeaderTimeout from 30 seconds to 5 minutes across all four transport paths (CONNECT interception, HTTP relay, MCP relay, default transport).

Problem
LLM inference, especially with extended thinking models like Claude 3.7 Sonnet via Bedrock, can take well over 30 seconds before the first response byte. The 30s ResponseHeaderTimeout caused intermittent read timeouts on slow-to-start completions.

Why 5 minutes
ResponseHeaderTimeout is time-to-first-byte only; once streaming starts, there's no further timeout (matching Portkey's behavior: "timeout not triggered if it gets at least a chunk").

Changes
proxy/proxy.go: two transports (default + interception)
proxy/relay.go: HTTP relay transport
proxy/mcp.go: MCP relay transport
CHANGELOG.md: v0.9.1 entry