
release: 2.0.62 — gpt_native dialect for Codex CLI / Responses route (#115 root cause)

Cascade upstream proto has no native function-calling field, so the proxy
text-emulates tools as <tool_call>{...}</tool_call> XML in the system
prompt. Claude family obeys this protocol; GPT family doesn't — its
training expects native function-calling JSON output, and forcing XML
makes it refuse with "please paste the file" instead of calling the
function (issue #115).

Fix: construct a disguised protocol that matches GPT's natural emission
shape, then parse the model's output server-side back into structured
tool_calls.
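
A minimal sketch of that round trip, for illustration only. The helper
names (formatGptNativeCall, toToolCall, parseGptNativeReply) are
hypothetical and are not the proxy's real exports; the target shape is the
OpenAI-style tool_calls entry ({id, type, function: {name, arguments}}),
and the call_N id scheme is an assumption.

```javascript
// Sketch only: these names are hypothetical, not the proxy's real exports.

// The bare-JSON shape the gpt_native preamble asks the model to emit.
function formatGptNativeCall(name, args) {
  return JSON.stringify({ function_call: { name, arguments: args } });
}

// Convert one parsed JSON value into an OpenAI-style tool_calls entry,
// or return null when it is not a recognizable call.
function toToolCall(obj, index) {
  const call =
    obj && typeof obj === "object" && obj.function_call ? obj.function_call : obj;
  if (!call || typeof call !== "object") return null;
  if (typeof call.name !== "string" || !("arguments" in call)) return null;
  // OpenAI-style tool calls carry arguments as a JSON string.
  const args =
    typeof call.arguments === "string"
      ? call.arguments
      : JSON.stringify(call.arguments);
  return {
    id: `call_${index}`, // assumed id scheme
    type: "function",
    function: { name: call.name, arguments: args },
  };
}

// Parse a whole model reply: a single bare JSON object becomes a tool call;
// anything else is returned unchanged as ordinary text.
function parseGptNativeReply(text) {
  let parsed;
  try {
    parsed = JSON.parse(text.trim());
  } catch {
    return { text, toolCalls: [] };
  }
  const call = toToolCall(parsed, 0);
  return call ? { text: "", toolCalls: [call] } : { text, toolCalls: [] };
}
```

Feeding the output of formatGptNativeCall("read_file", {path: "a.txt"})
into parseGptNativeReply yields one tool call named read_file, while a
prose reply such as "please paste the file" comes back as plain text.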

# tool-emulation.js
- New dialect 'gpt_native' (4th in the family: glm47 / kimi_k2 /
  openai_json_xml / gpt_native).
- getToolProtocolHeader('gpt_native') is a strong, anti-refusal preamble
  in bare-JSON form: "Output ONE valid JSON object {function_call:
  {name,arguments}}. No markdown. No prose. Functions ARE available.
  DO NOT respond 'paste me the file'." — 7 explicit rules.
- pickToolDialect(modelKey, provider, route) takes a third route arg.
  Only route='responses' + GPT-family selects gpt_native; chat
  completions path keeps openai_json_xml so existing clients aren't
  surprised. Override with WINDSURFAPI_FORCE_GPT_NATIVE_DIALECT=1.
- formatAssistantToolCallForDialect emits {"function_call":{...}} for
  gpt_native history (so the model sees its own prior turns in the same
  shape it's asked to emit now).
- parseNonOpenAIDialectBuffer routes gpt_native through the existing
  salvage parser, which already handles function_call / tool_calls /
  function / bare {name,arguments} shapes.
- ToolCallStreamParser feed/flush adds 8 JSON sentinels for gpt_native
  ({"function_call", {"tool_calls", {"name", with whitespace variants)
  so streaming holds back text until a JSON tool object closes.
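
The selection rules for pickToolDialect can be sketched roughly as below.
This is a hypothetical re-creation, not the code in tool-emulation.js: the
substring matching used to detect model families and the extra env
parameter (added so the override can be exercised without touching
process.env) are assumptions, as is the reading that the override only
lifts the route restriction for GPT-family models.

```javascript
// Hypothetical re-creation of the dialect selection rules.
// Assumptions: families are detected by substring match on provider/model,
// and the force flag only lifts the route restriction for GPT-family models.
function pickToolDialect(modelKey, provider, route, env = process.env) {
  const id = `${provider || ""}/${modelKey || ""}`.toLowerCase();

  // GLM and Kimi take precedence over GPT detection.
  if (id.includes("glm")) return "glm47";
  if (id.includes("kimi")) return "kimi_k2";

  const isGpt = id.includes("gpt");
  const forced = env.WINDSURFAPI_FORCE_GPT_NATIVE_DIALECT === "1";

  // Only GPT-family models on the Responses route (or with the override)
  // get the bare-JSON dialect.
  if (isGpt && (route === "responses" || forced)) return "gpt_native";

  // Everything else keeps the existing XML-wrapped JSON dialect.
  return "openai_json_xml";
}
```

Keeping the chat completions path on openai_json_xml by default means a
client only sees the new dialect when it arrives through the Responses
route or sets the override explicitly.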

# Route propagation
- applyToolPreambleBudget + 4 tier builders take a route param.
- normalizeMessagesForCascade accepts options.route.
- ToolCallStreamParser({modelKey, provider, route}).
- parseToolCallsFromText({...,route}).
- chat.js handleChatCompletions reads body.__route (responses.js already
  sets __route='responses' on its forwards) and threads it through
  every preamble builder and parser construction. The streamResponse
  deps also carry route, so parsers rebuilt in the retry loop see the
  same route.
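
One way to picture the threading, as a sketch under stated assumptions:
resolveRoute, buildStreamDeps, and StubParser are hypothetical names, the
'chat' default label is assumed, and the stub only records its options
rather than parsing anything.

```javascript
// Stand-in for the real ToolCallStreamParser; it only records its options.
class StubParser {
  constructor({ modelKey, provider, route }) {
    this.options = { modelKey, provider, route };
  }
}

// Read the private route marker once. Anything other than 'responses' is
// treated as the chat completions path (the 'chat' label is an assumption).
function resolveRoute(body) {
  return body && body.__route === "responses" ? "responses" : "chat";
}

// Capture the route in the deps object so that every parser built later,
// including rebuilds after a retry, is constructed with the same route.
function buildStreamDeps(body, modelKey, provider) {
  const route = resolveRoute(body);
  return {
    route,
    makeParser: () => new StubParser({ modelKey, provider, route }),
  };
}
```

Resolving the route once and closing over it avoids re-reading the request
body inside the retry loop, where a rebuilt parser could otherwise fall
back to a different dialect than the preamble used.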

# Tests: 639 → 654 (+15)
- pickToolDialect: GPT+responses → gpt_native; GPT+chat → openai_json_xml;
  non-GPT+responses → openai_json_xml; GLM/Kimi precedence over GPT;
  WINDSURFAPI_FORCE_GPT_NATIVE_DIALECT=1 force-on.
- gpt_native preamble: function_call shape present, <tool_call> XML
  absent, anti-refusal language present, markdown-fence forbidden,
  chat route keeps XML.
- History serializer: assistant tool_calls round-trip into
  {"function_call":{...}} form (not <tool_call>).
- Parser: extracts {"function_call":{...}}, bare {name,arguments}, and
  multiple parallel function_call objects; plain prose passes through;
  the stream parser holds back partial JSON until the object closes.
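
The hold-back behaviour exercised by the parser tests can be sketched as
follows. This is an illustrative stand-in, not the real
ToolCallStreamParser: the class and helper names are hypothetical, only
two of the sentinel families ({"function_call" and {"name") are covered,
and the {"tool_calls" shape is left out for brevity.

```javascript
// Hypothetical sketch; the real parser uses 8 sentinels and more shapes.
const SENTINELS = ['{"function_call"', '{"name"'];

// True when `tail` (text from a '{' onward) starts with a sentinel or could
// still grow into one. Whitespace after '{' is ignored.
function mightBeToolJson(tail) {
  const compact = tail.replace(/^\{\s*/, "{");
  return SENTINELS.some((s) => compact.startsWith(s) || s.startsWith(compact));
}

// Index just past the '}' that closes the object opening at `start`,
// or -1 if it is still open. Braces inside JSON strings are skipped.
function findObjectEnd(text, start) {
  let depth = 0, inString = false, escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}" && --depth === 0) return i + 1;
  }
  return -1;
}

class GptNativeStreamSketch {
  constructor() {
    this.buffer = "";
    this.toolCalls = [];
  }

  // Index of the first '{' that might begin a tool object, or -1.
  findCandidate() {
    let i = this.buffer.indexOf("{");
    while (i !== -1) {
      if (mightBeToolJson(this.buffer.slice(i))) return i;
      i = this.buffer.indexOf("{", i + 1);
    }
    return -1;
  }

  // Returns the text that is safe to forward to the client right now.
  feed(chunk) {
    this.buffer += chunk;
    let out = "";
    for (;;) {
      const start = this.findCandidate();
      if (start === -1) {
        out += this.buffer;
        this.buffer = "";
        return out;
      }
      out += this.buffer.slice(0, start);
      this.buffer = this.buffer.slice(start);
      const end = findObjectEnd(this.buffer, 0);
      if (end === -1) return out; // object still open: keep holding back
      const raw = this.buffer.slice(0, end);
      this.buffer = this.buffer.slice(end);
      let call = null;
      try {
        const obj = JSON.parse(raw);
        const inner = obj.function_call || obj;
        if (inner && typeof inner === "object" &&
            typeof inner.name === "string" && "arguments" in inner) call = inner;
      } catch {
        call = null; // not valid JSON after all
      }
      if (call) this.toolCalls.push({ name: call.name, arguments: call.arguments });
      else out += raw; // release text that only looked like a tool call
    }
  }

  // End of stream: release whatever is still held back.
  flush() {
    const rest = this.buffer;
    this.buffer = "";
    return rest;
  }
}
```

Text before a candidate object is released immediately, so ordinary prose
streams without delay; only the span from a possible sentinel up to its
closing brace is withheld.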

Total: 654/654 green. No regressions in existing dialect tests.
dwgx committed
d49c53ba3a200c431fac11dc2b7a52291cffca74
Parent: 085b08e