refactor(llm): inclusive total + non-overlapping breakdown for Usage

Final shape after considering ecosystem conventions:

  inputTokens             — inclusive total (matches AI SDK / OpenAI / LangChain)
  outputTokens            — inclusive total (includes reasoning)
  nonCachedInputTokens    — breakdown: fresh prompt
  cacheReadInputTokens    — breakdown: cache hit
  cacheWriteInputTokens   — breakdown: cache write
  reasoningTokens         — subset of outputTokens

Invariants:
  nonCachedInputTokens + cacheReadInputTokens + cacheWriteInputTokens = inputTokens
  reasoningTokens <= outputTokens
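
A minimal sketch of the shape and its invariants, for illustration only. The `UsageFields` interface and `checkUsageInvariants` helper are hypothetical names, not code from this change; the sketch assumes every field is optional and skips a check when any operand is missing.

```typescript
// Hypothetical mirror of the fields listed above; the real type may differ.
interface UsageFields {
  inputTokens?: number | undefined
  outputTokens?: number | undefined
  nonCachedInputTokens?: number | undefined
  cacheReadInputTokens?: number | undefined
  cacheWriteInputTokens?: number | undefined
  reasoningTokens?: number | undefined
}

// Returns a description of each violated invariant.
// A check is skipped when any of its operands is undefined.
function checkUsageInvariants(u: UsageFields): string[] {
  const violations: string[] = []
  const {
    inputTokens,
    outputTokens,
    nonCachedInputTokens,
    cacheReadInputTokens,
    cacheWriteInputTokens,
    reasoningTokens,
  } = u

  if (
    inputTokens !== undefined &&
    nonCachedInputTokens !== undefined &&
    cacheReadInputTokens !== undefined &&
    cacheWriteInputTokens !== undefined &&
    nonCachedInputTokens + cacheReadInputTokens + cacheWriteInputTokens !== inputTokens
  ) {
    violations.push("input breakdown does not sum to inputTokens")
  }

  if (
    outputTokens !== undefined &&
    reasoningTokens !== undefined &&
    reasoningTokens > outputTokens
  ) {
    violations.push("reasoningTokens exceeds outputTokens")
  }

  return violations
}
```

For example, a usage of 70 fresh + 20 cache-read + 10 cache-write input tokens with `inputTokens: 100` passes both checks, while `reasoningTokens: 60` against `outputTokens: 50` is flagged.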

Why this shape:

- `inputTokens` keeps its AI-SDK / OpenAI semantics, so a reader from any
  major ecosystem sees the number they expect.
- The non-overlapping breakdown fields are populated alongside the
  inclusive totals — consumers read whichever they need without
  subtracting. This eliminates the underflow bug class (opencode#26620)
  structurally without diverging on naming.
- Aligns with the AI SDK v3 spec proposal (vercel/ai#9921), which adds
  exactly this kind of non-overlapping breakdown to address the active
  ecosystem bugs around cache token double-counting and underflow
  (pydantic-ai#4364, langfuse#12306/#11979, vercel/ai#8349,
  langchain#32818, langchainjs#10249).

Mappers:

- OpenAI Chat / Responses / Bedrock: provider reports inclusive totals
  natively; mapper derives `nonCachedInputTokens` via
  `ProviderShared.subtractTokens`.
- Gemini: `promptTokenCount` is inclusive; `candidatesTokenCount` is
  *exclusive* of `thoughtsTokenCount`, so the mapper sums the two to
  produce the inclusive `outputTokens`. It computes that total only when
  the visible component (`candidatesTokenCount`) is reported, to avoid
  fabricating an inclusive number from a partial breakdown.
- Anthropic: `input_tokens` is *non-cached* natively; mapper sums it with
  cache reads/writes to produce the inclusive `inputTokens`.
  `output_tokens` is inclusive (Anthropic doesn't break thinking out, so
  `reasoningTokens` stays undefined).
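
The three mapping strategies above can be sketched roughly as follows. Everything here is a hypothetical illustration, not the code in this change: `MappedUsage`, `sumParts`, `subtractClamped`, and the `map*` functions are invented names, the helpers only approximate `ProviderShared.sumTokens` / `ProviderShared.subtractTokens`, and clamping the subtraction at zero is an assumption. The input field names follow the providers' public usage payloads.

```typescript
// All names below are hypothetical stand-ins for illustration.
interface MappedUsage {
  inputTokens?: number | undefined
  outputTokens?: number | undefined
  nonCachedInputTokens?: number | undefined
  cacheReadInputTokens?: number | undefined
  cacheWriteInputTokens?: number | undefined
  reasoningTokens?: number | undefined
}

// Approximates sumTokens: undefined only when every part is undefined.
const sumParts = (...parts: Array<number | undefined>): number | undefined => {
  const defined = parts.filter((p): p is number => p !== undefined)
  return defined.length === 0 ? undefined : defined.reduce((a, b) => a + b, 0)
}

// Approximates subtractTokens; clamping at zero is an assumption.
const subtractClamped = (total?: number, part?: number): number | undefined =>
  total === undefined ? undefined : Math.max(0, total - (part ?? 0))

// OpenAI Chat: prompt_tokens already includes cached tokens.
function mapOpenAIChat(u: {
  prompt_tokens?: number
  completion_tokens?: number
  prompt_tokens_details?: { cached_tokens?: number }
  completion_tokens_details?: { reasoning_tokens?: number }
}): MappedUsage {
  const cached = u.prompt_tokens_details?.cached_tokens
  return {
    inputTokens: u.prompt_tokens,
    outputTokens: u.completion_tokens,
    nonCachedInputTokens: subtractClamped(u.prompt_tokens, cached),
    cacheReadInputTokens: cached,
    reasoningTokens: u.completion_tokens_details?.reasoning_tokens,
  }
}

// Gemini: candidatesTokenCount excludes thoughts, so add them back,
// but only when the visible component is actually reported.
function mapGemini(u: {
  promptTokenCount?: number
  candidatesTokenCount?: number
  thoughtsTokenCount?: number
}): MappedUsage {
  return {
    inputTokens: u.promptTokenCount,
    outputTokens:
      u.candidatesTokenCount === undefined
        ? undefined
        : u.candidatesTokenCount + (u.thoughtsTokenCount ?? 0),
    reasoningTokens: u.thoughtsTokenCount,
  }
}

// Anthropic: input_tokens is the non-cached part; sum to get the inclusive total.
function mapAnthropic(u: {
  input_tokens?: number
  output_tokens?: number
  cache_read_input_tokens?: number
  cache_creation_input_tokens?: number
}): MappedUsage {
  return {
    inputTokens: sumParts(u.input_tokens, u.cache_read_input_tokens, u.cache_creation_input_tokens),
    outputTokens: u.output_tokens,
    nonCachedInputTokens: u.input_tokens,
    cacheReadInputTokens: u.cache_read_input_tokens,
    cacheWriteInputTokens: u.cache_creation_input_tokens,
    reasoningTokens: undefined, // thinking is not broken out separately
  }
}
```

Note the direction of derivation differs per provider: OpenAI-style payloads start from the total and subtract, while Anthropic-style payloads start from the parts and sum.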

Added a `visibleOutputTokens` getter (clamped `outputTokens - reasoningTokens`)
as the one safe escape hatch for consumers wanting the non-reasoning view.
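
As an illustration of what a clamped getter might look like (a sketch only; the `UsageView` class is a hypothetical name, and returning `undefined` when `outputTokens` is missing is an assumption rather than confirmed behavior):

```typescript
// Hypothetical container; only the two fields the getter needs are modeled.
class UsageView {
  readonly outputTokens: number | undefined
  readonly reasoningTokens: number | undefined

  constructor(outputTokens?: number, reasoningTokens?: number) {
    this.outputTokens = outputTokens
    this.reasoningTokens = reasoningTokens
  }

  // Non-reasoning output tokens, never negative.
  // Assumption: with no outputTokens there is nothing to derive, so return undefined.
  get visibleOutputTokens(): number | undefined {
    if (this.outputTokens === undefined) return undefined
    return Math.max(0, this.outputTokens - (this.reasoningTokens ?? 0))
  }
}
```

The clamp means a provider that misreports `reasoningTokens` above `outputTokens` yields 0 rather than a negative count.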

Added `ProviderShared.sumTokens` to derive an inclusive total from a
non-overlapping breakdown, returning `undefined` when every input is
undefined (so we don't fabricate a 0).
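
A minimal sketch of the behavior described, assuming a variadic signature (the actual signature of `ProviderShared.sumTokens` is not shown here, and `sumTokensSketch` is a hypothetical name):

```typescript
// Sums the defined parts; returns undefined only when no part is defined,
// so "nothing reported" stays distinguishable from "zero tokens reported".
function sumTokensSketch(...parts: Array<number | undefined>): number | undefined {
  let total: number | undefined
  for (const part of parts) {
    if (part === undefined) continue
    total = (total ?? 0) + part
  }
  return total
}
```

Under this sketch, `sumTokensSketch(10, undefined, 5)` yields 15, `sumTokensSketch(0)` yields 0, and `sumTokensSketch(undefined, undefined)` yields `undefined`.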
Kit Langton committed
d4ff331052544e23c5b485a501e6fed16ff2539a
Parent: f5d199d