Anthropic 1h cache TTL: expand model list, add message-breakpoint sub-toggle (#316679)
* Expand extended cache TTL model list + add message-breakpoint sub-toggle
Two related changes to Anthropic Messages API prompt caching:
1. Expand modelSupportsExtendedCacheTtl beyond just the 1M context
variants. Per Anthropic docs, the 1h cache TTL is available on all
active models; this widens our opt-in list to Claude Opus 4.5/4.6/4.7
and Sonnet 4.5/4.6 (all variants, not just -1m).
2. Add a new experiment-based setting
chat.anthropic.promptCaching.extendedTtlMessages as a strict
sub-toggle of the existing extendedTtl setting. When both are on, the
rolling message-level breakpoints (last cacheable user / tool-result
blocks set by addMessagesApiCacheControl) also use the 1h TTL
instead of the default 5m. Nested rather than orthogonal because
Anthropic requires longer-TTL breakpoints to appear before shorter
ones in the tools->system->messages prefix order; an independent toggle
could place a 1h message breakpoint after a 5m tools/system breakpoint.
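The ordering constraint behind the nesting can be sketched as follows. This is a minimal illustration, not the extension's code: `planBreakpoints`, `planOrthogonal`, `isValidOrder`, and the `BreakpointPlan` shape are hypothetical names introduced here, and the two booleans stand in for the resolved extendedTtl and extendedTtlMessages settings.

```typescript
// Hypothetical names throughout; a sketch of the reasoning, not the real implementation.
type CacheTtl = '5m' | '1h';

// Breakpoints in Anthropic's prefix order: tools -> system -> messages.
interface BreakpointPlan {
	toolsAndSystem: CacheTtl;
	messages: CacheTtl;
}

// Nested toggles: message breakpoints can only get 1h when the parent is on.
function planBreakpoints(extendedTtl: boolean, extendedTtlMessages: boolean): BreakpointPlan {
	return {
		toolsAndSystem: extendedTtl ? '1h' : '5m',
		messages: extendedTtl && extendedTtlMessages ? '1h' : '5m',
	};
}

// What independent (orthogonal) toggles would produce instead.
function planOrthogonal(extendedTtl: boolean, extendedTtlMessages: boolean): BreakpointPlan {
	return {
		toolsAndSystem: extendedTtl ? '1h' : '5m',
		messages: extendedTtlMessages ? '1h' : '5m',
	};
}

// A plan is valid when no 1h breakpoint follows a 5m breakpoint.
function isValidOrder(plan: BreakpointPlan): boolean {
	return !(plan.toolsAndSystem === '5m' && plan.messages === '1h');
}

// Count the invalid combinations each design can reach across the 2x2 matrix.
function countInvalid(plan: (parent: boolean, sub: boolean) => BreakpointPlan): number {
	let invalid = 0;
	for (const parent of [false, true]) {
		for (const sub of [false, true]) {
			if (!isValidOrder(plan(parent, sub))) {
				invalid++;
			}
		}
	}
	return invalid;
}

const nestedInvalid = countInvalid(planBreakpoints);
const orthogonalInvalid = countInvalid(planOrthogonal);
```

Under these assumptions the nested design never reaches an invalid ordering, while the orthogonal one does for parent off / sub on, which is the case the sub-toggle relationship rules out.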
Tests: 69 passed. Added a suite for isExtendedCacheTtlMessagesEnabled
(parent on/off x sub on/off matrix + inherited model/location/subagent
gates) and two tests verifying addMessagesApiCacheControl propagates
the new cacheTtl argument.
* Slim down extended cache TTL tests
- Trim isExtendedCacheTtlEnabled model-list test to just verify the
delegation (full boundaries are covered by modelSupportsExtendedCacheTtl).
- Remove redundant 'inherits gates from parent' test in
isExtendedCacheTtlMessagesEnabled suite — the parent×sub matrix plus
the parent's own gate tests already cover this.
- Merge the two addMessagesApiCacheControl ttl tests into one
parameterized assertion.
* Update stale comment about message breakpoint TTL
The comment claimed message breakpoints 'always use the default 5m TTL',
but that's no longer true when the new extendedTtlMessages sub-toggle is on.
* Refactor extended cache TTL: pass parentEnabled, drop misleading coercions
- isExtendedCacheTtlMessagesEnabled now takes parentEnabled: boolean
instead of re-running the parent gate. Call site passes the resolved
useExtendedCacheTtl directly, eliminating a duplicate experiment-service
lookup per request. Makes the 'sub-toggle of' relationship literal in
the signature.
- Drop the !! coercion on getExperimentBasedConfig<boolean> returns —
the generic guarantees T, so the coercion was misleading defensive
noise.
- Narrow cacheTtl param from '5m' | '1h' to just '1h' on both
addToolsAndSystemCacheControl and addMessagesApiCacheControl. Per
Anthropic docs, { type: 'ephemeral' } already defaults to 5m, so '5m'
is never actually emitted on the wire and call sites never passed it.
- Stronger composition test for isExtendedCacheTtlEnabled — replaces
four single-axis tests with one table-driven matrix that exercises
all four gates simultaneously, catching short-circuit refactors.
- Table-driven 2x2 matrix for isExtendedCacheTtlMessagesEnabled.
- Trim user-facing extendedTtlMessages setting description; team-only
rationale (rolling breakpoints, 2x write premium) lives in the JSDoc.
- Update stale comment claiming message breakpoints always use 5m.
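A rough sketch of the refactored shapes described above, under stated assumptions: the real functions likely take more parameters (for example, a config or experiment service), `subToggle` stands in for the resolved extendedTtlMessages value, and `buildCacheControl` is a hypothetical helper rather than a function from this change. Only the `parentEnabled: boolean` parameter, the `'1h'`-only `cacheTtl` type, and the `{ type: 'ephemeral' }` default come from the notes above.

```typescript
// Hypothetical shapes for illustration; not the actual signatures in the codebase.
interface CacheControl {
	type: 'ephemeral';
	// Omitted means the API default of 5m, so '5m' never needs to be sent.
	ttl?: '1h';
}

// The sub-toggle receives the already-resolved parent gate instead of re-evaluating it.
function isExtendedCacheTtlMessagesEnabled(parentEnabled: boolean, subToggle: boolean): boolean {
	return parentEnabled && subToggle;
}

// cacheTtl is narrowed to '1h' | undefined; undefined leaves ttl off the wire.
function buildCacheControl(cacheTtl?: '1h'): CacheControl {
	return cacheTtl ? { type: 'ephemeral', ttl: cacheTtl } : { type: 'ephemeral' };
}

// Table-driven 2x2 matrix: [parentEnabled, subToggle, expected].
const matrix: Array<[boolean, boolean, boolean]> = [
	[false, false, false],
	[false, true, false],
	[true, false, false],
	[true, true, true],
];

const mismatches = matrix.filter(
	([parent, sub, expected]) => isExtendedCacheTtlMessagesEnabled(parent, sub) !== expected
);

// Call-site shape: resolve the parent once, then derive the message TTL from it.
const useExtendedCacheTtl = true; // assumed to be resolved earlier in the request
const messageTtl = isExtendedCacheTtlMessagesEnabled(useExtendedCacheTtl, true) ? '1h' : undefined;
const messageCacheControl = buildCacheControl(messageTtl);
```

Passing the resolved parent value makes it impossible for the sub-toggle to be on while the parent is off, and the narrowed parameter type keeps a redundant `'5m'` from ever being passed.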
* Remove outdated comments about extended cache TTL models
Bhavya U committed
0bd2387d0e948846941d82641fc16bb5f08a6690
Parent: 7e3f0c1
Committed by GitHub <noreply@github.com>
on 5/15/2026, 9:57:03 PM