Anthropic Messages API: add extended (1h) prompt cache TTL behind experiment (#316178)
Re-introduces `cache_control.ttl: "1h"` on the tools and system
breakpoints of Anthropic Messages API requests, gated to the main
agent conversation, where the 2x cache-write cost trades favourably
against the longer hit window. This change was previously reverted
from the copilot-chat repo.
All four gates must hold:
- Model is a 1M-context Claude variant (`claude-opus-4-{6,7}-1m...`)
- Setting `github.copilot.chat.anthropic.promptCaching.extendedTtl` is
on (ConfigType.ExperimentBased, default false, advanced/experimental/onExp)
- Location is `ChatLocation.Agent` (Panel/Editor/Terminal/Notebook/
EditingSession/Other and both proxy locations are excluded)
- Request is not a subagent (subagents are identified by
`interactionTypeOverride === 'conversation-subagent'`, the same
source of truth as the `X-Interaction-Type` wire header)
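The four gates above can be sketched as a single predicate. This is a minimal illustration, not the code in this commit: `ChatLocationSketch`, `GateInput`, `modelLooksLike1mOpus` and `allGatesPass` are hypothetical names, the enum omits the two proxy locations, and the prefix regex is only an assumption about how the `claude-opus-4-{6,7}-1m...` pattern might be matched.

```typescript
// Hypothetical stand-in for the real ChatLocation enum (proxy locations omitted).
enum ChatLocationSketch { Panel, Editor, Terminal, Notebook, EditingSession, Other, Agent }

// Hypothetical bundle of the four inputs the gates depend on.
interface GateInput {
  modelId: string;
  extendedTtlSetting: boolean;      // value of ...promptCaching.extendedTtl
  location: ChatLocationSketch;
  interactionTypeOverride?: string; // 'conversation-subagent' marks a subagent
}

// Assumption: a prefix match on the model id is enough to spot the 1M variants.
function modelLooksLike1mOpus(modelId: string): boolean {
  return /^claude-opus-4-[67]-1m/.test(modelId);
}

// All four gates must hold; any single failure keeps the default 5m TTL.
function allGatesPass(input: GateInput): boolean {
  return modelLooksLike1mOpus(input.modelId)
    && input.extendedTtlSetting
    && input.location === ChatLocationSketch.Agent
    && input.interactionTypeOverride !== 'conversation-subagent';
}
```

Keeping the check as one pure function of its inputs makes it straightforward to cover each gate with its own test case.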
When all gates pass:
- The `extended-cache-ttl-2025-04-11` beta header is added
- The last non-deferred tool and the last system block carry
`cache_control: { type: 'ephemeral', ttl: '1h' }`. The two rolling
message breakpoints keep the default 5m TTL, satisfying Anthropic's
longer-TTLs-before-shorter ordering rule.
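A rough sketch of the breakpoint placement described above, under stated assumptions: `ToolSketch`, `SystemBlockSketch`, the `deferred` flag and `applyExtendedTtl` are hypothetical names for illustration, and only the beta header string and the `cache_control` shape are taken from this message.

```typescript
type CacheControl = { type: 'ephemeral'; ttl?: '5m' | '1h' };

// Hypothetical, simplified request shapes; `deferred` stands in for however
// the real code marks a tool as deferred.
interface ToolSketch { name: string; deferred?: boolean; cache_control?: CacheControl }
interface SystemBlockSketch { type: 'text'; text: string; cache_control?: CacheControl }

const EXTENDED_TTL_BETA = 'extended-cache-ttl-2025-04-11';

function applyExtendedTtl(
  tools: ToolSketch[],
  system: SystemBlockSketch[],
  betas: string[],
): { tools: ToolSketch[]; system: SystemBlockSketch[]; betas: string[] } {
  const extended: CacheControl = { type: 'ephemeral', ttl: '1h' };

  // Find the last tool that is not deferred (-1 if there is none).
  let lastTool = -1;
  for (let i = tools.length - 1; i >= 0; i--) {
    if (!tools[i].deferred) { lastTool = i; break; }
  }

  return {
    // Only the last non-deferred tool and the last system block get the 1h TTL.
    tools: tools.map((t, i) => (i === lastTool ? { ...t, cache_control: extended } : t)),
    system: system.map((b, i) => (i === system.length - 1 ? { ...b, cache_control: extended } : b)),
    // Add the beta header once, without duplicating it.
    betas: betas.includes(EXTENDED_TTL_BETA) ? betas : [...betas, EXTENDED_TTL_BETA],
  };
}
```

Because tools and system blocks come before the messages in the request, placing the 1h breakpoints there and leaving the message breakpoints at 5m keeps the longer TTLs ahead of the shorter ones.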
Tests: messagesApi.spec.ts now at 65 tests (was 59); adds dedicated
`modelSupportsExtendedCacheTtl` and `isExtendedCacheTtlEnabled`
suites covering every gate explicitly.
Bhavya U committed
c305abcf5246623ddb4e3f1be03f36e2f9dd7caf
Parent: efa9345
Committed by GitHub <noreply@github.com>
on 5/13/2026, 4:27:58 AM