AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
fix(copilot): add tool call circuit breakers and intermediate persistence (#12604)
## Why

CoPilot session `d2f7cba3` took **82 minutes** and cost **$20.66** for a single user message. Root causes:

1. Redis session meta key expired after 1h, making the session invisible to the resume endpoint and causing an empty page on reload
2. Redis stream key also expired during sub-agent gaps (task_progress events produced no chunks)
3. No intermediate persistence: session messages were only saved to DB after the entire turn completed
4. Sub-agents retried similar WebSearch queries (addressed via prompt guidance)

## What

### Redis TTL fixes (root cause of empty session on reload)

- `publish_chunk()` now periodically refreshes **both** the session meta key AND stream key TTL (every 60s).
- `task_progress` SDK events now emit `StreamHeartbeat` chunks, ensuring `publish_chunk` is called even during long sub-agent gaps where no real chunks are produced.
- Without this fix, turns exceeding the 1h `stream_ttl` lose their "running" status and stream data, making `get_active_session()` return False.

### Intermediate DB persistence

- Session messages flushed to DB every **30 seconds** or **10 new messages** during the stream loop.
- Uses `asyncio.shield(upsert_chat_session())` matching the existing `finally` block pattern.

### Orphaned message cleanup on rollback

- On stream attempt rollback, orphaned messages persisted by intermediate flushes are now cleaned up from the DB via `delete_messages_from_sequence`.
- Prevents stale messages from resurfacing on page reload after a failed retry.

### Prompt guidance

- Added web search best practices to code supplement (search efficiency, sub-agent scope separation).

### Approach: root cause fixes, not capability limits

- **No tool call caps**: artificial limits on WebSearch or total tool calls would reduce autopilot capability without addressing why searches were redundant.
- **Task tool remains enabled**: sub-agent delegation via Task is a core capability. The existing `max_subtasks` concurrency guard is sufficient.
- The real fixes (TTL refresh, persistence, prompt guidance) address the underlying bugs and behavioral issues.

## How

### Files changed

- `stream_registry.py`: Redis meta + stream key TTL refresh in `publish_chunk()`, module-level keepalive tracker
- `response_adapter.py`: `task_progress` SystemMessage → StreamHeartbeat emission
- `service.py`: Intermediate DB persistence in `_run_stream_attempt` stream loop, orphan cleanup on rollback
- `db.py`: `delete_messages_from_sequence` for rollback cleanup
- `prompting.py`: Web search best practices

### GCP log evidence

```
# Meta key expired during 82-min turn:
09:49 — GET_SESSION: active_session=False, msg_count=1  ← meta gone
10:18 — Session persisted in finally with 189 messages  ← turn completed

# T13 (1h45min) same bug reproduced live:
16:20 — task_progress events still arriving, but active_session=False

# Actual cost:
Turn usage: cache_read=347916, cache_create=212472, output=12375, cost_usd=20.66
```

### Test plan

- [x] task_progress emits StreamHeartbeat
- [x] Task background blocked, foreground allowed, slot release on completion/failure
- [x] CI green (lint, type-check, tests, e2e, CodeQL)

---------

Co-authored-by: Zamil Majdy <[email protected]>
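For readers who want a concrete picture of the "every 30 seconds or 10 new messages" trigger described under "Intermediate DB persistence", here is a minimal sketch. It is not the code from `service.py`: `FlushPolicy`, `flush_if_due`, and the `save` callback are hypothetical names introduced for illustration, and only the two thresholds and the use of `asyncio.shield` come from the description above.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


class FlushPolicy:
    """Hypothetical helper: decides when pending messages should be flushed.

    Thresholds default to the 30 s / 10 message values from the description.
    """

    def __init__(self, interval_s: float = 30.0, max_pending: int = 10,
                 start: Optional[float] = None) -> None:
        self.interval_s = interval_s
        self.max_pending = max_pending
        self.last_flush = time.monotonic() if start is None else start
        self.pending = 0

    def record_message(self) -> None:
        # Call once per message appended to the in-memory session.
        self.pending += 1

    def should_flush(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        if self.pending == 0:
            return False  # nothing new, so no DB write is needed
        too_many = self.pending >= self.max_pending
        too_old = now - self.last_flush >= self.interval_s
        return too_many or too_old

    def mark_flushed(self, now: Optional[float] = None) -> None:
        self.last_flush = time.monotonic() if now is None else now
        self.pending = 0


async def flush_if_due(policy: FlushPolicy,
                       save: Callable[[], Awaitable[None]],
                       now: Optional[float] = None) -> bool:
    """Run `save` if the policy says a flush is due; return whether it ran.

    `save` stands in for the real persistence call (an assumption here).
    asyncio.shield lets the write finish even if the caller is cancelled.
    """
    if not policy.should_flush(now):
        return False
    await asyncio.shield(save())
    policy.mark_flushed(now)
    return True
```

In a stream loop this would be called after each message is appended, so a long turn never holds more than about 30 seconds or 10 messages of unsaved state. Any messages written this way would still need the rollback cleanup described above if the attempt is retried.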
Zamil Majdy committed
80581a83640ddf96ac14e03600acfcf200506e04
Parent: 3c046eb
Committed by GitHub <[email protected]>
on 3/31/2026, 9:01:56 PM