AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
feat(platform): dry-run execution mode with LLM block simulation (#12483)
## Why

Agent generation and building need a way to test-run agents without requiring real credentials or producing side effects. Currently, every execution hits real APIs, consumes credits, and requires valid credentials, so an agent graph cannot be debugged or validated during the build phase without real consequences.

## Summary

Adds a `dry_run` execution mode to the copilot's `run_block` and `run_agent` tools. When `dry_run=True`, every block execution is simulated by an LLM instead of calling the real service: no real API calls, no credentials required, no credits consumed, no side effects. Inspired by [Significant-Gravitas/agent-simulator](https://github.com/Significant-Gravitas/agent-simulator).

### How it works

- **`backend/executor/simulator.py`** (new): `simulate_block()` builds a prompt from the block's name, description, input/output schemas, and actual input values, then calls `gpt-4o-mini` via the existing OpenRouter client with JSON mode. Retries up to 5 times on JSON parse failures. Missing output pins are filled with `None` (or `""` for the `error` pin). Long inputs (>20k chars) are truncated before being sent to the LLM.
- **`ExecutionContext`**: Added a `dry_run: bool = False` field, threaded through `add_graph_execution()` so that graph-level dry runs propagate to every block execution.
- **`execute_block()` helper**: When `dry_run=True`, the function short-circuits before any credential injection or credit checks, calls `simulate_block()`, and returns a `[DRY RUN]`-prefixed `BlockOutputResponse`.
- **`RunBlockTool`**: New `dry_run` boolean parameter.
- **`RunAgentTool`**: New `dry_run` boolean parameter; passes `ExecutionContext(dry_run=True)` to graph execution.
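The simulator behaviour listed above (truncate long inputs, retry on JSON parse failures, fill missing pins) can be sketched roughly as follows. This is an illustrative sketch, not the code in `backend/executor/simulator.py`: the name `simulate_block_sketch`, the `call_llm` callable standing in for the OpenRouter client, the truncation marker, the prompt wording, and the reading of "up to 5 times" as five total attempts are all assumptions.

```python
import json
from typing import Any, Callable, Iterator

MAX_INPUT_CHARS = 20_000  # truncation threshold stated in the summary
MAX_ATTEMPTS = 5          # assumption: "retries up to 5 times" read as 5 total attempts


def truncate(value: Any, limit: int = MAX_INPUT_CHARS) -> Any:
    """Shorten long string inputs before they are placed in the prompt."""
    if isinstance(value, str) and len(value) > limit:
        return value[:limit] + "... [truncated]"  # marker text is hypothetical
    return value


def simulate_block_sketch(
    block_name: str,
    input_data: dict[str, Any],
    output_pins: list[str],
    call_llm: Callable[[str], str],  # hypothetical stand-in for the LLM client
) -> Iterator[tuple[str, Any]]:
    """Yield (pin_name, value) tuples for a simulated block run."""
    safe_inputs = {key: truncate(val) for key, val in input_data.items()}
    prompt = (
        f"Simulate block {block_name!r}. "
        f"Inputs: {json.dumps(safe_inputs, default=str)}. "
        f"Reply with one JSON object with keys: {output_pins}."
    )

    parsed = None
    for _ in range(MAX_ATTEMPTS):
        try:
            candidate = json.loads(call_llm(prompt))
        except json.JSONDecodeError:
            continue  # malformed JSON: ask again
        if isinstance(candidate, dict):
            parsed = candidate
            break

    if parsed is None:
        # Every attempt failed: surface a simulator error on the error pin.
        yield "error", "[SIMULATOR ERROR] no valid JSON after retries"
        return

    for pin in output_pins:
        # Missing pins default to None, except the error pin, which gets "".
        default = "" if pin == "error" else None
        yield pin, parsed.get(pin, default)
```

Passing the LLM call in as a plain callable keeps the sketch testable without network access; the real implementation presumably calls the OpenRouter client directly.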
### Tests

11 tests in `backend/copilot/tools/test_dry_run.py`:

- Correct output tuples from LLM response
- JSON retry logic (3 total calls when first 2 fail)
- All-retries-exhausted yields `SIMULATOR ERROR`
- Missing output pins filled with `None`/`""`
- No-client case
- Input truncation at 20k chars
- `execute_block(dry_run=True)` skips real `block.execute()`
- Response format: `[DRY RUN]` message, `success=True`
- `dry_run=False` unchanged (real path)
- `RunBlockTool` parameter presence
- `dry_run` kwarg forwarding

## Test plan

- [x] Run `pytest backend/copilot/tools/test_dry_run.py -v`: all 11 pass
- [x] Call `run_block` with `dry_run=true` in copilot; verify no real API calls occur and output contains `[DRY RUN]`
- [x] Call `run_agent` with `dry_run=true`; verify execution is created with `dry_run=True` in context
- [x] E2E: Simulate button (flask icon) present in builder alongside play button
- [x] E2E: Simulated run labeled with "(Simulated)" suffix and badge in Library
- [x] E2E: No credits consumed during dry-run
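The `execute_block(dry_run=True)` checks in the test list can be pictured with a small stand-alone model of the short-circuit. Everything here is hypothetical scaffolding (`ExecutionContextSketch`, `FakeBlock`, `fake_simulate`, `execute_block_sketch`, the dict-shaped response, and the use of `async`); the real helper returns a `BlockOutputResponse` and its exact signature is not shown in this commit message.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class ExecutionContextSketch:
    # Mirrors the `dry_run: bool = False` field described for ExecutionContext.
    dry_run: bool = False


class FakeBlock:
    """Hypothetical block that records whether its real path was taken."""

    def __init__(self) -> None:
        self.executed = False

    async def execute(self, input_data: dict[str, Any]) -> dict[str, Any]:
        self.executed = True
        return {"result": "real"}


async def fake_simulate(block: FakeBlock, input_data: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for the LLM-backed simulator; returns canned output."""
    return {"result": "simulated"}


async def execute_block_sketch(
    block: FakeBlock, input_data: dict[str, Any], ctx: ExecutionContextSketch
) -> dict[str, Any]:
    if ctx.dry_run:
        # Short-circuit: no credential injection, no credit check, no real call.
        outputs = await fake_simulate(block, input_data)
        return {"success": True, "message": "[DRY RUN] simulated", "outputs": outputs}
    # Real path: credential injection and credit checks would happen here.
    outputs = await block.execute(input_data)
    return {"success": True, "message": "executed", "outputs": outputs}


def run_demo(dry_run: bool) -> tuple[dict[str, Any], bool]:
    """Run one execution and report the response plus whether the block ran."""
    block = FakeBlock()
    ctx = ExecutionContextSketch(dry_run=dry_run)
    response = asyncio.run(execute_block_sketch(block, {}, ctx))
    return response, block.executed
```

Calling `run_demo(True)` should leave the block's `executed` flag unset and return a `[DRY RUN]`-prefixed message, while `run_demo(False)` takes the real path unchanged.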
Zamil Majdy committed
a880d734816440eba311b7e471ba579656edd0c5
Parent: 80bfd64
Committed by GitHub <[email protected]>
on 3/24/2026, 10:36:47 PM