
feat: daily budget throttling with rate-aware proxy sleep (#469)

## Summary

Add a configurable daily USD spending cap to the Lore gateway. When the
budget is approached, the gateway applies an **invisible proxy-level
sleep** before forwarding requests upstream — progressively slowing
agents rather than hard-blocking them.

## How It Works

1. **Set budget** via the UI dashboard (`/ui/costs`) or env var
`LORE_DAILY_BUDGET`. The budget is persisted in the SQLite `kv_meta`
table, so it survives restarts and is easy to tune.

2. **Rate detection**: Cost-rate EMA (α=0.15) tracks spending velocity
across all sessions. Time-gap-adjusted alpha means long idle gaps
naturally decay the EMA.

3. **Throttle curve**: `60s × pressure² × tanh(overshoot/3)`, which is
smooth (C∞) with no cliff edges. Delays start sub-second at 60% of
budget and max out at 60s under extreme overshoot.

4. **Invisible proxy sleep**: The agent doesn't know it's being
throttled — the upstream just appears to take a little longer. A 3-5s
sleep is imperceptible during a normal coding turn.

5. **Cache TTL safety cap**: Sleep is capped at 50% of remaining cache
TTL window to prevent cache busts that would increase costs.
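
Steps 2 and 5 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `cost-tracker.ts` or `pipeline.ts`: the function names, the `RateState` shape, the `REFERENCE_GAP_MS` constant, and the `1 - (1 - α)^(gap/reference)` form of the gap adjustment are all hypothetical. Only α = 0.15 and the 50% cap come from the description above.

```typescript
// Hypothetical sketch; names and the reference interval are assumptions.
const ALPHA = 0.15; // from the commit message
const REFERENCE_GAP_MS = 60_000; // assumed: the gap at which ALPHA applies unchanged

interface RateState {
  emaUsdPerMs: number; // smoothed spending velocity
  lastUpdateMs: number; // timestamp of the previous sample
}

/** Fold one request's cost into the cost-rate EMA, scaling alpha by the elapsed gap. */
function updateCostRate(state: RateState, costUsd: number, nowMs: number): RateState {
  const gapMs = Math.max(1, nowMs - state.lastUpdateMs);
  // Effective alpha approaches 1 as the gap grows, so stale history is forgotten.
  const alphaEff = 1 - Math.pow(1 - ALPHA, gapMs / REFERENCE_GAP_MS);
  // The sample is the average spend rate over the gap; a long idle gap makes it small.
  const sampleRate = costUsd / gapMs;
  return {
    emaUsdPerMs: state.emaUsdPerMs + alphaEff * (sampleRate - state.emaUsdPerMs),
    lastUpdateMs: nowMs,
  };
}

/** Cap a proposed sleep at 50% of the remaining cache TTL window. */
function capSleepToCacheTtl(sleepMs: number, remainingTtlMs: number): number {
  const capMs = 0.5 * Math.max(0, remainingTtlMs);
  return Math.min(Math.max(0, sleepMs), capMs);
}
```

With this form, a request arriving after an hour of silence sees an effective alpha near 1 and a tiny sample rate, so the EMA drops close to zero without a separate decay timer. The cap means a 5s proposed sleep with 4s of cache TTL remaining becomes a 2s sleep.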

## Throttle Behavior

| Spend % | Rate vs Target | Delay |
|---------|----------------|-------|
| <50%    | any            | **0s** (no throttle) |
| 60%     | 2×             | ~0.8s |
| 70%     | 2×             | ~3.1s |
| 80%     | 2×             | ~6.9s |
| 80%     | 5×             | ~19.2s |
| 95%     | 3×             | ~36.9s |
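
The curve can be sketched as below. The commit message does not define `pressure` or `overshoot`, so both definitions here are assumptions: pressure ramps linearly from 0 at 50% spend to 1 at 100%, and overshoot is the rate ratio minus 1. The function name and constants are hypothetical as well.

```typescript
// Hypothetical sketch of the throttle curve; input definitions are assumptions.
const MAX_DELAY_S = 60; // from the formula in the commit message
const THROTTLE_START = 0.5; // assumed from the "<50%: 0s" table row

/**
 * spendFraction: today's spend divided by the daily budget (0.8 means 80%).
 * rateRatio: current cost rate divided by the target rate (2 means "2x").
 */
function throttleDelaySeconds(spendFraction: number, rateRatio: number): number {
  // Assumed: pressure is 0 below the start threshold and clamps at 1.
  const pressure = Math.min(
    1,
    Math.max(0, (spendFraction - THROTTLE_START) / (1 - THROTTLE_START)),
  );
  // Assumed: overshoot measures how far the rate exceeds the target, floored at 0.
  const overshoot = Math.max(0, rateRatio - 1);
  return MAX_DELAY_S * pressure * pressure * Math.tanh(overshoot / 3);
}
```

Under these assumptions the sketch reproduces the 2× rows to within rounding: `throttleDelaySeconds(0.6, 2)` is about 0.77, `throttleDelaySeconds(0.7, 2)` about 3.09, and `throttleDelaySeconds(0.8, 2)` about 6.94. The 80%/5× and 95%/3× rows come out lower (roughly 18.8s and 28.3s), so the gateway's actual inputs evidently differ in some way this sketch does not capture.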

## Configuration

- **UI**: Set/disable budget from the Costs page (`/ui/costs`)
- **Env var**: `LORE_DAILY_BUDGET=10.00` (overrides DB value)
- **Resolution**: env var > DB > 0 (disabled)
- **Default**: 0 (disabled, zero overhead on hot path)
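
The resolution order can be illustrated with a small sketch. The function names and the handling of malformed values are assumptions, not the gateway's code. In particular, this sketch treats an unparsable or negative env value as unset and lets an explicit `LORE_DAILY_BUDGET=0` disable the budget even when the DB holds a value.

```typescript
// Hypothetical sketch of "env var > DB > 0 (disabled)".

/** Parse a raw env string; returns null when it should be treated as unset. */
function parseBudget(raw: string | undefined): number | null {
  if (raw === undefined || raw.trim() === "") return null;
  const n = Number(raw);
  // Assumed: unparsable or negative values fall through to the next source.
  return Number.isFinite(n) && n >= 0 ? n : null;
}

/** Resolve the daily budget in USD. A result of 0 means throttling is disabled. */
function resolveDailyBudget(
  env: Record<string, string | undefined>,
  dbValue: number | null,
): number {
  const fromEnv = parseBudget(env["LORE_DAILY_BUDGET"]);
  if (fromEnv !== null) return fromEnv;
  if (dbValue !== null && Number.isFinite(dbValue) && dbValue > 0) return dbValue;
  return 0;
}
```

Passing the environment in as a plain record (for example `process.env` in Node.js) keeps the function easy to test without mutating global state.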

## Dashboard

- **Global**: Budget progress bar with spend/budget ratio, current rate,
throttle event count
- **Per-session**: Throttle event count + total delay when throttled

## What's NOT Throttled

- Meta requests (title generation, etc.) — cheap
- Compaction requests — intercepted locally
- Worker calls — their cost IS counted, but they're not delayed (have
their own rate limiting)

## Files Changed

| File | Changes |
|---|---|
| `packages/gateway/src/cost-tracker.ts` | Daily spend accumulator,
cost-rate EMA, throttle curve, DB-backed budget, pre-request cost
estimator |
| `packages/gateway/src/pipeline.ts` | Throttle interception before
`forwardToUpstream()` with cache TTL safety cap |
| `packages/gateway/src/server.ts` | Bootstrap daily spend from DB on
startup |
| `packages/gateway/src/ui.ts` | Budget settings form, progress bar,
throttle diagnostics |
| `packages/gateway/test/budget-throttle.test.ts` | 28 tests covering
all throttle zones, monotonicity, smoothness, EMA, accumulation |

No new files except the test file. No DB schema changes. No core
package changes.
Burak Yigit Kaya committed
9afaaae168a32c45972af80af0f908a4115cdcbd
Parent: b2e27da
Committed by GitHub <noreply@github.com> on 5/24/2026, 5:41:58 PM