feat(versioning): time-based retention via Celery beat (FR-007)
Adds a scheduled Celery task that prunes version history older than ``SUPERSET_VERSION_HISTORY_RETENTION_DAYS`` (default 30; settable via env var; ``0`` disables retention entirely).

**Task**: ``superset.tasks.version_history_retention.prune_old_versions``

1. Computes ``cutoff = utcnow() - timedelta(days=N)``.
2. Selects ``version_transaction.id`` rows with ``issued_at < cutoff``, excluding any transaction whose parent shadow includes a live row (``end_transaction_id IS NULL``). The live row is the only preservation rule: closed historical rows, including the baseline (``operation_type=0``), age out. A per-entity minimum-history floor is an open question tracked in ``future-work.md``.
3. Deletes the rows owned by the selected transactions in each parent shadow table (``dashboards_version`` / ``slices_version`` / ``tables_version``).
4. Deletes child-shadow rows for the same transactions (``table_columns_version`` / ``sql_metrics_version`` / ``dashboard_slices_version``).
5. Deletes the selected ``version_transaction`` rows. The ``version_changes`` rows cascade via the FK from the previous commit.

The task is idempotent and can be safely retried after a partial failure.

**Schedule**: ``superset/config.py`` adds the task to the default ``CeleryConfig.beat_schedule`` (nightly at 03:00). Operators who override ``CeleryConfig`` in their ``superset_config.py`` need to merge this entry; see UPDATING.md.

Also adds ``"expose_headers": ["ETag"]`` to the default ``CORS_OPTIONS`` so cross-origin browser clients can read the ``ETag`` header. (Co-located here because both changes touch ``superset/config.py``; the ETag mechanism itself ships in the next commit.)

**Auto-discovery**: ``superset/tasks/celery_app.py`` adds ``version_history_retention`` to its late imports so Celery's auto-discovery picks up the task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
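
For operators merging the schedule entry by hand, the cutoff rule and the merge step might look roughly like the sketch below. This is an illustration under assumptions, not the code in this commit: ``retention_cutoff``, ``merge_beat_schedule``, and the schedule key ``version_history.prune_old_versions`` are hypothetical names, the registered task name is assumed to equal the dotted path given above, and the plain dict under ``schedule`` stands in for ``celery.schedules.crontab(minute=0, hour=3)`` so the snippet runs without Celery installed.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from typing import Any


def retention_cutoff(retention_days: int, now: datetime | None = None) -> datetime | None:
    """Return the prune cutoff, or None when retention is disabled.

    The commit message only states that 0 disables retention; treating
    negative values the same way is an assumption made for safety.
    """
    if retention_days <= 0:
        return None
    now = now or datetime.now(timezone.utc)
    return now - timedelta(days=retention_days)


# Hypothetical schedule key. The task path is the one named in the commit
# message; that it is also the registered Celery task name is an assumption.
RETENTION_BEAT_ENTRY: dict[str, Any] = {
    "version_history.prune_old_versions": {
        "task": "superset.tasks.version_history_retention.prune_old_versions",
        # Stand-in for celery.schedules.crontab(minute=0, hour=3).
        "schedule": {"hour": 3, "minute": 0},
    }
}


def merge_beat_schedule(existing: dict[str, Any], extra: dict[str, Any]) -> dict[str, Any]:
    """Copy `existing` and add entries from `extra` without overwriting any key."""
    merged = dict(existing)
    for key, entry in extra.items():
        merged.setdefault(key, entry)
    return merged
```

In a custom ``superset_config.py``, the merged dict would be assigned to the overriding ``CeleryConfig.beat_schedule``; UPDATING.md remains the authoritative source for the exact entry.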
Mike Bridge committed
801d58687b160003962907cfb665ac3be0ab89eb
Parent: 8fe9a8c