Scrub stale reverse edges on DiskANN delete (data leak fix)

After deleting a node, its rowid and quantized vector remained in
other nodes' neighbor blobs via unidirectional reverse edges. This
is a data leak: the deleted vector's compressed representation was
still readable in shadow tables.

Fix: after deleting the node and repairing forward edges, scan all
remaining nodes and clear any neighbor slot that references the
deleted rowid. Uses a lightweight two-pass approach: first scan
reads only validity + neighbor_ids to find affected nodes, then
does full read/clear/write only for those nodes.
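
The two-pass scrub described above can be sketched as follows. This is
an illustrative in-memory model, not the committed code: the Node
class, its field names, and scrub_reverse_edges are hypothetical
stand-ins for the shadow-table blobs, and zeroing a cleared slot is an
assumption about what "clear" means here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Node:
    # Hypothetical stand-in for one node's neighbor blobs.
    validity: List[bool]         # True if slot i is in use
    neighbor_ids: List[int]      # rowid stored in slot i
    neighbor_qvecs: List[bytes]  # quantized vector stored in slot i


def scrub_reverse_edges(nodes: Dict[int, Node], deleted_rowid: int) -> int:
    """Clear every neighbor slot that still references deleted_rowid.

    Returns the number of nodes that had to be rewritten.
    """
    # Pass 1: look only at validity + neighbor_ids to find affected nodes.
    affected = [
        rowid
        for rowid, node in nodes.items()
        if any(
            valid and nid == deleted_rowid
            for valid, nid in zip(node.validity, node.neighbor_ids)
        )
    ]

    # Pass 2: full read/clear/write, only for the affected nodes.
    for rowid in affected:
        node = nodes[rowid]
        for i, nid in enumerate(node.neighbor_ids):
            if node.validity[i] and nid == deleted_rowid:
                node.validity[i] = False
                node.neighbor_ids[i] = 0
                # Assumption: overwrite the quantized vector with zeros
                # so the deleted data is no longer readable.
                node.neighbor_qvecs[i] = bytes(len(node.neighbor_qvecs[i]))
    return len(affected)
```

In this model the quantized vectors are touched only in pass 2, so
nodes without a stale edge cost one cheap id comparison per slot.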

Tradeoff: the O(N) scan on each delete adds ~1ms per deleted row at
10k vectors and ~10ms per deleted row at 100k. Recall and query
latency are unaffected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Alex Garcia committed
01b4b2a965b7471831d390d38d475595a9acde34
Parent: c36a995