perf: remove sparse_ngram_index entirely

The sparse n-gram index was a redundant search-acceleration layer that
duplicated ~70% of trigram_index's recall surface on niche fuzzy queries.
Tier 3 (skip_trigram_files scan) + tier 5 (full outline scan when
!trigram_ruled_out) cover the same ground.

Removed:
  • field on Explorer struct
  • init/deinit in init/deinit/releaseSecondaryIndexes
  • indexFile/removeFile call sites in commit + rebuildTrigrams + removeFile
  • tier 2 candidate scan in searchContent
  • approxIndexSizeBytes contribution in telemetry
  • adversarial test for index population (tests removed behavior)
  • test_index.zig regression test for sparse/trigram intersection

End-to-end measurement on codedb's own repo (284 files, 4 MB snapshot):
  codedb_status    4.1µs →  2.5µs   −39%  (previously at "floor")
  codedb_edit      3.5µs →  2.3µs   −34%
  codedb_tree     16.1µs → 10.5µs   −35%

Memory stays ~13 MB on this small corpus; savings show up on
high-lexical-diversity workloads (large monorepos) where the sparse
n-gram hashmap would otherwise grow large.

633/633 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
justrach committed
e9fca7cd0612e214e4ea1366635aefe2d96e39c0
Parent: 2bb8508