SIGN IN SIGN UP

vocab : fix HybridDNA tokenizer (#23466)

* vocab : mark hybriddna k-mers to avoid BPE token collisions

* improved loop

---------

Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
K
Kashif Rasul committed
afcda09d154a285cd366135f98ffc1d357f7ddbd
Parent: bbce619
Committed by GitHub <noreply@github.com> on 5/22/2026, 9:17:31 AM