COMMITS
tests/test-tokenizer-random.py
July 14, 2024
llama : fix pre-tokenization of non-special added tokens (#8228)
compilade committed
July 7, 2024
py : type-check all Python scripts with Pyright (#8341)
compilade committed
July 5, 2024
Detokenizer fixes (#8039)
jaime-m-p committed
June 18, 2024
tokenizer : BPE fixes (#7530)
jaime-m-p committed
June 4, 2024
Per token attributes (#7685)
jaime-m-p committed
May 28, 2024
Tokenizer WPM fixes (#7500)
jaime-m-p committed
May 21, 2024
Tokenizer SPM fixes for phi-3 and llama-spm (bugfix) (#7425)
jaime-m-p committed
May 20, 2024
Tokenizer SPM fixes for phi-3 and llama-spm (#7375)
jaime-m-p committed
May 17, 2024
Unicode codepoint flags for custom regexs (#7245)
jaime-m-p committed
May 9, 2024
llama3 custom regex split (#6965)
jaime-m-p committed