COMMITS
December 27, 2024
Fix README.md typo
Stephen Panaro committed
Add M3 Max to README.md
Stephen Panaro committed
November 9, 2024
Add Llama 3.2 3B tokenizer
Stephen Panaro committed
October 13, 2024
Support faster chunked logit processing
Stephen Panaro committed
October 12, 2024
Better error when pipeline chunks aren't found
Stephen Panaro committed
Support for Llama-3.2-1B-Instruct
Stephen Panaro committed
October 6, 2024
Support prompts longer than the input length
Stephen Panaro committed
Update M1 Max performance
Stephen Panaro committed
Only update the KV cache once per 64 tokens
Stephen Panaro committed
August 15, 2024
Fix footnote in README
Stephen Panaro committed
Exclude sequoia model
Stephen Panaro committed
Port over stale model checking from sequoia branch
Stephen Panaro committed
Update README to document faster convolution trick
Stephen Panaro committed
July 12, 2024
Add M2 Pro timings to the README
Stephen Panaro committed
May 31, 2024
Add more M series stats
Stephen Panaro committed
Switch to non-gated tokenizer
Stephen Panaro committed
May 29, 2024
Add M3 stats to readme
Stephen Panaro committed
May 28, 2024
update readme
Stephen Panaro committed
May 25, 2024
initial commit
Stephen Panaro committed