COMMITS
tools/batched-bench/batched-bench.cpp
March 31, 2026
common : move up common_init() and fix Windows UTF-8 logs (#21176)
Adrien Gallouët committed
March 4, 2026
Fix locale-dependent float printing in GGUF metadata (#17331)
SamareshSingh committed
December 22, 2025
tool/ex/tests: consistently free ctx, then model (#18168)
Johannes Gäßler committed
November 10, 2025
batched-bench : add "separate text gen" mode (#17103)
Georgi Gerganov committed
November 1, 2025
scripts : add script to bench models (#16894)
Georgi Gerganov committed
September 8, 2025
batched-bench : fix llama_synchronize usage during prompt processing (#15835)
Georgi Gerganov committed
August 30, 2025
llama: use FA + max. GPU layers by default (#15434)
Johannes Gäßler committed
August 26, 2025
metal : optimize FA vec for large sequences and BS <= 8 (#15566)
Georgi Gerganov committed
August 25, 2025
batched-bench : fix unified KV cache handling + pp timing (#15562)
Georgi Gerganov committed
August 19, 2025
batched-bench : use rand tokens (#15398)
Georgi Gerganov committed
July 16, 2025
llama : add high-throughput mode (#14363)
Georgi Gerganov committed
June 6, 2025
llama : deprecate llama_kv_self_ API (#14030)
Georgi Gerganov committed
May 13, 2025
batched-bench : fix pp batch contents (#13492)
Georgi Gerganov committed
May 2, 2025
llama : move end-user examples to tools directory (#13249)
Diego Devesa committed