COMMITS / examples/parallel/parallel.cpp
March 31, 2026
common : move up common_init() and fix Windows UTF-8 logs (#21176)
Adrien Gallouët committed
March 4, 2026
Fix locale-dependent float printing in GGUF metadata (#17331)
SamareshSingh committed
December 14, 2025
common : refactor common_sampler + grammar logic changes (#17937)
Georgi Gerganov committed
July 18, 2025
parallel : add option for different RNG seeds (#14757)
Georgi Gerganov committed
July 16, 2025
llama : add high-throughput mode (#14363)
Georgi Gerganov committed
June 6, 2025
llama : deprecate llama_kv_self_ API (#14030)
Georgi Gerganov committed
June 1, 2025
parallel : fix n_junk == 0 (#13952)
Georgi Gerganov committed
May 31, 2025
llama : auto-batch preparation (#13845)
Georgi Gerganov committed
kv-cache : refactor + add llama_memory_state_i (#13746)
Georgi Gerganov committed
May 30, 2025
parallel : increase the variability of the prompt lengths (#13927)
Georgi Gerganov committed
May 20, 2025
llama : remove llama_kv_cache_view API + remove deprecated (#13653)
Georgi Gerganov committed
May 17, 2025
parallel : add option for non-shared and larger prompts (#13598)
Georgi Gerganov committed
April 2, 2025
llama : refactor kv cache guard (#12695)
Georgi Gerganov committed
April 1, 2025
common : refactor downloading system, handle mmproj with -hf option (#12694)
Xuan-Son Nguyen committed
March 13, 2025
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
Georgi Gerganov committed
March 4, 2025
ggml : portability fixes for VS 2017 (#12150)
mgroeber9110 committed
January 12, 2025
llama : add `llama_vocab`, functions -> methods, naming (#11110)
Georgi Gerganov committed
January 3, 2025
llama : refactor `src/llama.cpp` (#10902)
Georgi Gerganov committed
November 25, 2024
speculative : refactor and add a simpler example (#10362)
Georgi Gerganov committed
October 18, 2024
llama : remove all_pos_0, all_pos_1, all_seq_id from llama_batch (#9745)
Xuan Son Nguyen committed
October 10, 2024
common : use common_ prefix for common library functions (#9805)
Diego Devesa committed
September 15, 2024
common : reimplement logging (#9418)
Georgi Gerganov committed
September 13, 2024
llama : llama_perf + option to disable timings during decode (#9355)
Georgi Gerganov committed
September 9, 2024
common : move arg parser code to `arg.cpp` (#9388)
Xuan Son Nguyen committed
September 7, 2024
common : refactor arg parser (#9308)
Xuan Son Nguyen committed
llama : refactor sampling v2 (#9294)
Georgi Gerganov committed
August 5, 2024
common : Changed tuple to struct (TODO fix) (#8823)
Liu Jia committed
June 4, 2024
common : refactor cli arg parsing (#7675)
Georgi Gerganov committed
May 22, 2024
common : normalize naming style (#7462)
Georgi Gerganov committed
April 21, 2024
llama : support Llama 3 HF conversion (#6745)
Pedro Cuenca committed
March 26, 2024
llama : greatly reduce output buffer memory usage (#6122)
compilade committed
March 8, 2024
llama : support Mamba Selective State Space Models (#5328)
compilade committed
February 16, 2024
ggml : add numa options (#5377)
bmwl committed
November 23, 2023
llama : KV cache view API + better KV cache management (#4170)
Georgi Gerganov committed
examples : fix typo in parallel example doc comment (#4181)
Daniel Bevenius committed
November 2, 2023
build : link against build info instead of compiling against it (#3879)
cebtenzzre committed
October 23, 2023
llama : remove token functions with `context` args in favor of `model` (#3720)
Marcus Dunn committed
October 20, 2023
sampling : refactor init to use llama_sampling_params (#3696)
Georgi Gerganov committed
October 18, 2023
speculative : add tree-based sampling example (#3624)
Georgi Gerganov committed
October 11, 2023
common : fix mirostat state when using multiple sequences (#3543)
Kerfuffle committed
October 9, 2023
refact : fix convert script + zero out KV cache to avoid nans (#3523)
Georgi Gerganov committed
October 6, 2023
parallel : add option to load external prompt file (#3416)
pudepiedj committed
October 3, 2023
llama : fix session saving/loading (#3400)
Georgi Gerganov committed
September 28, 2023
llama : custom attention mask + parallel decoding + no context swaps (#3228)
Georgi Gerganov committed