COMMITS
common/common.cpp
April 2, 2026
tests: allow exporting graph ops from HF file without downloading weights (#21182)
Ruben Ortlam committed
March 31, 2026
common : move up common_init() and fix Windows UTF-8 logs (#21176)
Adrien Gallouët committed
March 28, 2026
fix **/x glob matching (#21129)
Sigbjørn Skjæret committed
common : add character class support to glob_match (#21111)
Sigbjørn Skjæret committed
cli : add /glob command (#21084)
Sigbjørn Skjæret committed
March 18, 2026
llama : re-enable manual LoRA adapter free (#19983)
Pop Flamingo committed
March 6, 2026
Autoparser - complete refactoring of parser architecture (#18675)
Piotr Wilkin (ilintar) committed
February 23, 2026
llama : remove write/read of output ids/logits/embeddings (#18862)
Daniel Bevenius committed
February 18, 2026
common : make small string helpers as inline functions (#19693)
Adrien Gallouët committed
February 14, 2026
NetBSD build support (#19589)
iMil committed
llama : update LoRA API. + fix excessive graph reserves (#19280)
agent-enemy-2 committed
February 12, 2026
common : replace deprecated codecvt using parse_utf8_codepoint (#19517)
Adrien Gallouët committed
February 11, 2026
common : remove unused token util functions (#19506)
Daniel Bevenius committed
January 28, 2026
spec : add self‑speculative decoding (no draft model required) + refactor (#18471)
Sascha Rogmann committed
January 15, 2026
context : reserve new scheduler when graph topology changes (#18547)
Georgi Gerganov committed
January 8, 2026
llama-fit-params: free memory target per device (#18679)
Johannes Gäßler committed
llama : add `use_direct_io` flag for model loading (#18166)
Julius Tischbein committed
January 4, 2026
sampling : add support for backend sampling (#17004)
Daniel Bevenius committed
December 30, 2025
lora: count lora nodes in graph_max_nodes (#18469)
Xuan-Son Nguyen committed
December 29, 2025
common: fix return value check for setpriority (#18412)
o7si committed
December 27, 2025
llama: fix magic number of 999 for GPU layers (#18266)
Johannes Gäßler committed
December 22, 2025
tool/ex/tests: consistently free ctx, then model (#18168)
Johannes Gäßler committed
December 17, 2025
common: clarify instructions for bug reports (#18134)
Johannes Gäßler committed
December 15, 2025
December 14, 2025
common : refactor common_sampler + grammar logic changes (#17937)
Georgi Gerganov committed
December 7, 2025
common : change --color to accept on/off/auto, default to auto (#17827)
Sigbjørn Skjæret committed
December 4, 2025
common: use native MultiByteToWideChar (#17738)
Adrien Gallouët committed
December 3, 2025
ggml webgpu: add support for emscripten builds (#17184)
Reese Levine committed
December 2, 2025
server: add --media-path for local media files (#17697)
Xuan-Son Nguyen committed
December 1, 2025
server: introduce API for serving / loading / unloading multiple models (#17470)
Xuan-Son Nguyen committed
November 25, 2025
llama: introduce support for model-embedded sampling parameters (#17120)
Aaron Teo committed
November 20, 2025
common : more accurate sampling timing (#17382)
Georgi Gerganov committed
November 14, 2025
mtmd: add mtmd_log_set (#17268)
Xuan-Son Nguyen committed
November 8, 2025
arg: add --cache-list argument to list cached models (#17073)
Xuan-Son Nguyen committed
October 6, 2025
llama : add --no-host to disable host buffers (#16310)
Gadflyii committed
September 26, 2025
devops: add s390x & ppc64le CI (#15925)
Aaron Teo committed
September 25, 2025
llama : add support for qwen3 reranker (#15824)
Douglas Hanley committed
September 24, 2025
common : add missing chrono header for common.cpp (#16211)
Uilian Ries committed
August 30, 2025
llama: use FA + max. GPU layers by default (#15434)
Johannes Gäßler committed
August 28, 2025
model : jina-embeddings-v3 support (#13693)
Sigbjørn Skjæret committed
August 22, 2025
llama : remove KV cache defragmentation logic (#15473)
Georgi Gerganov committed
August 21, 2025
common : fix incorrect print of non-ascii characters in the logging (#15466)
Jie Fu (傅杰) committed
August 14, 2025
finetune: SGD optimizer, more CLI args (#13873)
Jonathan Graehl committed
July 31, 2025
llama : allow other bufts when overriding to CPU, add --no-repack option (#14990)
Diego Devesa committed
July 19, 2025
imatrix : use GGUF to store importance matrices (#9400)
compilade committed
July 16, 2025
llama : add high-throughput mode (#14363)
Georgi Gerganov committed
server : pre-calculate EOG logit biases (#14721)
Georgi Gerganov committed
June 20, 2025
vocab : prevent tokenizer overflow (#14301)
Ruikai Peng committed
June 19, 2025
build : suppress gcc15 compile warnings (#14261)
fanyang committed