COMMITS
common/arg.cpp
April 3, 2026
server: save and clear idle slots on new task (`--clear-idle`) (#20993)
Yes You Can Have Your Own committed
April 2, 2026
tests: allow exporting graph ops from HF file without downloading weights (#21182)
Ruben Ortlam committed
March 28, 2026
server : add custom socket options to disable SO_REUSEPORT (#21056)
Adrien Gallouët committed
March 27, 2026
server: remove the verbose_prompt parameter (#21059)
AN Long committed
server: add built-in tools backend support (#20898)
Xuan-Son Nguyen committed
March 25, 2026
common : fix verbosity setup (#20989)
Adrien Gallouët committed
March 24, 2026
common : add standard Hugging Face cache support (#20775)
Adrien Gallouët committed
March 21, 2026
misc : prefer ggml-org models in docs and examples (#20827)
ddh0 committed
March 19, 2026
common/parser: add proper reasoning tag prefill reading (#20424)
Piotr Wilkin (ilintar) committed
common : add LLAMA_ARG_SPEC_TYPE (#20744)
ddh0 committed
March 17, 2026
common/parser: add `--skip-chat-parsing` to force a pure content parser. (#20289)
Piotr Wilkin (ilintar) committed
March 12, 2026
common : update completion executables list [no ci] (#19934)
Daniel Bevenius committed
March 11, 2026
common/parser: handle reasoning budget (#20297)
Piotr Wilkin (ilintar) committed
March 10, 2026
common : fix incorrect uses of stoul (#20313)
Sigbjørn Skjæret committed
March 8, 2026
llama: end-to-end tests (#19802)
Johannes Gäßler committed
March 6, 2026
Checkpoint every n tokens: squash (#20087)
Piotr Wilkin (ilintar) committed
webui: Agentic Loop + MCP Client with support for Tools, Resources and Prompts (#18655)
Aleksander Grygier committed
March 5, 2026
chore : correct typos [no ci] (#20041)
Marcel Petrick committed
February 27, 2026
February 25, 2026
common : add more aliases for sampler CLI params (#19797)
ddh0 committed
February 12, 2026
args : add -kvu to llama-parallel (#19577)
Georgi Gerganov committed
February 9, 2026
spec : remove check rate (#19377)
Sascha Rogmann committed
January 30, 2026
spec : add ngram-mod (#19164)
Georgi Gerganov committed
January 29, 2026
arg : add -kvu to llama-batched-bench (#19172)
Georgi Gerganov committed
January 28, 2026
spec : add self‑speculative decoding (no draft model required) + refactor (#18471)
Sascha Rogmann committed
cuda : fix "V is K view" check for non-unified KV cache (#19145)
Georgi Gerganov committed
llama : disable Direct IO by default (#19109)
Georgi Gerganov committed
January 25, 2026
common : use two decimal places for float arg help messages (#19048)
Daniel Bevenius committed
January 24, 2026
llama-fit-params: keep explicit --ctx-size 0 (#19070)
Johannes Gäßler committed
January 15, 2026
llama : add adaptive-p sampler (#17927)
ddh0 committed
January 14, 2026
refactor : remove libcurl, use OpenSSL when available (#18828)
Adrien Gallouët committed
January 12, 2026
server : add arg for disabling prompt caching (#18776)
Radoslav Gerganov committed
examples : add --kv-unified to batched example (#18774)
Daniel Bevenius committed
January 10, 2026
preset: allow named remote preset (#18728)
Xuan-Son Nguyen committed
common : add --license to display embedded licenses (#18696)
Adrien Gallouët committed
January 8, 2026
common: support remote preset (#18520)
Xuan-Son Nguyen committed
llama-fit-params: free memory target per device (#18679)
Johannes Gäßler committed
llama : add `use_direct_io` flag for model loading (#18166)
Julius Tischbein committed
January 7, 2026
tools : remove llama-run (#18661)
Adrien Gallouët committed
examples : add debug utility/example (#18464)
Daniel Bevenius committed
January 6, 2026
arg: use CSV escape style for multiple-value args (#18643)
Xuan-Son Nguyen committed
January 4, 2026
sampling : add support for backend sampling (#17004)
Daniel Bevenius committed
December 28, 2025
rpc: fix segfault on invalid endpoint format (#18387)
o7si committed
December 27, 2025
llama: fix magic number of 999 for GPU layers (#18266)
Johannes Gäßler committed
December 24, 2025
server: (router) add stop-timeout option (#18350)
Xuan-Son Nguyen committed
December 21, 2025
server: add auto-sleep after N seconds of idle (#18228)
Xuan-Son Nguyen committed
December 20, 2025
server: support load model on startup, support preset-only options (#18206)
Xuan-Son Nguyen committed
December 19, 2025
arg: fix order to use short form before long form (#18196)
Pascal committed