COMMITS
common/arg.cpp
June 1, 2026
speculative : fix n_outputs_max and remove draft-simple auto-enable (#23988)
Georgi Gerganov committed
May 29, 2026
download: add option to skip_download (#23059)
Xuan-Son Nguyen committed
app : move licences to llama-app (#23824)
Adrien Gallouët committed
May 28, 2026
arg: Add LLAMA_ARG_API_KEY_FILE environment variable for --api-key-file (#23167)
Mikolaj Kucharski committed
May 27, 2026
common : fix env names to all have LLAMA_ARG_ prefix (#23778)
Georgi Gerganov committed
May 25, 2026
server: fix checkpoints creation (#22929)
jacekpoplawski committed
May 20, 2026
Move to backend sampling for MTP draft path (#23287)
Gaurav Garg committed
May 19, 2026
common: fix --help for --verbosity (#23278)
Johannes Gäßler committed
llama : MTP clean-up (#23269)
Georgi Gerganov committed
May 18, 2026
common : remove hf cache migration (#23266)
Adrien Gallouët committed
May 17, 2026
server : honor --embd-normalize CLI arg (#23125)
Rares Vernica committed
May 16, 2026
llama + spec: MTP Support (#22673)
Aman Gupta committed
ui: Restructure repo to use `tools/ui` folder and `ui` / `UI` / `llama-ui` / `LLAMA_UI` naming (#23064)
Aleksander Grygier committed
May 14, 2026
logs : reduce (#23021)
Georgi Gerganov committed
May 13, 2026
download: do not exit() on error (#23008)
Xuan-Son Nguyen committed
spec : update CLI arguments for better consistency (#22964)
Georgi Gerganov committed
May 12, 2026
mtmd, server, common: expose modalities to /v1/models (#22952)
Xuan-Son Nguyen committed
May 11, 2026
spec : parallel drafting support (#22838)
Georgi Gerganov committed
May 5, 2026
common : fix missing-noreturn warnings when compiling with clang 21 (#22702)
Adrien Gallouët committed
common : only load backends when required (#22290)
Adrien Gallouët committed
May 4, 2026
examples: refactor diffusion generation (#22590)
Shakhnazar Sailaukan committed
server: Add a simple get_datetime server tool (#22649)
Evan Huus committed
docs : update speculative decoding parameters after refactor (#22397) (#22539)
Georgi Gerganov committed
April 30, 2026
spec: fix argument typo (#22552)
Ben Guidarelli committed
April 28, 2026
spec : refactor params (#22397)
Georgi Gerganov committed
April 22, 2026
common: Refactoring sampler parameters (#20429) (#22233)
Ethan Turner committed
April 21, 2026
arg : add --spec-default (#22223)
Georgi Gerganov committed
fit-params : refactor + add option to output estimated memory per device (#22171)
Georgi Gerganov committed
April 20, 2026
server : refactor "use checkpoint" logic (#22114)
Georgi Gerganov committed
server: rename --clear-idle to --cache-idle-slots (#21741)
Yes You Can Have Your Own committed
April 17, 2026
libs : rename libcommon -> libllama-common (#21936)
Georgi Gerganov committed
April 10, 2026
common : add callback interface for download progress (#21735)
Adrien Gallouët committed
common: mark --split-mode tensor as experimental (#21684)
Johannes Gäßler committed
April 9, 2026
ggml: backend-agnostic tensor parallelism (experimental) (#19378)
Johannes Gäßler committed
April 3, 2026
server: save and clear idle slots on new task (`--clear-idle`) (#20993)
Yes You Can Have Your Own committed
April 2, 2026
tests: allow exporting graph ops from HF file without downloading weights (#21182)
Ruben Ortlam committed
March 28, 2026
server : add custom socket options to disable SO_REUSEPORT (#21056)
Adrien Gallouët committed
March 27, 2026
server: remove the verbose_prompt parameter (#21059)
AN Long committed
server: add built-in tools backend support (#20898)
Xuan-Son Nguyen committed
March 25, 2026
common : fix verbosity setup (#20989)
Adrien Gallouët committed
March 24, 2026
common : add standard Hugging Face cache support (#20775)
Adrien Gallouët committed
March 21, 2026
misc : prefer ggml-org models in docs and examples (#20827)
ddh0 committed
March 19, 2026
common/parser: add proper reasoning tag prefill reading (#20424)
Piotr Wilkin (ilintar) committed
common : add LLAMA_ARG_SPEC_TYPE (#20744)
ddh0 committed
March 17, 2026
common/parser: add `--skip-chat-parsing` to force a pure content parser. (#20289)
Piotr Wilkin (ilintar) committed
March 12, 2026
common : update completion executables list [no ci] (#19934)
Daniel Bevenius committed
March 11, 2026
common/parser: handle reasoning budget (#20297)
Piotr Wilkin (ilintar) committed
March 10, 2026
common : fix incorrect uses of stoul (#20313)
Sigbjørn Skjæret committed
March 8, 2026
llama: end-to-end tests (#19802)
Johannes Gäßler committed