COMMITS
tools/server/README.md
April 18, 2026
server: Expose `media_tag` on /props endpoint. (#22028)
Cetarthoriphros committed
April 3, 2026
server: save and clear idle slots on new task (`--clear-idle`) (#20993)
Yes You Can Have Your Own committed
March 28, 2026
Document custom default webui preferences in server README (#19771)
Woof Dog committed
server : add custom socket options to disable SO_REUSEPORT (#21056)
Adrien Gallouët committed
March 27, 2026
server: remove the verbose_prompt parameter (#21059)
AN Long committed
server: add built-in tools backend support (#20898)
Xuan-Son Nguyen committed
March 23, 2026
docs : rerun llama-gen-docs to include new CLI args (#20892)
Eric Zhang committed
March 22, 2026
server: allow router to report child instances sleep status (#20849)
Xuan-Son Nguyen committed
March 21, 2026
misc : prefer ggml-org models in docs and examples (#20827)
ddh0 committed
March 19, 2026
docs: Update server README to reflect PR #20297 (#20560)
Tomeamis committed
common/parser: add proper reasoning tag prefill reading (#20424)
Piotr Wilkin (ilintar) committed
February 27, 2026
February 26, 2026
server : fix typo in server README.md (#19900)
yggdrasil75 committed
February 12, 2026
server : fix typo in README.md for features list (#19510)
RichardScottOZ committed
January 25, 2026
common : use two decimal places for float arg help messages (#19048)
Daniel Bevenius committed
January 22, 2026
server : support preserving reasoning_content in assistant message (#18994)
Xuan-Son Nguyen committed
January 21, 2026
server: /v1/responses (partial) (#18486)
손희준 committed
January 15, 2026
llama : add adaptive-p sampler (#17927)
ddh0 committed
January 12, 2026
server: update docs for sleeping [no ci] (#18777)
Xuan-Son Nguyen committed
December 24, 2025
server: (router) add stop-timeout option (#18350)
Xuan-Son Nguyen committed
December 22, 2025
gen-docs: automatically update markdown file (#18294)
Xuan-Son Nguyen committed
server: (docs) remove mention about extra_args (#18262)
Xuan-Son Nguyen committed
December 21, 2025
server: add auto-sleep after N seconds of idle (#18228)
Xuan-Son Nguyen committed
December 20, 2025
server: support load model on startup, support preset-only options (#18206)
Xuan-Son Nguyen committed
December 19, 2025
arg: fix order to use short form before long form (#18196)
Pascal committed
presets: refactor, allow cascade presets from different sources, add global section (#18169)
Xuan-Son Nguyen committed
December 17, 2025
server: (webui) add --webui-config (#18028)
Pascal committed
December 16, 2025
arg: clarify auto kvu/np being set on server (#17997)
Xuan-Son Nguyen committed
server: Update README.md incorrect argument (#18073)
2114L3 committed
December 12, 2025
common: support negated args (#17919)
Xuan-Son Nguyen committed
arg: add -mm and -mmu as short form of --mmproj and --mmproj-url (#17958)
Xuan-Son Nguyen committed
December 10, 2025
server: add presets (config) when using multiple models (#17859)
Pascal committed
December 8, 2025
server : add development documentation (#17760)
Xuan-Son Nguyen committed
server : make cache_reuse configurable per request (#17858)
Georgi Gerganov committed
December 6, 2025
server: support multiple generations from one prompt (OAI "n" option) (#17775)
Xuan-Son Nguyen committed
December 1, 2025
server: introduce API for serving / loading / unloading multiple models (#17470)
Xuan-Son Nguyen committed
common: improve verbosity level definitions (#17630)
Xuan-Son Nguyen committed
November 28, 2025
server : add Anthropic Messages API support (#17570)
Fredrik Hultin committed
November 27, 2025
server: enable jinja by default, update docs (#17524)
Xuan-Son Nguyen committed
November 8, 2025
November 5, 2025
docs: Clarify the endpoint that webui uses (#17001)
손희준 committed
October 30, 2025
server : remove n_past (#16818)
Georgi Gerganov committed
October 8, 2025
October 7, 2025
server : add `/v1/health` endpoint (#16461)
Georgi Gerganov committed
October 6, 2025
server: update readme to mention n_past_max metric (#16436)
Oleksandr Kuvshynov committed
September 28, 2025
Fixed a few typos in the README of the LLaMA.cpp HTTP Server [no ci] (#16297)
Imad Saddik committed
September 27, 2025
server : remove old LLAMA_SERVER_SSL (#16290)
Adrien Gallouët committed
September 6, 2025
server : implement prompt processing progress report in stream mode (#15827)
Xuan-Son Nguyen committed
August 31, 2025
server : enable /slots by default and make it secure (#15630)
Georgi Gerganov committed
August 29, 2025
server : removed obsolete doc (#15670)
Sergey Alirzaev committed