COMMITS
/ docs/content/features/text-generation.md
May 26, 2026
May 21, 2026
feat(llama-cpp): make server-side prompt cache work by default (#9925)
LocalAI [bot] committed
May 7, 2026
feat(sglang): wire engine_args, add cuda13 build, ship MTP gallery demos (#9686)
Richard Palethorpe committed
May 6, 2026
fix: unbreak master CI (docs, kokoros, vibevoice-cpp ABI) (#9682)
LocalAI [bot] committed
May 5, 2026
feat(vllm, distributed): tensor parallel distributed workers (#9612)
Richard Palethorpe committed
April 28, 2026
feat(vllm): expose AsyncEngineArgs via generic engine_args YAML map (#9563)
Richard Palethorpe committed
April 25, 2026
feat(llama-cpp): expose split_mode option for multi-GPU placement (#9560)
Ettore Di Giacinto committed
April 14, 2026
feat(backend): add turboquant llama.cpp-fork backend (#9355)
Ettore Di Giacinto committed
April 12, 2026
feat(backends): add ik-llama-cpp (#9326)
Ettore Di Giacinto committed
April 3, 2026
fix(docs): fix broken references to distributed mode
Ettore Di Giacinto committed
March 5, 2026
feat: pass-by metadata to predict options (#8795)
Ettore Di Giacinto committed
January 24, 2026
chore(exllama): drop backend now almost deprecated (#8186)
Ettore Di Giacinto committed
January 20, 2026
chore(docs): update docs with Anthropic API and openresponses
Ettore Di Giacinto committed
December 15, 2025
chore(llama.cpp): Add Missing llama.cpp Options to gRPC Server (#7584)
Ettore Di Giacinto committed
November 19, 2025
feat: docs revamp (#7313)
Ettore Di Giacinto committed
January 18, 2024
docs/examples: enhancements (#1572)
Ettore Di Giacinto committed
November 22, 2023
docs: Initial import from localai-website (#1312)
Ettore Di Giacinto committed