COMMITS
core/backend/llm.go
May 30, 2026
feat: prefix-cache-aware routing for distributed mode (#10071)
LocalAI [bot] committed
May 25, 2026
feat(distributed): gated X-LocalAI-Node response header (middleware + wrapper) (#9976)
LocalAI [bot] committed
May 23, 2026
fix(traces): cap backend trace Data to keep admin UI responsive (#9960)
LocalAI [bot] committed
May 18, 2026
feat(gallery): verify backend OCI images with keyless cosign (#9823)
Richard Palethorpe committed
April 21, 2026
Respect explicit reasoning config during GGUF thinking probe (#9463)
leinasi2014 committed
April 18, 2026
fix(vision): propagate mtmd media marker from backend via ModelMetadata (#9412)
Ettore Di Giacinto committed
April 4, 2026
feat(autoparser): prefer chat deltas from backends when emitted (#9224)
Ettore Di Giacinto committed
March 29, 2026
feat: add distributed mode (#9124)
Ettore Di Giacinto committed
March 18, 2026
feat(ui): Per model backend logs and various fixes (#9028)
Richard Palethorpe committed
March 16, 2026
chore: refactor endpoints to use same inferencing path, add automatic retrial mechanism in case of errors (#9029)
Ettore Di Giacinto committed
March 13, 2026
feat(realtime): WebRTC support (#8790)
Richard Palethorpe committed
March 8, 2026
feat(functions): add peg-based parsing and allow backends to return tool calls directly (#8838)
Ettore Di Giacinto committed
March 5, 2026
feat: pass-by metadata to predict options (#8795)
Ettore Di Giacinto committed
February 20, 2026
feat(traces): Add backend traces (#8609)
Richard Palethorpe committed
January 22, 2026
feat: detect thinking support from backend automatically if not explicitly set (#8167)
Ettore Di Giacinto committed
December 21, 2025
chore(refactor): move logging to common package based on slog (#7668)
Ettore Di Giacinto committed
December 12, 2025
feat(loader): enhance single active backend to support LRU eviction (#7535)
Ettore Di Giacinto committed
November 16, 2025
feat: add support to logitbias and logprobs (#7283)
Ettore Di Giacinto committed
November 13, 2025
feat(ui): allow to cancel ops (#7264)
Ettore Di Giacinto committed
November 7, 2025
feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120)
Ettore Di Giacinto committed
August 28, 2025
fix: register backends to model-loader during installation (#6159)
Ettore Di Giacinto committed
August 14, 2025
feat(backends): add system backend, refactor (#6059)
Ettore Di Giacinto committed
June 29, 2025
fix(gallery): automatically install model from name (#5757)
Ettore Di Giacinto committed
June 27, 2025
feat(gallery): automatically install missing backends along models (#5736)
Ettore Di Giacinto committed
May 31, 2025
fix(streaming): stream complete runes (#5539)
Ettore Di Giacinto committed
May 25, 2025
feat: Realtime API support reboot (#5392)
Richard Palethorpe committed
April 1, 2025
feat(loader): enhance single active backend by treating as singleton (#5107)
Ettore Di Giacinto committed
March 2, 2025
feat: allow to specify a reply prefix (#4931)
Ettore Di Giacinto committed
February 10, 2025
feat: Centralized Request Processing middleware (#3847)
Dave committed
January 17, 2025
feat: add machine tag and inference timings (#4577)
mintyleaf committed
December 18, 2024
feat: stream tokens usage (#4415)
mintyleaf committed
December 8, 2024
Revert "feat: include tokens usage for streamed output" (#4336)
Ettore Di Giacinto committed
November 28, 2024
feat: include tokens usage for streamed output (#4282)
mintyleaf committed
November 8, 2024
chore(refactor): drop unnecessary code in loader (#4096)
Ettore Di Giacinto committed
October 17, 2024
feat(templates): extract text from multimodal requests (#3866)
Ettore Di Giacinto committed
October 2, 2024
feat: track internally started models by ID (#3693)
Ettore Di Giacinto committed
September 22, 2024
feat: auto load into memory on startup (#3627)
Sertaç Özercan committed
September 19, 2024
feat(api): allow to pass audios to backends (#3603)
Ettore Di Giacinto committed
feat(api): allow to pass videos to backends (#3601)
Ettore Di Giacinto committed
September 13, 2024
feat: extract output with regexes from LLMs (#3491)
Ettore Di Giacinto committed
August 24, 2024
feat: elevenlabs `sound-generation` api (#3355)
Dave committed
July 10, 2024
feat: HF `/scan` endpoint (#2566)
Dave committed
June 24, 2024
refactor: gallery inconsistencies (#2647)
Ettore Di Giacinto committed
June 23, 2024
chore: fix go.mod module (#2635)
Sertaç Özercan committed
June 13, 2024
feat(gallery): uniform download from CLI (#2559)
Ettore Di Giacinto committed
April 17, 2024
Revert #1963 (#2056)
Ettore Di Giacinto committed
April 15, 2024
feat(grpc): return consumed token count and update response accordingly (#2035)
Ettore Di Giacinto committed
April 13, 2024
April 11, 2024
feat: use tokenizer.apply_chat_template() in vLLM (#1990)
Ludovic Leroux committed
March 13, 2024
fix(config): set better defaults for inferencing (#1822)
Ettore Di Giacinto committed