Commits: backend/cpp/llama-cpp/grpc-server.cpp - mudler/LocalAI - Morph

mudler / LocalAI

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

COMMITS

/ backend/cpp/llama-cpp/grpc-server.cpp

master

May 21, 2026

feat(llama-cpp): make server-side prompt cache work by default (#9925)

LocalAI [bot] committed 1d ago

fix(llama-cpp): terminate tensor_buft_overrides with sentinel (#9919)

Richard Palethorpe committed 1d ago

May 14, 2026

feat(llama-cpp): expose 12 missing common_params via options[] (#9814)

LocalAI [bot] committed 8d ago

chore: :arrow_up: Update ggml-org/llama.cpp to `7f3f843c31cd32dc4adc10b393342dfee071c332` (#9809)

LocalAI [bot] committed 8d ago

May 12, 2026

feat(llama-cpp): bump to `1ec7ba0c`, adapt grpc-server, expose new spec-decoding options (#9765)

LocalAI [bot] committed 10d ago

April 30, 2026

feat(llama-cpp): bump to d775992 and adapt to spec params refactor (#9618)

Ettore Di Giacinto committed 22d ago

April 25, 2026

feat(llama-cpp): expose split_mode option for multi-GPU placement (#9560)

Ettore Di Giacinto committed 27d ago

April 23, 2026

fix(llama-cpp): include server-chat.cpp in grpc-server translation unit (#9511)

Ettore Di Giacinto committed 29d ago

April 18, 2026

fix(vision): propagate mtmd media marker from backend via ModelMetadata (#9412)

Ettore Di Giacinto committed 1mo ago

April 14, 2026

feat: wire transcription for llama.cpp, add streaming support (#9353)

Ettore Di Giacinto committed 1mo ago

April 10, 2026

fix(streaming): skip chat deltas for role-init elements to prevent first token duplication (#9299)

Ettore Di Giacinto committed 1mo ago

April 9, 2026

chore(llama.cpp): bump to 'd12cc3d1ca6bba741cd77887ac9c9ee18c8415c7' (#9282)

Ettore Di Giacinto committed 1mo ago

April 6, 2026

fix(chat): do not retry if we had chatdeltas or tooldeltas from backend (#9244)

Ettore Di Giacinto committed 1mo ago

April 5, 2026

feat(llama.cpp): wire speculative decoding settings (#9238)

Ettore Di Giacinto committed 1mo ago

April 4, 2026

fix(reasoning): suppress partial tag tokens during autoparser warm-up

Ettore Di Giacinto committed 1mo ago

April 3, 2026

fix(llama.cpp): correctly parse grpc header for bearer token auth

Ettore Di Giacinto committed 1mo ago

March 29, 2026

feat: add distributed mode (#9124)

Ettore Di Giacinto committed 1mo ago

March 21, 2026

feat: inferencing default, automatic tool parsing fallback and wire min_p (#9092)

Ettore Di Giacinto committed 2mo ago

March 20, 2026

chore(deps): bump llama-cpp to 'a0bbcdd9b6b83eeeda6f1216088f42c33d464e38' (#9079)

Ettore Di Giacinto committed 2mo ago

March 12, 2026

fix(llama-cpp): Set enable_thinking in the correct place (#8973)

Richard Palethorpe committed 2mo ago

March 8, 2026

feat(functions): add peg-based parsing and allow backends to return tool calls directly (#8838)

Ettore Di Giacinto committed 2mo ago

March 5, 2026

feat: pass-by metadata to predict options (#8795)

Ettore Di Giacinto committed 2mo ago

February 27, 2026

chore(deps): bump llama.cpp to 'ecbcb7ea9d3303097519723b264a8b5f1e977028' (#8672)

Ettore Di Giacinto committed 2mo ago

February 17, 2026

fix(llama-cpp): Pass parameters when using embedded template (#8590)

Richard Palethorpe committed 3mo ago

February 14, 2026

fix(llama-cpp): populate tensor_buft_override buffer so llama-cpp properly performs fit calculations (#8560)

Austen committed 3mo ago

January 28, 2026

chore(llama.cpp): bump to 'f6b533d898ce84bae8d9fa8dfc6697ac087800bf' (#8275)

Ettore Di Giacinto committed 3mo ago

January 22, 2026

feat: detect thinking support from backend automatically if not explicitly set (#8167)

Ettore Di Giacinto committed 3mo ago

January 20, 2026

chore(deps): Bump llama.cpp to '1c7cf94b22a9dc6b1d32422f72a627787a4783a3' (#8136)

Ettore Di Giacinto committed 4mo ago

January 9, 2026

chore(llama.cpp): propagate errors during model load (#7937)

Ettore Di Giacinto committed 4mo ago

chore(deps): Bump llama.cpp to '480160d47297df43b43746294963476fc0a6e10f' (#7933)

Ettore Di Giacinto committed 4mo ago

January 2, 2026

fix(llama.cpp/mmproj): fix loading mmproj in nested sub-dirs different from model path (#7832)

Ettore Di Giacinto committed 4mo ago

December 23, 2025

chore(deps): Bump llama.cpp to '5b6c9bc0f3c8f55598b9999b65aff7ce4119bc15' and refactor usage of base params (#7706)

Ettore Di Giacinto committed 4mo ago

December 22, 2025

chore(deps): bump llama.cpp to '0e1ccf15c7b6d05c720551b537857ecf6194d420' (#7684)

Ettore Di Giacinto committed 5mo ago

December 15, 2025

chore(llama.cpp): Add Missing llama.cpp Options to gRPC Server (#7584)

Ettore Di Giacinto committed 5mo ago

December 14, 2025

fix(7355): Update llama-cpp grpc for v3 interface (#7566)

Simon Redman committed 5mo ago

December 12, 2025

fix(llama.cpp): handle corner cases with tool array content (#7528)

Ettore Di Giacinto committed 5mo ago

December 9, 2025

chore(deps/llama-cpp): bump to '2fa51c19b028180b35d316e9ed06f5f0f7ada2c1' (#7484)

Ettore Di Giacinto committed 5mo ago

December 4, 2025

chore(deps): bump llama.cpp to 'bde188d60f58012ada0725c6dd5ba7c69fe4dd87' (#7434)

Ettore Di Giacinto committed 5mo ago

December 1, 2025

chore: :arrow_up: Update ggml-org/llama.cpp to `7f8ef50cce40e3e7e4526a3696cb45658190e69a` (#7402)

Ettore Di Giacinto committed 5mo ago

November 29, 2025

chore(deps): bump llama.cpp to 'd82b7a7c1d73c0674698d9601b1bbb0200933f29' (#7392)

Ettore Di Giacinto committed 5mo ago

November 26, 2025

chore(deps): bump llama.cpp to '583cb83416467e8abf9b37349dcf1f6a0083745a (#7358)

Ettore Di Giacinto committed 5mo ago

November 21, 2025

fix(llama.cpp): handle corner cases with tool content (#7324)

Ettore Di Giacinto committed 6mo ago

November 16, 2025

feat: add support to logitbias and logprobs (#7283)

Ettore Di Giacinto committed 6mo ago

November 14, 2025

fix: handle tool errors (#7271)

Ettore Di Giacinto committed 6mo ago

chore(deps): bump llama.cpp to `c4abcb2457217198efdd67d02675f5fddb7071c2` (#7266)

Ettore Di Giacinto committed 6mo ago

November 12, 2025

feat: import models via URI (#7245)

Ettore Di Giacinto committed 6mo ago

fix(reranker): llama-cpp sort score desc, crop top_n (#7211)

Mikhail Khludnev committed 6mo ago

November 9, 2025

feat: respect context and add request cancellation (#7187)

Ettore Di Giacinto committed 6mo ago

November 7, 2025

feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120)

Ettore Di Giacinto committed 6mo ago

November 2, 2025

feat(llama.cpp): allow to set cache-ram and ctx_shift (#7009)

Ettore Di Giacinto committed 6mo ago