mudler / LocalAI

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.


feat(llama-cpp): expose split_mode option for multi-GPU placement (#9560)

Adds split_mode (alias sm) to the llama.cpp backend options allowlist,
accepting none|layer|row|tensor. The tensor value targets the experimental
backend-agnostic tensor parallelism from ggml-org/llama.cpp#19378 and
requires a llama.cpp build that includes that PR, FlashAttention enabled,
KV-cache quantization disabled, and a manually set context size.
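
For illustration only, the allowlist check described above might look roughly like the Go sketch below. It is not the code from this commit: the helper name ParseSplitMode, the "key:value" option string format, and the error text are all assumptions made for the example. Only the option names (split_mode, sm) and the accepted values (none, layer, row, tensor) come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// validSplitModes lists the values named in the commit message.
var validSplitModes = map[string]bool{
	"none":   true,
	"layer":  true,
	"row":    true,
	"tensor": true,
}

// ParseSplitMode is a hypothetical helper (not part of LocalAI).
// It assumes a backend option arrives as a "key:value" string.
// It returns the normalized mode, whether the option was a split_mode
// option at all, and an error if the value is not in the allowlist.
func ParseSplitMode(opt string) (mode string, matched bool, err error) {
	key, value, found := strings.Cut(opt, ":")
	if !found {
		return "", false, nil
	}
	key = strings.TrimSpace(key)
	if key != "split_mode" && key != "sm" {
		return "", false, nil
	}
	value = strings.ToLower(strings.TrimSpace(value))
	if !validSplitModes[value] {
		return "", true, fmt.Errorf("invalid split_mode %q: want none, layer, row, or tensor", value)
	}
	return value, true, nil
}

func main() {
	for _, opt := range []string{"split_mode:row", "sm:tensor", "sm:bogus", "other:1"} {
		mode, matched, err := ParseSplitMode(opt)
		fmt.Printf("%-16s matched=%v mode=%q err=%v\n", opt, matched, mode, err)
	}
}
```

Rejecting unknown values up front, rather than passing them through, keeps a typo from silently falling back to a default placement. The extra requirements for tensor (build support, FlashAttention, unquantized KV cache, explicit context size) are not checked in this sketch.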


Assisted-by: Claude:claude-opus-4-7

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

Ettore Di Giacinto committed 1mo ago

21eace40ecc58a1dcd02f4cef4ecbcff0bf13480

Parent: 24505e5

Committed by GitHub <noreply@github.com> on 4/25/2026, 12:02:57 PM