mudler / LocalAI

LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.

fix(tests/e2e-backends): bump ctx_size for llama-cpp transcription

Qwen3-ASR-0.6B encodes the jfk.wav fixture into 777 audio tokens via
its mmproj, but the test harness defaulted BACKEND_TEST_CTX_SIZE to
512, so the llama.cpp server rejected every transcription request with
"request (777 tokens) exceeds the available context size (512 tokens)".

Set BACKEND_TEST_CTX_SIZE=2048 on the llama-cpp transcription target
only. The sherpa-onnx and vibevoice transcription targets don't go
through llama.cpp's slot/n_ctx and weren't failing.
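
The failure mode can be sketched in a few lines of Go. This is an illustration only, not the harness or llama.cpp code: `ctxSizeFromEnv` and `checkFits` are hypothetical helpers, and the exact comparison llama.cpp performs is assumed. Only the 512 default, the 777-token request, and the error text come from the message above.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultCtxSize mirrors the 512-token default described in the commit message.
const defaultCtxSize = 512

// ctxSizeFromEnv is a hypothetical helper: it parses a raw
// BACKEND_TEST_CTX_SIZE value and falls back to def when the
// value is empty, non-numeric, or not positive.
func ctxSizeFromEnv(raw string, def int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return def
	}
	return n
}

// checkFits is an illustrative stand-in for the server-side check
// (assumed behaviour): a request larger than the context is rejected.
func checkFits(requestTokens, ctxSize int) error {
	if requestTokens > ctxSize {
		return fmt.Errorf("request (%d tokens) exceeds the available context size (%d tokens)",
			requestTokens, ctxSize)
	}
	return nil
}

func main() {
	const audioTokens = 777 // jfk.wav as encoded by Qwen3-ASR-0.6B, per the commit message

	ctx := ctxSizeFromEnv(os.Getenv("BACKEND_TEST_CTX_SIZE"), defaultCtxSize)
	if err := checkFits(audioTokens, ctx); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Printf("ok: %d tokens fit in a context of %d\n", audioTokens, ctx)
}
```

With the variable unset, the sketch takes the 512 fallback and reports the rejection. With BACKEND_TEST_CTX_SIZE=2048 the 777-token request fits.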

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude Code]

Ettore Di Giacinto committed 1mo ago

3bc5ae8da694bcfb3b374d8abb87915b8b8905de

Parent: 3234e6d