# Adding a New Backend

When adding a new backend to LocalAI, you need to update several files to ensure the backend is properly built, tested, and registered. Here's a step-by-step guide based on the pattern used for adding backends like `moonshine`:

## 1. Create Backend Directory Structure

Create the backend directory under the appropriate location:

- **Python backends**: `backend/python//`
- **Go backends**: `backend/go//`
- **C++ backends**: `backend/cpp//`
- **Rust backends**: `backend/rust//`

For Python backends, you'll typically need:

- `backend.py` - Main gRPC server implementation
- `Makefile` - Build configuration
- `install.sh` - Installation script for dependencies
- `protogen.sh` - Protocol buffer generation script
- `requirements.txt` - Python dependencies
- `run.sh` - Runtime script
- `test.py` / `test.sh` - Test files

For Rust backends, you'll typically need (see `backend/rust/kokoros/` as a reference):

- `Cargo.toml` - Crate manifest; depend on the upstream project as a submodule under `sources/`
- `build.rs` - Invokes `tonic_build` to generate gRPC stubs from `backend/backend.proto` (use the `BACKEND_PROTO_PATH` env var so the Makefile can inject the canonical copy)
- `src/` - The gRPC server implementation (implement `Backend` via `tonic`)
- `Makefile` - Copies `backend.proto` into the crate, runs `cargo build --release`, then `package.sh`
- `package.sh` - Uses `ldd` to bundle the binary's dynamic deps and `ld.so` into `package/lib/`
- `run.sh` - Sets `LD_LIBRARY_PATH`/`SSL_CERT_DIR` and execs the binary via the bundled `lib/ld.so`
- `sources//` - Git submodule with the upstream Rust crate

## 2. Add Build Configurations to `.github/backend-matrix.yml`

The build matrix is data-only YAML at `.github/backend-matrix.yml` (not inside `backend.yml` itself).
`backend.yml` (master push) and `backend_pr.yml` (PR) load it via `scripts/changed-backends.js`, which also handles per-file path filtering so only touched backends rebuild on PRs and master pushes alike.

Add build matrix entries to `.github/backend-matrix.yml` for each platform/GPU type you want to support. Look at similar backends for reference: `chatterbox`/`faster-whisper` for Python, `piper`/`silero-vad` for Go, `kokoros` for Rust.

**Without an entry here no image is ever built or pushed, and the gallery entry in `backend/index.yaml` will point at a tag that does not exist.**

The `dockerfile:` field must point at `./backend/Dockerfile.` matching the language bucket from step 1 (e.g. `Dockerfile.python`, `Dockerfile.golang`, `Dockerfile.rust`). The `tag-suffix` must match the `uri:` in the corresponding `backend/index.yaml` image entry exactly.

**`scripts/changed-backends.js` registration: REQUIRED for any new dockerfile suffix.** This is the single most common omission, because it has no effect on the PR that adds the backend (when no prior path filter could catch it anyway); it only breaks the *next* PR that touches your backend's directory, which then gets zero CI jobs and looks broken for unrelated reasons.

Edit `scripts/changed-backends.js:inferBackendPath` and add a branch BEFORE the more-generic suffixes:

```js
if (item.dockerfile.endsWith("")) {
  return `backend/cpp//`; // or backend/python|go|rust/...
}
```

The `endsWith()` test is against the matrix entry's `dockerfile:` value (e.g. `./backend/Dockerfile.ds4` → `endsWith("ds4")`). Specificity order matters here just like it does for importers: more-specific suffixes go BEFORE more-generic ones (e.g. `ds4` before `llama-cpp` even though both end with letters, because some upstream might one day call itself `super-ds4-llama-cpp`).
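
To see why the order matters, here is a minimal sketch of first-match suffix routing. It is not the real `inferBackendPath`: the rule tables, the `inferPath` helper, and the `backend/cpp/ik-llama-cpp/` directory are assumptions made for illustration. It shows how a generic suffix placed first swallows a more specific dockerfile name.

```javascript
// Hypothetical rule tables: [dockerfile suffix, backend directory].
// The ik-llama-cpp directory is an assumption made for this sketch.
const specificFirst = [
  ["ik-llama-cpp", "backend/cpp/ik-llama-cpp/"],
  ["llama-cpp", "backend/cpp/llama-cpp/"],
];
const genericFirst = [...specificFirst].reverse();

// Return the directory of the first rule whose suffix matches, or null.
function inferPath(rules, dockerfile) {
  for (const [suffix, dir] of rules) {
    if (dockerfile.endsWith(suffix)) return dir;
  }
  return null;
}

const dockerfile = "./backend/Dockerfile.ik-llama-cpp";
console.log(inferPath(specificFirst, dockerfile)); // backend/cpp/ik-llama-cpp/
console.log(inferPath(genericFirst, dockerfile));  // backend/cpp/llama-cpp/ (shadowed)
```

With the generic rule first, changes under the more specific backend's directory would be attributed to the wrong backend, which is the failure mode the ordering rule guards against.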

Verify locally before pushing:

```bash
# Confirm your dockerfile suffix is unique enough
node -e "
const yaml = require('js-yaml');
const fs = require('fs');
const m = yaml.load(fs.readFileSync('.github/backend-matrix.yml','utf8'));
for (const e of m.include.filter(e => e.backend === '')) {
  console.log(e.dockerfile, '->', e.dockerfile.endsWith(''));
}"
```

A quick way to find the right insertion point: `grep -n 'item.dockerfile.endsWith' scripts/changed-backends.js`.

**`bump_deps.yaml` registration: REQUIRED for any backend pinning an upstream commit.** If your backend's Makefile has a `*_VERSION?=` pin to a third-party repo, the daily auto-bump bot at `.github/workflows/bump_deps.yaml` won't notice it unless you register the backend in its matrix. The bot runs `.github/bump_deps.sh` which `grep`s for `^$VAR?=` in the Makefile you list, so the pin MUST live in the Makefile (not in a separate shell script). The bump for ds4 (#9761) had to walk this back because the original landed the pin in `prepare.sh`, which the bot can't see.

Pattern (for `antirez/ds4`):

```yaml
# .github/workflows/bump_deps.yaml
matrix:
  include:
    - repository: "antirez/ds4"
      variable: "DS4_VERSION"
      branch: "main"
      file: "backend/cpp/ds4/Makefile"
```

And the corresponding Makefile shape (mirror `backend/cpp/llama-cpp/Makefile`):

```makefile
DS4_VERSION?=ae302c2fa18cc6d9aefc021d0f27ae03c9ad2fc0
DS4_REPO?=https://github.com/antirez/ds4
...
ds4:
	mkdir -p ds4
	cd ds4 && git init -q && \
	git remote add origin $(DS4_REPO) && \
	git fetch --depth 1 origin $(DS4_VERSION) && \
	git checkout FETCH_HEAD
```

If you have a `prepare.sh` doing the clone, delete it: the recipe belongs in the Makefile target so `make purge && make` works as a clean-and-rebuild and so the bump bot finds the pin.
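
The reason the pin has to sit in the Makefile can be sketched as follows. This is not the code in `.github/bump_deps.sh`: `findPin` is a hypothetical helper, the `prepare.sh` content is invented for illustration, and the sketch assumes the `?` in `^$VAR?=` is matched literally, so only lines that start with `VAR?=` count.

```javascript
// Hypothetical helper: return the value of a line that starts with "VAR?=", or null.
// Assumes the "?" in the bot's `^$VAR?=` pattern is matched literally.
function findPin(fileText, variable) {
  const prefix = `${variable}?=`;
  for (const line of fileText.split("\n")) {
    if (line.startsWith(prefix)) return line.slice(prefix.length).trim();
  }
  return null;
}

const makefile = [
  "DS4_VERSION?=ae302c2fa18cc6d9aefc021d0f27ae03c9ad2fc0",
  "DS4_REPO?=https://github.com/antirez/ds4",
].join("\n");

// Invented prepare.sh content: a plain shell assignment has no "?=".
const prepareSh = "DS4_VERSION=ae302c2fa18cc6d9aefc021d0f27ae03c9ad2fc0";

console.log(findPin(makefile, "DS4_VERSION"));  // the pinned commit hash
console.log(findPin(prepareSh, "DS4_VERSION")); // null
```

The same lookup that succeeds on the Makefile finds nothing in a shell-style assignment, which matches the situation described for the original ds4 pin.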

**Placement in file:**

- CPU builds: Add after other CPU builds (e.g., after `cpu-chatterbox`)
- CUDA 12 builds: Add after other CUDA 12 builds (e.g., after `gpu-nvidia-cuda-12-chatterbox`)
- CUDA 13 builds: Add after other CUDA 13 builds (e.g., after `gpu-nvidia-cuda-13-chatterbox`)

**Additional build types you may need:**

- ROCm/HIP: Use `build-type: 'hipblas'` with `base-image: "rocm/dev-ubuntu-24.04:7.2.1"`
- Intel/SYCL: Use `build-type: 'intel'` or `build-type: 'sycl_f16'`/`sycl_f32` with `base-image: "intel/oneapi-basekit:2025.3.2-0-devel-ubuntu24.04"`
- L4T (ARM): Use `build-type: 'l4t'` with `platforms: 'linux/arm64'` and `runs-on: 'ubuntu-24.04-arm'`

**Per-arch native builds (`linux/amd64` + `linux/arm64`):** Multi-arch backends are NOT a single matrix entry with `platforms: 'linux/amd64,linux/arm64'`. Instead, add **two** entries (one with `platforms: 'linux/amd64'` + `platform-tag: 'amd64'` + `runs-on: 'ubuntu-latest'`, one with `platforms: 'linux/arm64'` + `platform-tag: 'arm64'` + `runs-on: 'ubuntu-24.04-arm'`), both sharing the same `tag-suffix`. The script detects the shared `tag-suffix` and emits a `merge-matrix` entry, so `backend-merge-jobs` (in `backend.yml`/`backend_pr.yml`) automatically assembles the manifest list from per-arch digest artifacts. See `-cpu-faster-whisper` in `.github/backend-matrix.yml` for a reference shape.

**llama-cpp / ik-llama-cpp / turboquant variants only (`builder-base-image`):** Entries whose `dockerfile` is `./backend/Dockerfile.{llama-cpp,ik-llama-cpp,turboquant}` must also set a `builder-base-image` field pointing at a prebuilt base from `quay.io/go-skynet/ci-cache:base-grpc-*` (CI builds these via `.github/workflows/base-images.yml`). The mapping is by `(build-type, platforms)`; see existing entries for the pattern. CI uses these prebuilt bases to skip the gRPC compile (~25–35 min cold).
Local `make backends/` ignores `builder-base-image` and uses the from-source path inside the Dockerfile, so you don't need quay access for local builds.

## 3. Add Backend Metadata to `backend/index.yaml`

**Step 3a: Add Meta Definition**

Add a YAML anchor definition in the `## metas` section (around line 2-300). Look for similar backends to use as a template, such as `diffusers` or `chatterbox`.

**Step 3b: Add Image Entries**

Add image entries at the end of the file, following the pattern of similar backends such as `diffusers` or `chatterbox`. Include both `latest` (production) and `master` (development) tags.

**Note on integrity:** OCI backends installed from a gallery whose `verification:` block is set are verified against a keyless-cosign policy before extraction; tarball/HTTP backends use the optional `sha256:` field. New backends do not need any extra YAML; the gallery-level `verification:` block covers every entry. See [.agents/backend-signing.md](backend-signing.md) for the producer-side CI step.

## 4. Update the Makefile

The Makefile needs to be updated in several places to support building and testing the new backend:

**Step 4a: Add to `.NOTPARALLEL`**

Add `backends/` to the `.NOTPARALLEL` line (around line 2) to prevent parallel execution conflicts:

```makefile
.NOTPARALLEL: ... backends/
```

**Step 4b: Add to `prepare-test-extra`**

Add the backend to the `prepare-test-extra` target to prepare it for testing. Use the path matching your language bucket (`backend/python/`, `backend/go/`, `backend/rust/`, …):

```makefile
prepare-test-extra: protogen-python
	...
	$(MAKE) -C backend//
```

For Rust backends the target is usually the crate build target itself (e.g. `$(MAKE) -C backend/rust/ -grpc`) so the binary is in place before `test` runs.

**Step 4c: Add to `test-extra`**

Add the backend to the `test-extra` target to run its tests (this applies to Go and Rust backends too, not only Python):

```makefile
test-extra: prepare-test-extra
	...
	$(MAKE) -C backend// test
```

Each backend's own `Makefile` should define a `test` target so this line works regardless of language. Integration tests that need large model downloads should be gated behind an env var (see `backend/rust/kokoros/`'s `KOKOROS_MODEL_PATH` pattern) so CI only runs unit tests.

**Step 4d: Add Backend Definition**

Add a backend definition variable in the backend definitions section (around line 428-457). The format depends on the backend type:

**For Python backends with root context** (like `faster-whisper`, `coqui`):

```makefile
BACKEND_ = |python|.|false|true
```

**For Python backends with `./backend` context** (like `chatterbox`, `moonshine`):

```makefile
BACKEND_ = |python|./backend|false|true
```

**For Go backends**:

```makefile
BACKEND_ = |golang|.|false|true
```

**For Rust backends**:

```makefile
BACKEND_ = |rust|.|false|true
```

The language field (`python`/`golang`/`rust`/…) must match a `backend/Dockerfile.` file.

**Step 4e: Generate Docker Build Target**

Add an eval call to generate the docker-build target (around line 480-501):

```makefile
$(eval $(call generate-docker-build-target,$(BACKEND_)))
```

**Step 4f: Add to `docker-build-backends`**

Add `docker-build-` to the `docker-build-backends` target (around line 507):

```makefile
docker-build-backends: ... docker-build-
```

**Determining the Context:**

- If the backend is in `backend/python//` and uses `./backend` as context in the workflow file, use `./backend` context
- If the backend is in `backend/python//` but uses `.` as context in the workflow file, use `.` context
- Check similar backends to determine the correct context

## 5. Verification Checklist

After adding a new backend, verify:

- [ ] Backend directory structure is complete with all necessary files
- [ ] Build configurations added to `.github/backend-matrix.yml` for all desired platforms (per-arch entries with `platform-tag` for multi-arch; `builder-base-image` for llama-cpp / ik-llama-cpp / turboquant)
- [ ] Meta definition added to `backend/index.yaml` in the `## metas` section
- [ ] Image entries added to `backend/index.yaml` for all build variants (latest + development)
- [ ] Tag suffixes match between workflow file and index.yaml
- [ ] Makefile updated with all 6 required changes (`.NOTPARALLEL`, `prepare-test-extra`, `test-extra`, backend definition, docker-build target eval, `docker-build-backends`)
- [ ] No YAML syntax errors (check with linter)
- [ ] No Makefile syntax errors (check with linter)
- [ ] Follows the same pattern as similar backends (e.g., if it's a transcription backend, follow `faster-whisper` pattern)

## Bundling runtime shared libraries (`package.sh`)

The final `Dockerfile.python` stage is `FROM scratch`: there is no system `libc`, no `apt`, no fallback library path. Only files explicitly copied from the builder stage end up in the backend image. That means any runtime `dlopen` your backend (or its Python deps) needs **must** be packaged into `${BACKEND}/lib/`.

Pattern:

1. Make sure the library is installed in the builder stage of `backend/Dockerfile.python` (add it to the top-level `apt-get install`).
2. Drop a `package.sh` in your backend directory that copies the library (and its soname symlinks) into `$(dirname $0)/lib`. See `backend/python/vllm/package.sh` for a reference implementation that walks `/usr/lib/x86_64-linux-gnu`, `/usr/lib/aarch64-linux-gnu`, etc.
3. `Dockerfile.python` already runs `package.sh` automatically if it exists, after `package-gpu-libs.sh`.
4. `libbackend.sh` automatically prepends `${EDIR}/lib` to `LD_LIBRARY_PATH` at run time, so anything packaged this way is found by `dlopen`.

How to find missing libs: when a Python module silently fails to register torch ops or you see `AttributeError: '_OpNamespace' '...' object has no attribute '...'`, run the backend image's Python with `LD_DEBUG=libs` to see which `dlopen` failed. The filename in the error message (e.g. `libnuma.so.1`) is what you need to package.

To verify packaging works without trusting the host:

```bash
make docker-build-
CID=$(docker create --entrypoint=/run.sh local-ai-backend:)
docker cp $CID:/lib /tmp/check && docker rm $CID
ls /tmp/check   # expect the bundled .so files + symlinks
```

Then boot it inside a fresh `ubuntu:24.04` (which intentionally does *not* have the lib installed) to confirm it actually loads from the backend dir.

## Importer integration

When you add a new backend, you MUST also make it importable via the model import form (`/import-model`). The import form dropdown is sourced dynamically from `GET /backends/known`, which reads the importer registry at `core/gallery/importers/importers.go`, so the steps below are the ONLY way to make your backend show up.

Required steps:

1. **If your backend has unambiguous detection signals** (unique file extension, HF `pipeline_tag`, unique repo name pattern, unique artefact like `modules.json`):
   - Create an importer file at `core/gallery/importers/.go` following the Match/Import pattern in `llama-cpp.go`.
   - Register it in `importers.go:defaultImporters` in **specificity order**: more specific detectors must appear BEFORE more generic ones (e.g. `sentencetransformers` before `transformers`, `stablediffusion-ggml` before `llama-cpp`, `vllm-omni` before `vllm`). First match wins.
2. **If your backend is a drop-in replacement** (same artefacts as another backend, e.g. `ik-llama-cpp` and `turboquant` both consume GGUF the same way `llama-cpp` does):
   - Do NOT create a new importer.
     Extend the existing importer's `Import()` to swap the emitted `backend:` field when `preferences.backend` matches. See `llama-cpp.go` for the pattern.
3. **If your backend has no reliable auto-detect signal** (preference-only, e.g. `sglang`, `tinygrad`, `whisperx`):
   - Do NOT create an importer. Instead add the backend name to the curated pref-only slice in `core/http/endpoints/localai/backend.go` that feeds `/backends/known`. A single line addition.
4. **Always** add a table-driven test in `core/gallery/importers/importers_test.go` (Ginkgo/Gomega):
   - Use a real public HuggingFace repo URI as the test fixture (existing tests already hit the live HF API; follow that pattern).
   - Cover detection (auto-match without preferences), preference-override (explicit `backend:` in preferences wins), and, if the backend's modality has a common `pipeline_tag` but ambiguous artefacts, an ambiguity test asserting `errors.Is(err, importers.ErrAmbiguousImport)`.

Rules of thumb:

- When in doubt, lean pref-only. A wrong auto-detect is worse than a forced preference.
- Never silently emit a modality mismatch (e.g. emit `llama-cpp` for a TTS repo because `.gguf` is present). Return `ErrAmbiguousImport` instead.
- Registration order is the single most common source of bugs. Check by running `go test ./core/gallery/importers/...`; the existing suite will fail if you've shadowed a pre-existing detector.

## 6. Example: Adding a Python Backend

For reference, when `moonshine` was added:

- **Files created**: `backend/python/moonshine/{backend.py, Makefile, install.sh, protogen.sh, requirements.txt, run.sh, test.py, test.sh}`
- **Workflow entries**: 3 build configurations (CPU, CUDA 12, CUDA 13)
- **Index entries**: 1 meta definition + 6 image entries (cpu, cuda12, cuda13 x latest/development)
- **Makefile updates**:
  - Added to `.NOTPARALLEL` line
  - Added to `prepare-test-extra` and `test-extra` targets
  - Added `BACKEND_MOONSHINE = moonshine|python|./backend|false|true`
  - Added eval for docker-build target generation
  - Added `docker-build-moonshine` to `docker-build-backends`
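
The "6 image entries" figure above is the cross product of build variants and release channels. A small sketch like the following can serve as a checklist when counting entries for your own backend. The `expectedImageEntries` helper and the `variant/channel` labels are illustrative only and are not real `backend/index.yaml` names.

```javascript
// Build variants and release channels taken from the moonshine example.
// The "variant/channel" labels are illustrative, not real index.yaml names.
const variants = ["cpu", "cuda12", "cuda13"];
const channels = ["latest", "development"];

// Hypothetical helper: one label per (variant, channel) pair.
function expectedImageEntries(variants, channels) {
  return variants.flatMap((v) => channels.map((c) => `${v}/${c}`));
}

const entries = expectedImageEntries(variants, channels);
console.log(entries.length); // 6
console.log(entries.join("\n"));
```

Adding another build variant (for example a ROCm build) to `variants` shows how many additional image entries the index would need.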