COMMITS
June 1, 2026
release : v1.8.6
Georgi Gerganov committed
ci : fix path to whisper.h in examples.yml [no ci] (#3842)
Daniel Bevenius committed
May 31, 2026
ci : fix self-hosted paths to mnt
Georgi Gerganov committed
pi : add config
Georgi Gerganov committed
ci : remove obsolete self-hosted label
Georgi Gerganov committed
common : pass sample rate to `ffmpeg_decode_audio()`
Georgi Gerganov committed
common : re-implement `ffmpeg-transcode.cpp` + clarify ffmpeg usage (#3846)
Georgi Gerganov committed
May 29, 2026
sync : ggml
Georgi Gerganov committed
ggml : bump version to 0.13.1 (ggml/1523)
Georgi Gerganov committed
talk-llama : sync llama.cpp
Georgi Gerganov committed
sync : ggml
Georgi Gerganov committed
cuda : disables launch_fattn PDL enrollment due to compiler bug (llama/23825)
Andreas Kieslinger committed
meta : Add missing `buffer` set in allreduce fallback !COMPUTE clear (llama/23480)
Matt Corallo committed
May 28, 2026
hexagon: basic/generic op fusion support and RMS_NORM+MUL fusion (llama/23835)
Max Krasnyansky committed
ggml: auto apply iGPU flag CUDA/HIP if integrated device (llama/23007)
fl0rianr committed
CUDA: route batch>=4 quantized matmul to MMQ on AMD MFMA hardware (llama/23227)
Jaden_Mach committed
hexagon: minor refresh for HMX FA and MM (llama/23796)
Max Krasnyansky committed
vulkan: fast path for walsh-hadamard transform (llama/23687)
Jeff Bolz committed
vulkan: fix wrong index variable in inner loop (llama/23665)
Winston Ma committed
vulkan: Fix memory logger unsafe iterator access (llama/23667)
Winston Ma committed
cuda : fix KQ mask offset integer overflow in fattn MMA kernel (llama/23610)
fairydreaming committed
ggml: fixed Arm SVE usage bug in vec.h, vec.cpp (llama/22841)
Martin Klacer committed
Hexagon: OP_GATED_DELTA_NET K>1 support (llama/23531)
ymcki committed
opencl: OP_GATED_DELTA_NET (llama/23312)
ymcki committed
May 27, 2026
ggml-webgpu: remove legacy constants (llama/23672)
Reese Levine committed
hexagon: add support for Q4_1 in MUL_MAT and MUL_MAT_ID (llama/23647)
Max Krasnyansky committed
ggml-webgpu: Fix how to dispatch WG to some ops (llama/23750)
Masashi Yoshimura committed
vulkan: Switch MUL_MAT_VEC to 4 K per iteration for F16/32 (llama/22887)
Matt Corallo committed