Commits: tests/test-backend-ops.cpp - ggml-org/llama.cpp

ggml-org / llama.cpp

LLM inference in C/C++

master

March 30, 2026

CUDA : Fix CUB's argsort when nrows % block_size == 0 CCCL < 3.1 (#21181)

Oliver Simons committed 4d ago

64ac9ab

March 26, 2026

ggml-cuda: Add NVFP4 dp4a kernel (#20644)

Michael Wand committed 8d ago

112c781

CUDA & CPU: support F32 kernel type for `CONV_TRANSPOSE_2D` (#17094)

Yihao Wang committed 8d ago

0a524f2

March 24, 2026

metal : add FA instantiations for HSK=512, HSV=512 (#20902)

Georgi Gerganov committed 10d ago

342d612

March 14, 2026

metal : add FA specialization for HSK = 320, HSV = 256 (#20549)

Georgi Gerganov committed 20d ago

b30a5fd

March 12, 2026

test-backend-ops: allow loading tests from file and parsing model operators into file (#19896)

Ruben Ortlam committed 22d ago

128142f

vulkan: add GATED_DELTA_NET op support (#20334)

ProgenyAlpha committed 22d ago

deee238

vulkan: fix l2_norm epsilon handling (#20350)

Jeff Bolz committed 22d ago

246ffc4

March 11, 2026

llama : enable chunked fused GDN path (#20340)

Georgi Gerganov committed 23d ago

d28961d

ggml : add NVFP4 quantization type support (#19769)

Richard Davison committed 23d ago

5eae9cb

March 7, 2026

ggml: add GATED_DELTA_NET op (#19504)

Aman Gupta committed 27d ago

c5a7788

March 6, 2026

Autoparser - complete refactoring of parser architecture (#18675)

Piotr Wilkin (ilintar) committed 28d ago

566059a

CUDA: use shared mem for ssm_conv (#20128)

Aman Gupta committed 28d ago

1e38a7a

March 5, 2026

chore : correct typos [no ci] (#20041)

Marcel Petrick committed 29d ago

92f7da0

March 2, 2026

ggml-webgpu: Support non-contiguous `src0` and overlapping `src0/src1` in binary ops (#19850)

Masashi Yoshimura committed 1mo ago

36a7a65

February 20, 2026

test: mul_mat tests with huge batch size (#19519)

Jeff Bolz committed 1mo ago

77d6ae4

February 15, 2026

ggml : avoid UB in gemm ukernel (#19642)

Georgi Gerganov committed 1mo ago

08e6d91

February 14, 2026

vulkan: support L2_NORM with contiguous rows (#19604)

Jeff Bolz committed 1mo ago

dbb0233

February 13, 2026

fix vulkan ggml_acc only works in 3d but not 4d (#19426)

ymcki committed 1mo ago

0e21991

metal : support GGML_OP_SET (#19548)

Georgi Gerganov committed 1mo ago

490eb96

February 12, 2026

metal : update sum_rows kernel to support float4 (#19524)

Georgi Gerganov committed 1mo ago

3b3a948

February 11, 2026

ggml : unary ops support non-cont src0 + metal F16 unary ops (#19511)

Georgi Gerganov committed 1mo ago

914dde7

ggml : extend bin bcast for permuted src1 (#19484)

Georgi Gerganov committed 1mo ago

89181c0

metal : consolidate unary ops (#19490)

Georgi Gerganov committed 1mo ago

ceaa89b

February 10, 2026

test: fix IMROPE perf test case (#19465)

Xuan-Son Nguyen committed 1mo ago

9a96352

cuda : extend GGML_OP_PAD to work with non-cont src0 (#19429)

Georgi Gerganov committed 1mo ago

a0d5855

February 6, 2026

tests: reduce number of FA test permutations (#19381)

Jeff Bolz committed 1mo ago

db6adb3

February 5, 2026

vulkan: Preprocess FA mask to detect all-neg-inf and all-zero. (#19281)

Jeff Bolz committed 1mo ago

449ec2a

February 4, 2026

tests : add non-cont, inplace rope tests (#19296)

Georgi Gerganov committed 1mo ago

eaba92c

February 2, 2026

ggml-cpu: FA split across kv for faster TG (#19209)

Aman Gupta committed 2mo ago

9f682fb

January 30, 2026

tests : add GQA=20 FA test (#19095)

Georgi Gerganov committed 2mo ago

c3b87ce

January 26, 2026

CUDA: fix padding of GQA to power of 2 in FA (#19115)

Johannes Gäßler committed 2mo ago

b0311c1

January 22, 2026

mla : make the V tensor a view of K (#18986)

Georgi Gerganov committed 2mo ago

a5eaa1d

January 21, 2026

vulkan: support flash attention GQA/split_k with small batches (#18938)

Jeff Bolz committed 2mo ago

33f890e

January 16, 2026

ggml : extend ggml_pool_1d + metal (#16429)

Thore Koritzius committed 2mo ago

388ce82

January 15, 2026

CUDA: Factor out and re-use `block_reduce` function (#18785)

Oliver Simons committed 2mo ago

36f0132

January 12, 2026

vulkan: Use VK_EXT_shader_64bit_indexing to handle large mat_mul(_id) (#18678)

Jeff Bolz committed 2mo ago

2bbe4c2

January 10, 2026

test-backend-ops: fix mxfp4 tests on blackwell (#18736)

Aman Gupta committed 2mo ago

b137718

January 5, 2026

vulkan: fix topk_moe_sigmoid_norm_bias failures in GLM-4.6 (#18582)

Jeff Bolz committed 2mo ago

f1768d8

vulkan: handle quantize_q8_1 overflowing the max workgroup count (#18515)

Jeff Bolz committed 2mo ago

b37124d

CANN: add operator fusion support for ADD + RMS_NORM (#17512)

Chenguang Li committed 2mo ago

67e3f6f

January 4, 2026

sampling : add support for backend sampling (#17004)

Daniel Bevenius committed 2mo ago

d3dce4e

January 1, 2026

vulkan: extend topk_moe to handle sigmoid w/exp_probs_b for nemotron (#18295)

Jeff Bolz committed 3mo ago

be47fb9

December 26, 2025

vulkan: Support UPSCALE w/antialias (#18327)

Jeff Bolz committed 3mo ago

b96b82f

vulkan: handle rope with large number of rows (#18306)

Jeff Bolz committed 3mo ago

10dc500

December 22, 2025

vulkan: Extend rope fusions to allow mrope (#18264)

Jeff Bolz committed 3mo ago

e3b35dd

December 21, 2025

vulkan: fix im2col overflowing maxworkgroupcount (#18180)

Jeff Bolz committed 3mo ago

fd05c51

vulkan/cuda: fix topk_moe with exp_probs_b (#18071)

Jeff Bolz committed 3mo ago

b365c3f

December 20, 2025

tests: Avoid floating point precision false positives in SUM (#17471)

Jeff Bolz committed 3mo ago

52ab19d

test-backend-ops: improve msvc build time (#18209)

Jeff Bolz committed 3mo ago

5182dd6