COMMITS
/ Makefile
August 20, 2025
make : remove make in favor of CMake (#15449)
Daniel Bevenius committed
June 9, 2025
ggml-cpu : split arch-specific implementations (#13892)
xctan committed
May 7, 2025
examples : remove infill (#13283)
Georgi Gerganov committed
May 5, 2025
mtmd : rename llava directory to mtmd (#13311)
Xuan-Son Nguyen committed
May 2, 2025
llama : move end-user examples to tools directory (#13249)
Diego Devesa committed
April 15, 2025
CUDA/HIP: Share the same unified memory allocation logic. (#12934)
David Huang committed
March 10, 2025
musa: support new arch mp_31 and update doc (#12296)
R0CKSTAR committed
February 22, 2025
CUDA: app option to compile without FlashAttention (#12025)
Johannes Gäßler committed
February 21, 2025
MUSA: support ARM64 and enable dp4a .etc (#11843)
Bodhi committed
February 18, 2025
tool-call: refactor common chat / tool-call api (+ tests / fixes) (#11900)
Olivier Chafik committed
February 15, 2025
repo : update links to new url (#11886)
Georgi Gerganov committed
February 2, 2025
CUDA: use mma PTX instructions for FlashAttention (#11583)
Johannes Gäßler committed
January 30, 2025
January 21, 2025
Add Jinja template support (#11016)
Olivier Chafik committed
December 14, 2024
llama : add Qwen2VL support + multimodal RoPE (#10361)
HimariO committed
December 7, 2024
ggml : refactor online repacking (#10446)
Djip007 committed
December 3, 2024
server : (web ui) Various improvements, now use vite as bundler (#10599)
Xuan Son Nguyen committed
December 2, 2024
make : deprecate (#10514)
Georgi Gerganov committed
December 1, 2024
build: update Makefile comments for C++ version change (#10598)
Wang Qin committed
November 29, 2024
ggml : move AMX to the CPU backend (#10570)
Diego Devesa committed
November 26, 2024
Fix HIP flag inconsistency & build docs (#10524)
Tristan Druyen committed
mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516)
R0CKSTAR committed
November 25, 2024
Introduce llama-run (#10291)
Eric Curtin committed
ggml : add support for dynamic loading of backends (#10469)
Diego Devesa committed
speculative : refactor and add a simpler example (#10362)
Georgi Gerganov committed
November 19, 2024
Fix missing file renames in Makefile due to changes in commit ae8de6d50a (#10413)
Anthony Van de Gejuchte committed
November 17, 2024
metal : refactor kernel args into structs (#10238)
Georgi Gerganov committed
CUDA: remove DMMV, consolidate F16 mult mat vec (#10318)
Johannes Gäßler committed
November 16, 2024
make : add ggml-opt (#0)
Georgi Gerganov committed
tests : remove test-grad0
Georgi Gerganov committed
make : auto-determine dependencies (#0)
Georgi Gerganov committed
November 15, 2024
ggml : fix some build issues
slaren committed
backend cpu: add online flow for aarch64 Q4_0 GEMV/GEMM kernels (#9921)
Charles Xu committed
November 14, 2024
ggml : build backends as libraries (#10256)
Diego Devesa committed
November 8, 2024
metal : opt-in compile flag for BF16 (#10218)
Georgi Gerganov committed
November 7, 2024
server : revamp chat UI with vuejs and daisyui (#10175)
Xuan Son Nguyen committed
November 3, 2024
ggml : move CPU backend to a separate file (#10144)
Diego Devesa committed
November 1, 2024
llama : add simple-chat example (#10124)
Diego Devesa committed
October 18, 2024
add amx kernel for gemm (#8998)
Ma Mingfei committed
October 2, 2024
ggml-backend : add device and backend reg interfaces (#9707)
Diego Devesa committed
examples : remove benchmark (#9704)
Georgi Gerganov committed
September 22, 2024
September 16, 2024
cmake : do not hide GGML options + rename option (#9465)
Georgi Gerganov committed
September 15, 2024
common : reimplement logging (#9418)
Georgi Gerganov committed
September 13, 2024
server : add loading html page while model is loading (#9468)
Xuan Son Nguyen committed
September 12, 2024
riscv : modify Makefile and add a RISCV_VECT to print log info (#9442)
Ahmad Tameem committed
September 10, 2024
make : do not run llama-gen-docs when building (#9399)
slaren committed
September 9, 2024
common : move arg parser code to `arg.cpp` (#9388)
Xuan Son Nguyen committed
September 7, 2024
common : refactor arg parser (#9308)
Xuan Son Nguyen committed
llama : refactor sampling v2 (#9294)
Georgi Gerganov committed