Commits: examples/speculative-simple/speculative-simple.cpp - ggml-org/llama.cpp - Morph

ggml-org / llama.cpp

LLM inference in C/C++

C++

COMMITS

/ examples/speculative-simple/speculative-simple.cpp

master

May 11, 2026

spec : parallel drafting support (#22838)

Georgi Gerganov committed 1mo ago

April 28, 2026

spec : refactor params (#22397)

Georgi Gerganov committed 1mo ago

April 22, 2026

speculative-simple : add checkpoint support (#22227)

Georgi Gerganov committed 2mo ago

March 31, 2026

common : move up common_init() and fix Windows UTF-8 logs (#21176)

Adrien Gallouët committed 2mo ago

March 4, 2026

Fix locale-dependent float printing in GGUF metadata (#17331)

SamareshSingh committed 3mo ago

January 28, 2026

spec : add self‑speculative decoding (no draft model required) + refactor (#18471)

Sascha Rogmann committed 4mo ago

December 14, 2025

common : refactor common_sampler + grammar logic changes (#17937)

Georgi Gerganov committed 6mo ago

December 13, 2025

speculative-simple : free batch on exit (#17985)

Georgi Gerganov committed 6mo ago

August 13, 2025

common : add --override-tensor-draft, --cpu-moe-draft and --n-cpu-moe-draft parameters (#15191)

Copilot committed 10mo ago

July 31, 2025

server : implement universal assisted decoding (#12635)

g2mt committed 10mo ago

June 6, 2025

llama : deprecate llama_kv_self_ API (#14030)

Georgi Gerganov committed 1y ago

April 1, 2025

common : refactor downloading system, handle mmproj with -hf option (#12694)

Xuan-Son Nguyen committed 1y ago

March 13, 2025

llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)

Georgi Gerganov committed 1y ago

January 12, 2025

llama : add `llama_vocab`, functions -> methods, naming (#11110)

Georgi Gerganov committed 1y ago

January 3, 2025

llama : refactor `src/llama.cpp` (#10902)

Georgi Gerganov committed 1y ago

November 26, 2024

cmake : enable warnings in llama (#10474)

Georgi Gerganov committed 1y ago

speculative : simplify the implementation (#10504)

Georgi Gerganov committed 1y ago

November 25, 2024

llama : accept a list of devices to use to offload a model (#10497)

Diego Devesa committed 1y ago

speculative : refactor and add a simpler example (#10362)

Georgi Gerganov committed 1y ago