Commits: examples/batched/batched.cpp - ggml-org/llama.cpp

ggml-org / llama.cpp

LLM inference in C/C++

COMMITS

/ examples/batched/batched.cpp

master

March 31, 2026

common : move up common_init() and fix Windows UTF-8 logs (#21176)

Adrien Gallouët committed 19d ago

41361c8

March 4, 2026

Fix locale-dependent float printing in GGUF metadata (#17331)

SamareshSingh committed 1mo ago

cb8f4fa

January 15, 2026

context : reserve new scheduler when graph topology changes (#18547)

Georgi Gerganov committed 3mo ago

39173bc

January 12, 2026

examples : add --kv-unified to batched example (#18774)

Daniel Bevenius committed 3mo ago

4150da9

January 4, 2026

sampling : add support for backend sampling (#17004)

Daniel Bevenius committed 3mo ago

d3dce4e

December 14, 2025

common : refactor common_sampler + grammar logic changes (#17937)

Georgi Gerganov committed 4mo ago

254098a

April 1, 2025

common : refactor downloading system, handle mmproj with -hf option (#12694)

Xuan-Son Nguyen committed 1y ago

267c139

January 12, 2025

llama : add `llama_vocab`, functions -> methods, naming (#11110)

Georgi Gerganov committed 1y ago

afa8a9e

January 6, 2025

llama : update llama_model API names (#11063)

Georgi Gerganov committed 1y ago

47182dd

llama : use LLAMA_TOKEN_NULL (#11062)

Georgi Gerganov committed 1y ago

727368c

December 16, 2024

sampling : refactor + optimize penalties sampler (#10803)

Georgi Gerganov committed 1y ago

644fd71

November 25, 2024

speculative : refactor and add a simpler example (#10362)

Georgi Gerganov committed 1y ago

d9d54e4

October 10, 2024

common : use common_ prefix for common library functions (#9805)

Diego Devesa committed 1y ago

7eee341

September 15, 2024

common : reimplement logging (#9418)

Georgi Gerganov committed 1y ago

6262d13

September 13, 2024

llama : llama_perf + option to disable timings during decode (#9355)

Georgi Gerganov committed 1y ago

0abc6a2

September 9, 2024

common : move arg parser code to `arg.cpp` (#9388)

Xuan Son Nguyen committed 1y ago

bfe76d4

llama : minor sampling refactor (2) (#9386)

slaren committed 1y ago

5fb5e24

September 7, 2024

common : refactor arg parser (#9308)

Xuan Son Nguyen committed 1y ago

1b9ae51

llama : refactor sampling v2 (#9294)

Georgi Gerganov committed 1y ago

df270ef

July 17, 2024

batched: fix n_predict parameter (#8527)

Masaya, Kato committed 1y ago

da3913d

July 4, 2024

Inference support for T5 and FLAN-T5 model families (#5763)

fairydreaming committed 1y ago

807b0c4

June 4, 2024

common : refactor cli arg parsing (#7675)

Georgi Gerganov committed 1y ago

1442677

May 22, 2024

common : normalize naming style (#7462)

Georgi Gerganov committed 1y ago

6ff1398

April 21, 2024

llama : support Llama 3 HF conversion (#6745)

Pedro Cuenca committed 2y ago

b97bc39

March 22, 2024

metal : pad n_ctx by 32 (#6177)

Georgi Gerganov committed 2y ago

95d576b

March 11, 2024

llama : more consistent names of count variables (#5994)

Georgi Gerganov committed 2y ago

05b0621

March 8, 2024

llama : support Mamba Selective State Space Models (#5328)

compilade committed 2y ago

c2101a2

February 18, 2024

ggml, common, examples, tests : fixed type arguments in printf (#5528)

Herman Semenov committed 2y ago

5d3de51

February 16, 2024

ggml : add numa options (#5377)

bmwl committed 2y ago

f486f6e

January 8, 2024

examples : add passkey test (#3856)

Georgi Gerganov committed 2y ago

b0034d9

October 24, 2023

cuda : add batched cuBLAS GEMM for faster attention (#3749)

Georgi Gerganov committed 2y ago

2b4ea35

October 23, 2023

llama : remove token functions with `context` args in favor of `model` (#3720)

Marcus Dunn committed 2y ago

5be6c80

October 22, 2023

batched : add len CLI argument

Georgi Gerganov committed 2y ago

22c69a2

October 18, 2023

speculative : add tree-based sampling example (#3624)

Georgi Gerganov committed 2y ago

0e89203

October 11, 2023

batched : add bench tool (#3545)

Georgi Gerganov committed 2y ago

8c70a5f

September 28, 2023

llama.cpp : split llama_context_params into model and context params (#3301)

slaren committed 2y ago

16bc66d

llama : custom attention mask + parallel decoding + no context swaps (#3228)

Georgi Gerganov committed 2y ago

ec89379