COMMITS
examples/embedding/embedding.cpp
March 31, 2026
common : move up common_init() and fix Windows UTF-8 logs (#21176)
Adrien Gallouët committed
March 4, 2026
Fix locale-dependent float printing in GGUF metadata (#17331)
SamareshSingh committed
January 5, 2026
model : add LFM2-ColBert-350M (#18607)
Tarek Dakhran committed
December 14, 2025
common : refactor common_sampler + grammar logic changes (#17937)
Georgi Gerganov committed
November 28, 2025
October 28, 2025
embedding: add raw option for --embd-output-format (#16541)
Sam Malayek committed
September 25, 2025
llama : add support for qwen3 reranker (#15824)
Douglas Hanley committed
July 30, 2025
tests : update for LLAMA_SET_ROWS=1 (#14961)
Georgi Gerganov committed
July 16, 2025
llama : add high-throughput mode (#14363)
Georgi Gerganov committed
June 20, 2025
llama : improve sep token handling (#14272)
Sigbjørn Skjæret committed
June 6, 2025
llama : deprecate llama_kv_self_ API (#14030)
Georgi Gerganov committed
llama : support multiple classifier outputs and labels (#13940)
Sigbjørn Skjæret committed
May 26, 2025
examples : allow extracting embeddings from decoder contexts (#13797)
Georgi Gerganov committed
May 8, 2025
context : allow cache-less context for embeddings (#13108)
Georgi Gerganov committed
April 24, 2025
embeddings : fix batch sizes (#13076)
Georgi Gerganov committed
March 13, 2025
llama : refactor llama_context, llama_kv_cache, llm_build_context (#12181)
Georgi Gerganov committed
March 4, 2025
ggml : portability fixes for VS 2017 (#12150)
mgroeber9110 committed
January 12, 2025
llama : add `llama_vocab`, functions -> methods, naming (#11110)
Georgi Gerganov committed
January 3, 2025
llama : refactor `src/llama.cpp` (#10902)
Georgi Gerganov committed
October 10, 2024
common : use common_ prefix for common library functions (#9805)
Diego Devesa committed
September 28, 2024
llama : add reranking support (#9510)
Georgi Gerganov committed
September 15, 2024
common : reimplement logging (#9418)
Georgi Gerganov committed
September 13, 2024
llama : llama_perf + option to disable timings during decode (#9355)
Georgi Gerganov committed
September 10, 2024
llama : move random seed generation to the samplers (#9398)
slaren committed
September 9, 2024
common : move arg parser code to `arg.cpp` (#9388)
Xuan Son Nguyen committed
September 7, 2024
common : refactor arg parser (#9308)
Xuan Son Nguyen committed
llama : refactor sampling v2 (#9294)
Georgi Gerganov committed
August 10, 2024
Add support for encoder-only T5 models (#8900)
fairydreaming committed
August 5, 2024
common : Changed tuple to struct (TODO fix) (#8823)
Liu Jia committed
June 24, 2024
embedding : more cli arguments (#7458)
Yann Follet committed
June 21, 2024
llama : allow pooled embeddings on any model (#7477)
Douglas Hanley committed
June 4, 2024
common : refactor cli arg parsing (#7675)
Georgi Gerganov committed
May 22, 2024
common : normalize naming style (#7462)
Georgi Gerganov committed
May 15, 2024
embedding : free the batch after execution (#7297)
dm4 committed
May 11, 2024
llama : add Jina Embeddings architecture (#6826)
Joan Fontanals committed
April 9, 2024
BERT tokenizer fixes (#6498)
Jared Van Bortel committed
March 27, 2024
embedding : show full embedding for single prompt (#6342)
howlger committed
March 26, 2024
embedding : adjust `n_ubatch` value (#6296)
Minsoo Cheong committed
March 14, 2024
embedding : add EOS token if not present (#899)
Georgi Gerganov committed
embedding : print all resulting embeddings (#899)
Georgi Gerganov committed
embedding : print cosine similarity (#899)
Georgi Gerganov committed
March 13, 2024
llama : add pipeline parallelism support (#6017)
slaren committed
March 9, 2024
server : normalize embeddings (#5956)
SeungWon Jeong committed
March 4, 2024
llama : fix embeddings (#5796)
Georgi Gerganov committed
February 16, 2024
ggml : add numa options (#5377)
bmwl committed
February 13, 2024
llama : support batched embeddings (#5466)
Douglas Hanley committed
February 11, 2024
Add support for BERT embedding models (#5423)
Douglas Hanley committed
November 2, 2023
build : link against build info instead of compiling against it (#3879)
cebtenzzre committed
September 28, 2023
llama : custom attention mask + parallel decoding + no context swaps (#3228)
Georgi Gerganov committed