COMMITS
common/sampling.cpp
November 25, 2024
speculative : refactor and add a simpler example (#10362)
Georgi Gerganov committed
October 29, 2024
llama : remove Tail-Free sampling (#10071)
Georgi Gerganov committed
October 25, 2024
llama : add DRY sampler (#9702)
wwoodsTM committed
October 21, 2024
llama : default sampling changes + greedy update (#9897)
Georgi Gerganov committed
October 15, 2024
llama : add infill sampler (#9896)
Georgi Gerganov committed
sampling : add XTC sampler (#9742)
MaggotHATE committed
October 10, 2024
common : use common_ prefix for common library functions (#9805)
Diego Devesa committed
September 24, 2024
sampling : avoid expensive softmax during greedy sampling (#9605)
Georgi Gerganov committed
September 15, 2024
common : reimplement logging (#9418)
Georgi Gerganov committed
September 13, 2024
llama : llama_perf + option to disable timings during decode (#9355)
Georgi Gerganov committed
September 10, 2024
llama : move random seed generation to the samplers (#9398)
slaren committed
September 9, 2024
common : move arg parser code to `arg.cpp` (#9388)
Xuan Son Nguyen committed
September 7, 2024
llama : fix empty ring buffer push (#9358)
Georgi Gerganov committed
llama : refactor sampling v2 (#9294)
Georgi Gerganov committed
July 23, 2024
llama : move vocab, grammar and sampling into separate files (#8508)
Georgi Gerganov committed
July 8, 2024
common : preallocate sampling token data vector (#8363)
Kevin Wang committed
common : avoid unnecessary logits fetch (#8358)
Kevin Wang committed
June 25, 2024
llama : return nullptr from llama_grammar_init (#8093)
Daniel Bevenius committed
May 22, 2024
common : normalize naming style (#7462)
Georgi Gerganov committed
May 21, 2024
`grammars`: fix resampling logic regression (#7424)
Olivier Chafik committed
May 11, 2024
server: fix reported top tokens for temperature 0 (#7203)
Johannes Gäßler committed
May 7, 2024
server: fix incorrectly reported token probabilities (#7125)
Johannes Gäßler committed
April 29, 2024
sampling : use std::random_device{}() for default random seed (#6962)
David Renshaw committed
April 24, 2024
Server: fix seed for multiple slots (#6835)
Johannes Gäßler committed
March 24, 2024
sampling : deduplicated code for probability distribution access (#6240)
Minsoo Cheong committed
March 13, 2024
grammar : handle missing "root" node (#6004)
Clint Herron committed
March 4, 2024
speculative : implement stochastic speculative sampling (#5625)
Minsoo Cheong committed
February 25, 2024
server: tests - slow inference causes timeout on the CI (#5715)
Pierrick Hymbert committed
February 18, 2024
common, server : surface min_keep as its own parameter (#5567)
Robey Holderith committed
sampling : do not set min_keep to n_probs (#5564)
Georgi Gerganov committed
February 16, 2024
server : add "samplers" param to control the samplers order (#5494)
Alexey Parfenov committed
February 11, 2024
common : use enums for sampler types (#5418)
Alexey Parfenov committed
common : fix compile warning
Georgi Gerganov committed
February 8, 2024
sampling: fix top_k <= 0 (#5388)
Johannes Gäßler committed
January 27, 2024
Remove unused data and add fixes (#5154)
Michael Klimenko committed
January 25, 2024
llama : dynamic temperature sampling (#4972)
l3utterfly committed
January 15, 2024
llama : apply classifier-free guidance to logits directly (#4951)
David Friehs committed
December 23, 2023
server : allow to specify custom prompt for penalty calculation (#3727)
Alexey Parfenov committed
grammar : check the full vocab only if necessary (opt) (#4306)
kalomaze committed
December 6, 2023
common : fix compile warning
Georgi Gerganov committed
December 5, 2023
sampling : custom samplers order (#4285)
MaggotHATE committed
November 1, 2023
sampling : null grammar field after reset (#3885)
l3utterfly committed
October 31, 2023
samplers : Min-P sampler implementation [alternative to Top P/Top K] (#3841)
kalomaze committed
October 28, 2023
llama : add option for greedy sampling with probs (#3813)
Georgi Gerganov committed
October 23, 2023
llama : remove token functions with `context` args in favor of `model` (#3720)
Marcus Dunn committed
October 20, 2023
sampling : refactor init to use llama_sampling_params (#3696)
Georgi Gerganov committed
October 18, 2023
speculative : add tree-based sampling example (#3624)
Georgi Gerganov committed
October 11, 2023
common : fix mirostat state when using multiple sequences (#3543)
Kerfuffle committed