sampling : avoid expensive softmax during greedy sampling (#9605)
* sampling : avoid expensive softmax during greedy sampling

ggml-ci

* speculative : fix default RNG seed + set sparams.n_probs

* Update tests/test-sampling.cpp

Co-authored-by: slaren <[email protected]>

* sampling : add clarifying comment [no ci]

---------

Co-authored-by: slaren <[email protected]>
Georgi Gerganov committed
b0f27361f3539a81d983a8b045f3c61e682d9fc0
Parent: c087b6f
Committed by GitHub <[email protected]>
on 9/24/2024, 6:03:17 AM
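
The idea named in the commit title can be illustrated with a minimal, standalone sketch. Softmax is strictly increasing in each logit, so the token with the largest logit is also the token with the largest probability, and a greedy pick can skip the exponentials and the normalization entirely. The helpers `greedy_from_logits`, `softmax`, and `greedy_from_probs` below are hypothetical names for illustration only; they are not the llama.cpp API and are not taken from this commit's diff.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper (not llama.cpp code): greedy pick directly from raw logits.
// One pass over the logits, no exp() calls and no normalization.
std::size_t greedy_from_logits(const std::vector<float> & logits) {
    assert(!logits.empty());
    const auto it = std::max_element(logits.begin(), logits.end());
    return static_cast<std::size_t>(std::distance(logits.begin(), it));
}

// Reference path for comparison: numerically stable softmax over all logits.
std::vector<float> softmax(const std::vector<float> & logits) {
    assert(!logits.empty());
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit); // subtract the max to avoid overflow
        sum += probs[i];
    }
    for (float & p : probs) {
        p /= sum;
    }
    return probs;
}

// Reference path for comparison: softmax first, then argmax over the probabilities.
std::size_t greedy_from_probs(const std::vector<float> & logits) {
    const std::vector<float> probs = softmax(logits);
    const auto it = std::max_element(probs.begin(), probs.end());
    return static_cast<std::size_t>(std::distance(probs.begin(), it));
}
```

For well-separated logits both paths select the same index; they can differ only on exact ties or when rounding makes two probabilities equal. The sketch assumes the caller needs only the selected token. If per-token probabilities are also required, the softmax path is still needed.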