Making large AI models cheaper, faster and more accessible
[Inference]Adapt repetition_penalty and no_repeat_ngram_size (#5708)
* Adapt repetition_penalty and no_repeat_ngram_size
* fix no_repeat_ngram_size_logit_process
* remove batch_updated
* fix annotation
* modified codes based on the review feedback.
* rm get_batch_token_ids
yuehuayingxueluo committed
de4bf3dedf2c7cb7ba6c3044745bab3c3ef6352d
Parent: 50104ab
Committed by GitHub <noreply@github.com>
on 5/11/2024, 7:13:25 AM
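
For readers unfamiliar with the two sampling options named in the commit title, the following is a minimal pure-Python sketch of how `repetition_penalty` and `no_repeat_ngram_size` are commonly applied to next-token logits. It is not the code from this commit: the function names `apply_repetition_penalty`, `banned_tokens_for_ngram`, and `apply_no_repeat_ngram` are hypothetical, and it assumes the widely used convention of dividing positive logits (and multiplying negative ones) by the penalty, and of masking any token that would complete an n-gram already present in the generated sequence.

```python
from typing import List, Set


def apply_repetition_penalty(logits: List[float], generated: List[int], penalty: float) -> List[float]:
    """Make already-generated tokens less likely (assumed convention, not this commit's code)."""
    out = list(logits)
    for tok in set(generated):
        # Positive logits shrink, negative logits grow more negative.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def banned_tokens_for_ngram(generated: List[int], n: int) -> Set[int]:
    """Return tokens that would repeat an n-gram already present in `generated`."""
    if n <= 0 or len(generated) < n:
        return set()
    # The last n-1 tokens form the prefix of the n-gram about to be completed.
    prefix = tuple(generated[len(generated) - (n - 1):]) if n > 1 else ()
    banned: Set[int] = set()
    for i in range(len(generated) - n + 1):
        if tuple(generated[i:i + n - 1]) == prefix:
            banned.add(generated[i + n - 1])
    return banned


def apply_no_repeat_ngram(logits: List[float], generated: List[int], n: int) -> List[float]:
    """Set the logit of every banned token to negative infinity."""
    out = list(logits)
    for tok in banned_tokens_for_ngram(generated, n):
        out[tok] = float("-inf")
    return out


if __name__ == "__main__":
    history = [1, 2, 3, 1, 2]
    step_logits = [0.5, 1.0, -0.5, 2.0]
    penalized = apply_repetition_penalty(step_logits, history, penalty=2.0)
    print(apply_no_repeat_ngram(penalized, history, n=3))
```

In this sketch the history `[1, 2, 3, 1, 2]` with `n=3` bans token `3`, because the trigram `(1, 2, 3)` has already occurred. A batched inference engine would typically do the same work on tensors per sequence; the list-based version here only illustrates the logic.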