🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Add voxtral (#39429)
* draft
* draft update (conversion working)
* mend
* draft update
* draft update: working generate
* refactor
* VoxtralProcessor draft
* processor update
* update convert_tekken_tokenizer
* refactor processor
* update convert
* make style
* better handle prefill
* make style
* add tests
* add mistral_common audio loading
* processor update
* revert changes
* audio utils update
* add audio to apply chat template mistral update
* voxtral processor update
* fix
* update conversion script
* make mistral tokenizer from pretrain work from local dir
* fix updates
* add integration tests
* add batched version
* processor docstring
* make style
* revert convert_tekken_tokenizer changes
* revert processing_qwen2.5 changes
* add multi-turn test
* processor improvements
* address review changes
* Update src/transformers/tokenization_mistral_common.py
  Co-authored-by: Julien Denize <[email protected]>
* update audio utils
* nits
* integration test update
* correct _support
* update tests
* test update
* update integration tests
* fix
* fix
* fix
* add test_apply_chat_template_with_audio
* add model doc
* model doc
* nit
* doc update
* nit
* processor improvement
* ensure default is 3B
* nits
* make
* make
* convert modular
* update checkpoint
* fix test
* make
* make
* autos
* make
* make
* nit
* nit
* nit
---------
Co-authored-by: Julien Denize <[email protected]>
Co-authored-by: Arthur <[email protected]>
eustlb committed
967045082faaaaf3d653bfe665080fd746b2bb60
Parent: 73869f2
Committed by GitHub <[email protected]>
on 7/18/2025, 12:02:04 AM