
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.


Support loading Quark quantized models in Transformers (#36372)

* add quark quantizer

* add quark doc

* clean up doc

* fix tests

* make style

* more style fixes

* cleanup imports

* cleaning

* precise install

* Update docs/source/en/quantization/quark.md

Co-authored-by: Marc Sun <[email protected]>

* Update tests/quantization/quark_integration/test_quark.py

Co-authored-by: Marc Sun <[email protected]>

* Update src/transformers/utils/quantization_config.py

Co-authored-by: Marc Sun <[email protected]>

* remove import guard as suggested

* update copyright headers

* add quark to transformers-quantization-latest-gpu Dockerfile

* make tests pass on transformers main + quark==0.7

* add missing F8_E4M3 and F8_E5M2 keys from str_to_torch_dtype

---------

Co-authored-by: Marc Sun <[email protected]>
Co-authored-by: Bowen Bao <[email protected]>
Co-authored-by: Mohamed Mekkouri <[email protected]>
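The commit above wires a new quantization backend into transformers by reading the `quant_method` field of a checkpoint's `quantization_config` and dispatching to a matching quantizer class. As a rough, self-contained sketch of that dispatch pattern (the names `QuarkHfQuantizer`, `QUANTIZER_REGISTRY`, and `get_quantizer` here are hypothetical stand-ins, not the actual transformers API):

```python
# Hypothetical sketch of quant_method-based quantizer dispatch,
# mirroring the pattern transformers uses for backends like Quark.
# None of these names are the real transformers/Quark API.

class QuarkHfQuantizer:
    """Stand-in for a quantizer wrapping Quark's dequantization support."""
    # Quark checkpoints ship pre-quantized weights, so no on-the-fly
    # calibration is performed at load time (assumption for this sketch).
    requires_calibration = True

    def __init__(self, quantization_config: dict):
        self.quantization_config = quantization_config


# Registry mapping the `quant_method` string found in a checkpoint's
# config.json to the quantizer class that can load it.
QUANTIZER_REGISTRY = {"quark": QuarkHfQuantizer}


def get_quantizer(quantization_config: dict):
    """Resolve a quantizer instance from a checkpoint's quantization_config."""
    method = quantization_config["quant_method"]
    if method not in QUANTIZER_REGISTRY:
        raise ValueError(f"Unknown quantization method: {method!r}")
    return QUANTIZER_REGISTRY[method](quantization_config)


quantizer = get_quantizer({"quant_method": "quark"})
print(type(quantizer).__name__)  # QuarkHfQuantizer
```

With this pattern, loading a Quark-quantized checkpoint needs no user-side configuration: the `quant_method` stored alongside the weights selects the backend automatically, which is why the end-user entry point remains a plain `from_pretrained` call.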
fxmarty-amd committed
1a374799cedcf7a7df14256226a7576c324a42fa
Parent: ce091b1
Committed by GitHub <[email protected]> on 3/20/2025, 2:40:51 PM