🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
[Model] Add PaddleOCR-VL Model Support (#42178)
* init
* refactor
* update
* update
* fix unresolved problems
* fix how position_ids work with flash_attn_2
* add tests and fix code
* add model_doc
* update model_doc
* fix ci
* update docstring
* add tests
* update
* add **kwargs
* update
* update
* update
* reduce max_position_embeddings in tests
* update
zhang-prog committed
8c84144bfc7dd0c9c5e336a6d89c9dcee2efc2a8
Parent: 78b2992
Committed by GitHub
on 12/11/2025, 2:27:04 PM