🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Restore cuda graphs to continuous batching (#41421)
* Type hints and small fixes
* Remove unused params
* Made slice inputs the default
* ruffed
* Updated some var name and moved index slicing
* Logging arg in example
* Added some padding debug var and reformat out cg
* First working CG, fixed size
* Working flexible CG
* CG are compatible with all implementations
* Fixed CG API
* Update example
* Documentation
* Fix padding tokens in FA
* Review compliance
* Better doc around weird bug
* Style
* Fix for sliding with CG
Rémi Ouazan committed
cf1e9834ec7339f4c605ba96d9c4e5cf59594cad
Parent: 6c901bd
Committed by GitHub <[email protected]>
on 10/13/2025, 9:57:56 AM