🤗 Transformers: the model-definition framework for state-of-the-art machine learning in text, vision, audio, and multimodal domains, for both inference and training.
Adding gradient checkpointing to GPT2 (#7446)
* GPT2 gradient checkpointing
* find_unused_parameters removed if checkpointing
* Update src/transformers/configuration_gpt2.py
* Added a test for generation with checkpointing
* Update src/transformers/configuration_gpt2.py

Co-authored-by: Patrick von Platen <[email protected]>
Co-authored-by: Sylvain Gugger <[email protected]>
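Gradient checkpointing trades compute for memory: intermediate activations of a block are discarded during the forward pass and recomputed during backward, which is what this commit wires into GPT2 (exposed, per the diff, via a flag in `configuration_gpt2.py`). A minimal PyTorch sketch of the underlying mechanism, using `torch.utils.checkpoint.checkpoint` on a stand-in block (the `block` module here is hypothetical, not the actual HF `GPT2Block`):

```python
import torch
from torch.utils.checkpoint import checkpoint

torch.manual_seed(0)

# Stand-in for one transformer block (hypothetical, for illustration only).
block = torch.nn.Sequential(
    torch.nn.Linear(16, 64),
    torch.nn.GELU(),
    torch.nn.Linear(64, 16),
)

x = torch.randn(4, 16, requires_grad=True)

# Plain forward: activations inside `block` are kept for backward.
block(x).sum().backward()
grad_plain = x.grad.clone()

x.grad = None
# Checkpointed forward: activations are dropped and recomputed in backward,
# saving memory at the cost of a second forward pass through `block`.
checkpoint(block, x, use_reentrant=False).sum().backward()
grad_ckpt = x.grad.clone()

# Recomputation is deterministic, so the gradients agree.
print(torch.allclose(grad_plain, grad_ckpt))
```

In the Transformers API of this era, the feature was enabled through the model config (e.g. `GPT2Config(gradient_checkpointing=True)`); the mention of `find_unused_parameters` in the commit body refers to disabling that DDP option, since parameters recomputed under checkpointing would otherwise be misreported as unused.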
Teven committed 9e9a1fb8c75e2ef00fea9c4c0dc511fc0178081c
Parent: 52e8392
Committed by GitHub <[email protected]> on 9/29/2020, 4:26:26 PM