Making large AI models cheaper, faster and more accessible
[Shardformer] change qwen2 modeling into gradient checkpointing style (#5874)
Jianghai committed
8ab46b4000d36c76cde93c6bb553411e815980fb
Parent: 416580b
Committed by GitHub <noreply@github.com> on 7/1/2024, 8:45:09 AM