Fix: `no_grad` with AMP bug (#20921)
* Disable cache for torch.autocast in amp
* Add a test
* Only test for bf16-mixed
* Implement test to reproduce the issue

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jirka Borovec <[email protected]>
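The first bullet refers to `torch.autocast`'s weight-cast cache: casted copies of parameters are cached for reuse within an autocast region. A minimal sketch of the interaction this commit addresses (assuming a plain CPU bfloat16 setup; the model, tensors, and control flow below are illustrative, not code from the patch):

```python
import torch

# With autocast's cache enabled, a weight first casted inside a no_grad
# forward can be cached without autograd history and then silently reused
# by a later grad-enabled forward, breaking the backward pass.
# Passing cache_enabled=False to torch.autocast forces a fresh cast each time.
model = torch.nn.Linear(4, 4)
x = torch.randn(2, 4)

with torch.autocast(device_type="cpu", dtype=torch.bfloat16, cache_enabled=False):
    with torch.no_grad():
        _ = model(x)   # inference-style pass inside the autocast region
    out = model(x)     # grad-enabled pass; weights are re-cast, so autograd tracks them
    out.sum().backward()

assert model.weight.grad is not None  # gradients flow as expected
```

This roughly mirrors the bf16-mixed scenario the added test targets: a `no_grad` block nested inside an autocast region, followed by a backward pass.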
Bas Krahmer committed 216f9ec90c5bf3554f7cf484accee325f2a15440
Parent: c2564a7
Committed by GitHub <[email protected]> on 9/2/2025, 9:34:18 PM