Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.

docs: update `optimizer_zero_grad` order, and the backward pass. (#21144)

GdoongMathew committed 7mo ago

06bed20190c2e428c00ef73c6aa70ab423b2a47a

Parent: da7f2f9

Committed by GitHub on 9/2/2025, 7:57:23 AM