Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Tuner cleanup on error (#21162)
* Make sure temp checkpoints are cleaned up on failed tuning
* add testing
* changelog

---------

Co-authored-by: Jirka Borovec <[email protected]>
(cherry picked from commit e1e2534d32cad1d2bad87f5b03da857108773bba)
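
The cleanup-on-failure behavior named in the message can be illustrated with a minimal sketch. This is not the code from the commit: `run_with_temp_checkpoint`, `tune_fn`, `save_fn`, and the `.tuner_temp_*.ckpt` file name are hypothetical names chosen for illustration, and the sketch assumes the temporary checkpoint is a single file on local disk.

```python
import os
import uuid


def run_with_temp_checkpoint(tune_fn, save_fn, ckpt_dir):
    """Hypothetical helper: run tune_fn and always delete the temp checkpoint."""
    # Unique file name so concurrent runs in the same directory do not collide.
    ckpt_path = os.path.join(ckpt_dir, f".tuner_temp_{uuid.uuid4().hex}.ckpt")
    save_fn(ckpt_path)  # snapshot state before tuning starts
    try:
        return tune_fn()
    finally:
        # Runs whether tune_fn returned or raised, so a failed
        # tuning run does not leave the temporary file behind.
        if os.path.exists(ckpt_path):
            os.remove(ckpt_path)
```

Putting the removal in `finally` rather than after the call is the point of the pattern: the original exception still propagates to the caller, but the temporary file is gone by the time it does.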
Nicki Skafte Detlefsen committed
17d2d784070038688ec0f5885c22523e5923b635
Parent: 509f562
Committed by Luca Antiga <[email protected]>
on 9/5/2025, 1:14:02 PM