[feat] Allow overriding optimizer_zero_grad and/or optimizer_step when using accumulate_grad_batches (#7980)
David Chan committed
c6e02e481eebaa48eda3877ab79a749e8635c500
Parent: eebdc91
Committed by GitHub <[email protected]>
on 6/17/2021, 10:50:37 AM
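The behavior this commit enables can be sketched as follows: when `accumulate_grad_batches` is set, the trainer only invokes `optimizer_step` and `optimizer_zero_grad` on accumulation boundaries, and user overrides of those hooks are now respected there too. This is a minimal illustrative sketch of that pattern in plain Python, not Lightning's actual internals; `ToyOptimizer` and `run_loop` are hypothetical stand-ins (Lightning's real hooks live on `LightningModule`).

```python
# Illustrative sketch (not Lightning's real code): the optimizer_step /
# optimizer_zero_grad hooks fire only every `accumulate_grad_batches`
# batches, and callers may swap in their own hook implementations.

class ToyOptimizer:
    """Hypothetical optimizer that just counts hook invocations."""
    def __init__(self):
        self.steps = 0
        self.zero_grads = 0
    def step(self):
        self.steps += 1
    def zero_grad(self):
        self.zero_grads += 1

def run_loop(num_batches, accumulate_grad_batches, optimizer,
             optimizer_step=None, optimizer_zero_grad=None):
    # Default hooks mirror the stock behavior; overrides replace them.
    step_hook = optimizer_step or (lambda opt, batch_idx: opt.step())
    zero_hook = optimizer_zero_grad or (lambda opt, batch_idx: opt.zero_grad())
    for batch_idx in range(num_batches):
        # backward() would accumulate gradients on every batch here.
        if (batch_idx + 1) % accumulate_grad_batches == 0:
            # Only on accumulation boundaries do the hooks run.
            step_hook(optimizer, batch_idx)
            zero_hook(optimizer, batch_idx)

opt = ToyOptimizer()
run_loop(num_batches=8, accumulate_grad_batches=4, optimizer=opt)
print(opt.steps, opt.zero_grads)  # → 2 2
```

With 8 batches and `accumulate_grad_batches=4`, the hooks fire twice (after batches 4 and 8); passing a custom `optimizer_step` or `optimizer_zero_grad` callable substitutes the user's logic at exactly those points.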