Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Use fs.pipe() for S3/GCS checkpoint uploads in _atomic_save (#21595)
* Use fs.pipe() for S3/GCS checkpoint uploads in _atomic_save

  Replace fs.open() + f.write() with fs.pipe() for S3 and GCS filesystems, enabling parallel multipart uploads. This gives a 4-5x throughput improvement for checkpoints >= 500 MB.

  Azure is excluded because adlfs stages blocks sequentially, making fs.pipe() slower than f.write().

  Fixes #21499

  Signed-off-by: c-pozzi <corina.rios@gmail.com>

* Use isinstance check for Azure exclusion in _atomic_save

  Use module_available + isinstance instead of string comparison to detect AzureBlobFileSystem. Fix test mocking to patch module_available so the isinstance check works without adlfs installed.

  Signed-off-by: c-pozzi <corina.rios@gmail.com>

* Fix ruff SIM117: combine nested with statements in tests

  Signed-off-by: c-pozzi <corina.rios@gmail.com>

---------

Signed-off-by: c-pozzi <corina.rios@gmail.com>
c-pozzi committed
6805188c711094339e76d81668a4037dd56f7c7a
Parent: 7983ecb
Committed by GitHub <noreply@github.com>
on 3/21/2026, 11:02:24 AM