Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.

Internal Refactor: Reroute Implementations (#21354)

* forward xla impl

* forward logger implementation

* forward logger implementation: mlflow

* update neptune logger

* forward kubeflow implementation

* forward lsf env

* move torchelastic

* update xla env

* forward bitsandbytes

* forward deepspeed precision

* forward transformer engine

* forward XLA precision

* forward deepspeed strategy fabric

* integrate xla strategies

* update pytorch deepspeed precision

* forward trainer xla single device

* XLA ddp trainer

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update fabric tests

* fabric tests

* tests

* update version

* update

* update

* update

* update

* update

* update

* fix doc issue

* fix mypy issue

* fix readthedocs and ci cpu tests

* update

* update

* update

* update

* update

* update

* fix deepspeed assertion

* update

* fix transformer engine mock

* update

* logger mocks

* add tpu mocks

* update

* update

* update

* update

* fix docmake

* update

* update

* fix loggers error

* update

* update

* update

* update

* pin cuda version

* update

* try with removing libnccl downloading

* undo cuda pinning

* update

* update

* correctly handle model property

* update error types and add property forwarding

* update

* update

* update

* meow meow

* claymore!!!

* remove todo

* remove todos + version

* retrigger-ci to fix ple release issue

* fix mocks xla

---------

Co-authored-by: Deependu Jha <[email protected]>
Co-authored-by: Bhimraj Yadav <[email protected]>
Justus Schock committed
9a10959f255a3a1700da525114c1f1070fba5ded
Parent: 8ac4843
Committed by GitHub <[email protected]> on 11/21/2025, 11:54:12 AM