Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
COMMITS
May 30, 2024
Update code owners file (#19922)
awaelchli committed
May 28, 2024
Add Studio badge to tensor parallel docs (#19913)
awaelchli committed
May 23, 2024
Error for unsupported precision types with ModelParallelStrategy (#19902)
awaelchli committed
(10/10) Support 2D Parallelism - Port Fabric docs to PL (#19899)
awaelchli committed
[TPU] Fix test assertion error from artifacts (#19825)
awaelchli committed
docs: prune unused `linkcode` (#19897)
Jirka Borovec committed
May 22, 2024
(9/n) Support 2D Parallelism - Remaining Checkpoint Logic (#19888)
awaelchli committed
docs: fix link to CLIP (#19896)
Jirka Borovec committed
(8/n) Support 2D Parallelism - 2D Parallel Fabric Docs (#19887)
awaelchli committed
Remove the requirement for FSDPStrategy subclasses to only support GPU (#19894)
awaelchli committed
(7/n) Support 2D Parallelism - TP Fabric Docs (#19884)
awaelchli committed
May 21, 2024
Update `LearningRateMonitor` docs and tests for `log_weight_decay` (#19805)
Gilles Peiffer committed
May 20, 2024
Enable loss-parallel in example (#19882)
awaelchli committed
Remove redundant code to set the device on the LightningModule (#19877)
awaelchli committed
May 19, 2024
[App] Extend retry to 4xx except 400, 401, 403, 404 (#19842)
Luca Antiga committed
(6/n) Support 2D Parallelism - Trainer example (#19879)
awaelchli committed
May 17, 2024
(5/n) Support 2D Parallelism in Lightning Trainer (#19878)
awaelchli committed
(4/n) Support 2D Parallelism - Loading optimizer states correctly (#19872)
awaelchli committed
May 15, 2024
(2/n) Support 2D Parallelism - Distributed Checkpoints (#19852)
awaelchli committed
May 9, 2024
Update Lightning Cloud 0.5.69 (#19857)
thomas chaton committed
Reduce queue fetching (#19856)
thomas chaton committed
May 8, 2024
Add function to explicitly mark forward methods in Fabric (#19690)
awaelchli committed
May 7, 2024
(1/n) Support 2D Parallelism (#19846)
awaelchli committed
May 1, 2024
bump lightning cloud
Adrian Wälchli committed
xfail tests for deprecated functionality
Luca Antiga committed
Fix formatting
Luca Antiga committed
Make sure the HTTP client for queues retries for POST and 5xx
Luca Antiga committed
April 29, 2024
Fix TensorBoardLogger test on Windows (#19824)
Adrian Wälchli committed