SIGN IN SIGN UP
hpcaitech / ColossalAI UNCLAIMED

Making large AI models cheaper, faster and more accessible

0 0 38 Python

[shardformer] support sharded optimizer checkpointIO of HybridParallelPlugin (#4540)

* implement sharded optimizer saving

* add more param info

* finish implementation of sharded optimizer saving

* fix bugs in optimizer sharded saving

* add pp+zero test

* param group loading

* greedy loading of optimizer

* fix bug when loading

* implement optimizer sharded saving

* add optimizer test & arrange checkpointIO utils

* fix gemini sharding state_dict

* add verbose option

* add loading of master params

* fix typehint

* fix master/working mapping in fp16 amp
B
Baizhou Zhang committed
c9625dbb6364c10f21828b30bc58e8fbcf22a900
Parent: 2c787d7
Committed by GitHub <noreply@github.com> on 8/31/2023, 6:50:47 AM