Making large AI models cheaper, faster and more accessible
[Gemini] add some code for reduce-scatter overlap, chunk prefetch in llama benchmark. (#5751)
* [bugs] fix args.profile=False DummyProfiler error
* add args.prefetch_num for benchmark
Haze188 committed
4d097def9637a67629a988c269093c46ac3e7cbf
Parent: ca67454
Committed by GitHub <noreply@github.com>
on 5/25/2024, 3:00:13 PM