hpcaitech / ColossalAI

Making large AI models cheaper, faster and more accessible

[Infer] Serving example w/ ray-serve (multiple GPU case) (#4841)

* fix imports

* add ray-serve with Colossal-Infer tp

* trivial: send requests script

* add README

* fix worker port

* fix readme

* use app builder and autoscaling

* trivial: input args

* clean code; revise readme

* testci (skip example test)

* use auto model/tokenizer

* revert imports fix (fixed in other PRs)
Yuanheng Zhao committed
573f270537ac1ec1de4aef80a8f74b42af89db35
Parent: 3a74eb4
Committed by GitHub <noreply@github.com> on 10/2/2023, 9:48:38 AM