Making large AI models cheaper, faster and more accessible
[npu] add npu support for hybrid plugin and llama (#5090)
* llama 3d
* update
* fix autocast
Xuanlei Zhao committed
3acbf6d4968e0559629f0d6d317e5bac41ad5df0
Parent: aae4966
Committed by GitHub <noreply@github.com>
on 11/22/2023, 11:23:21 AM