Making large AI models cheaper, faster and more accessible
[Inference/Feat] Add kvcache quant support for fused_rotary_embedding_cache_copy (#5680)
傅剑寒 committed
ef8e4ffe310bfe21f83feb965d962d816d75bc88
Parent: 5cd75ce
Committed by GitHub <noreply@github.com> on 4/30/2024, 10:33:53 AM