TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration
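The wiki itself does not yet describe TurboQuant's algorithm, so as a rough orientation here is a minimal, purely illustrative sketch of what low-bit KV cache quantization looks like in general: per-token uniform quantization of the key cache to 3 bits and the value cache to 2 bits. All function names and parameters below are hypothetical and are not TurboQuant's actual API, kernels, or (near-optimal) quantization scheme.

```python
# Illustrative sketch only: generic per-token uniform (asymmetric) quantization
# of KV cache tensors to low bit widths. Not TurboQuant's actual method.
import torch


def quantize_uniform(x: torch.Tensor, n_bits: int):
    """Quantize the last dim of `x` to `n_bits` with a per-token scale/zero-point."""
    qmax = (1 << n_bits) - 1
    x_min = x.amin(dim=-1, keepdim=True)
    x_max = x.amax(dim=-1, keepdim=True)
    scale = (x_max - x_min).clamp(min=1e-8) / qmax
    zero_point = x_min
    q = torch.round((x - zero_point) / scale).clamp(0, qmax).to(torch.uint8)
    return q, scale, zero_point


def dequantize_uniform(q: torch.Tensor, scale: torch.Tensor, zero_point: torch.Tensor):
    """Reconstruct an approximate float tensor from the quantized codes."""
    return q.to(scale.dtype) * scale + zero_point


if __name__ == "__main__":
    # Toy KV cache layout: (batch, heads, seq_len, head_dim)
    keys = torch.randn(1, 8, 128, 64)
    values = torch.randn(1, 8, 128, 64)

    k_q, k_s, k_z = quantize_uniform(keys, n_bits=3)    # 3-bit keys
    v_q, v_s, v_z = quantize_uniform(values, n_bits=2)  # 2-bit values

    k_hat = dequantize_uniform(k_q, k_s, k_z)
    v_hat = dequantize_uniform(v_q, v_s, v_z)
    print("key MSE:", (keys - k_hat).pow(2).mean().item())
    print("value MSE:", (values - v_hat).pow(2).mean().item())
```

In a real deployment the codes would be bit-packed and dequantized inside fused attention kernels (e.g. Triton) rather than materialized as uint8 tensors; this snippet only shows the quantize/dequantize round trip and the asymmetric bit budget for keys versus values.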