0xSero/turboquant

TurboQuant: Near-optimal KV cache quantization for LLM inference (3-bit keys, 2-bit values) with Triton kernels + vLLM integration

Quick setup

Get started by creating a new file or uploading an existing file. We recommend every repository include a README, LICENSE, and .gitignore.

https://gitmorph.com/0xSero/turboquant.git

CREATE A NEW REPOSITORY ON THE COMMAND LINE

touch README.md
git init
git checkout -b main
git add README.md
git commit -m "first commit"
git remote add origin https://gitmorph.com/0xSero/turboquant.git
git push -u origin main
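
The quick-setup note recommends a LICENSE and a .gitignore alongside the README, while the commands above create only README.md. A minimal sketch of staging all three files follows. It works in a throwaway directory and does not push anything. The .gitignore patterns are assumptions based on the project being Python-based (Triton kernels, vLLM), and the empty LICENSE is a placeholder, not the project's actual license.

```shell
set -eu

# Work in a scratch directory so nothing in an existing checkout is touched.
workdir="$(mktemp -d)"
cd "$workdir"
git init -q

# Placeholder files; real content (license text, project description) must be added by hand.
touch README.md LICENSE

# Hypothetical ignore patterns for a Python project; adjust to the actual toolchain.
printf '%s\n' '__pycache__/' '*.pyc' '.venv/' > .gitignore

# Stage all three files and show what would be committed.
git add README.md LICENSE .gitignore
git status --short
```

From here the commit, `git remote add origin`, and `git push -u origin main` steps shown above apply unchanged.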

PUSH AN EXISTING REPOSITORY

git remote add origin https://gitmorph.com/0xSero/turboquant.git
git push -u origin main