Fused Quantized GEMV Kernels for LLM Inference - 2.8x faster than cuBLAS