Block scaling for fp8 quantization matrix multiplication. Solution to GPU mode AMD challenge. Additionally, this repo includes codes for quantizing Pytorch bf16 matmul with fp8. - View it on GitHub
Star
0
Rank
13805525