Fast fp16-fp8 mixed precision matmul on RDNA3/3.5 GPUs without native fp8
Stars: 19
Rank: 1037327