Gitstar Ranking
L1aoXingyu / marlin
Fetched on 2025/03/15 09:10
FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.
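The ~4x figure in the description can be read as a memory-bandwidth estimate: at small batch sizes a GEMM against a large weight matrix is dominated by reading the weights, and 4-bit weights move roughly a quarter of the bytes that 16-bit weights do. The Python sketch below is illustrative only and is not taken from the repository; the layer shape and the assumption that the operation is fully memory-bound are hypothetical.

```python
# Illustrative estimate (not from the repo): why FP16xINT4 can approach ~4x
# speedup at small batch sizes. Runtime of a memory-bound GEMM y = x @ W is
# roughly proportional to the bytes moved, which are dominated by W.

def est_bytes(batch, k, n, weight_bits):
    """Rough bytes moved for y = x @ W with FP16 activations and outputs."""
    weights = k * n * weight_bits / 8   # weight matrix at the given precision
    activations = batch * k * 2         # FP16 input x
    outputs = batch * n * 2             # FP16 output y
    return weights + activations + outputs

batch, k, n = 16, 4096, 4096            # hypothetical layer shape
fp16 = est_bytes(batch, k, n, 16)
int4 = est_bytes(batch, k, n, 4)
print(f"estimated memory-bound speedup: {fp16 / int4:.2f}x")  # ~3.9x
```

As the batch grows, activation traffic and compute become a larger share of the cost, which is why the near-ideal speedup is stated only up to batch sizes of about 16-32 tokens.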
Stars: 0
Rank: 12125866