prima.cpp: Speeding up 70B-scale LLM inference on low-resource everyday home clusters