Running large language models like OPT-175B/GPT-3 on a single GPU. Up to 100x faster than other offloading systems.