A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
Stars: 3740
Rank: 10721