A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling.