A GPU-accelerated DNN inference serving system that supports instant kernel preemption and biased concurrent execution in GPU scheduling.
Stars: 39
Rank: 504743