A lightweight GenAI workload management tool that optimizes GPU utilization and job makespan, supports job scaling and GPU sharing, and ensures SLA adherence across diverse GenAI workloads, without relying on Kubernetes or Kueue.
View it on GitHub