A hardware-agnostic deployment of a text-to-image app (Stable Diffusion 2.1) and a text-generation app (Llama 3) on EKS that runs on both NVIDIA GPUs and AWS Inferentia accelerators. Kubernetes Ingress controls it at routing time, Karpenter at scheduling time, and KEDA scales it.