This repo presents resilience patterns for scaling generative AI inference workloads on AWS: Amazon Bedrock cross-Region inference, AWS account sharding, and intelligent routing with LLM gateways.
Stars: 3, Rank: 3099524