Deploy a complete, self-hosted AI stack on your own server with one command. Includes Ollama (LLM runtime), AnythingLLM (chat UI), LiteLLM (AI gateway), Whisper (speech-to-text), Kokoro (text-to-speech), an embeddings service (for RAG), and an MCP Gateway. Most services run locally; LiteLLM can optionally route requests to external providers. Supports NVIDIA GPU acceleration via CUDA.
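
Once the stack is up, the local LLM can be queried over HTTP. The following is a minimal sketch, not part of this project: it assumes Ollama is reachable on its default port (11434), which this stack may map differently, and it uses a placeholder model name (`llama3`) that may not match what is actually installed. The helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama listens on its default port, 11434.
# The deployed stack may expose it on a different host or port.
OLLAMA_URL = "http://localhost:11434"


def build_generate_request(prompt, model="llama3", base_url=OLLAMA_URL):
    """Build (but do not send) a POST request for Ollama's /api/generate.

    The model name is a placeholder; use one that has been pulled locally.
    """
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="llama3", base_url=OLLAMA_URL, timeout=60):
    """Send the request and return the generated text from the JSON reply."""
    req = build_generate_request(prompt, model, base_url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Say hello in one sentence.")` should return the model's reply as a string, provided the service is running and the named model is available.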