A modern framework for performance and quality testing of Large Language Models (LLMs): run automated JMeter load tests, capture token-level metrics, analyze response quality with DeepEval, and benchmark your LLMs against industry leaderboards, all with easy integration for Retrieval-Augmented Generation (RAG) and customizable datasets.