Efficient LLM query routing via multi-sampling. BEST-Route selects both the model and the number of responses to sample based on query difficulty, cutting costs by up to 60% with less than a 1% performance drop. From the paper: https://arxiv.org/abs/2506.22716
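To make the idea concrete, here is a minimal sketch of what jointly choosing a model and a sample count could look like. It is not taken from the paper or the repository: the `Candidate` type, the `route` function, the `max_quality_drop` parameter, and all cost and quality numbers are hypothetical, and the per-query quality predictions are assumed to come from some difficulty-aware estimator that is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One routing option: a model plus how many responses to sample from it."""
    model: str                # hypothetical model name
    n_samples: int            # number of responses to sample (best-of-n)
    cost: float               # assumed total cost of this option for the query
    predicted_quality: float  # assumed quality estimate for this query


def route(candidates, max_quality_drop=0.01):
    """Return the cheapest option whose predicted quality is within
    max_quality_drop of the best predicted quality among all options."""
    if not candidates:
        raise ValueError("need at least one candidate")
    best_quality = max(c.predicted_quality for c in candidates)
    acceptable = [
        c for c in candidates
        if c.predicted_quality >= best_quality - max_quality_drop
    ]
    return min(acceptable, key=lambda c: c.cost)


# Illustrative numbers only: sampling a small model several times is
# predicted to come close to a single response from a large model.
options = [
    Candidate("small-model", 1, cost=1.0, predicted_quality=0.70),
    Candidate("small-model", 5, cost=5.0, predicted_quality=0.895),
    Candidate("large-model", 1, cost=20.0, predicted_quality=0.90),
]
choice = route(options, max_quality_drop=0.01)
print(choice.model, choice.n_samples)
```

With these made-up numbers the sketch picks five samples from the small model, since that option stays within the allowed quality drop at a quarter of the large model's cost. Setting `max_quality_drop=0.0` would send the query to the large model instead.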