IBM/Anchor-Selection - Gitstar Ranking

IBM

Fetched on 2026/05/31 09:45

Code for the paper 'Mediocrity is the key for LLM as a Judge Anchor Selection'. This project enables systematic pairwise evaluation of multiple models on the Arena-hard and AlpacaEval datasets and examines the effect of the chosen 'anchor', i.e., the model against which all other evaluated models are compared.

Star

Rank

6120052
