Aligning with Human Judgement: The Role of Pairwise Preference in Large Language Model Evaluators (Liu et al.; COLM 2024)