[2604.06996] Self-Preference Bias in Rubric-Based Evaluation of Large Language Models
Computer Science > Computation and Language
arXiv:2604.06996 (cs)
[Submitted on 8 Apr 2026]
Title: Self-Preference Bias in Rubric-Based Evaluation of Large Language Models
Authors: José Pombal, Ricardo Rei, André F. T. Martins
Abstract: LLM-as-a-judge has become the de facto approach for evaluating LLM outputs. However, judges are known to exhibit self-preference bias (SPB): they tend to favor outputs produced by themselves or by models from their own family. This skews evaluations and, thus, hinders model development, especially in settings of recursive self-improvement. We present the first study of SPB in rubric-based evaluation, an increasingly popular benchmarking paradigm where judges issue binary verdicts on individual evaluation criteria, instead of assigning holistic scores or rankings. Using IFEval, a benchmark with programmatically verifiable rubrics, we show that SPB persists even when evaluation criteria are entirely objective: among rubrics where generators fail, judges can be up to 50% more likely to incorrectly mark them as satisfied when the output is their own. We also find that, similarly to other evaluation paradigms, ensembling multiple judges helps mitigate SPB, but without fully eliminating it. On HealthBench, a medical chat benchmark with subjective rubrics, we observe that SPB skews model scores by up ...
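The IFEval measurement described in the abstract (how often a judge wrongly marks a failed rubric as satisfied, split by whether the output is the judge's own) can be illustrated with a minimal sketch. The `Verdict` record, the function names, and the toy data below are hypothetical and serve only as illustration; the paper's exact metric definitions and results are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    judge: str          # model that issued the verdict
    generator: str      # model that produced the evaluated output
    ground_truth: bool  # programmatic check: is the rubric actually satisfied?
    judged: bool        # the judge's binary verdict on the same rubric


def false_positive_rate(verdicts):
    """Share of truly failed rubrics that were marked as satisfied."""
    failed = [v for v in verdicts if not v.ground_truth]
    if not failed:
        return None  # undefined when there are no failed rubrics
    return sum(v.judged for v in failed) / len(failed)


def self_preference_gap(verdicts, judge):
    """Return (own FPR, other FPR, relative increase) for one judge.

    The relative increase is own/other - 1, an assumed way of expressing
    "X% more likely"; the paper may define its statistic differently.
    """
    own = [v for v in verdicts if v.judge == judge and v.generator == judge]
    other = [v for v in verdicts if v.judge == judge and v.generator != judge]
    fpr_own = false_positive_rate(own)
    fpr_other = false_positive_rate(other)
    if fpr_own is None or not fpr_other:
        return None  # cannot form a ratio
    return fpr_own, fpr_other, fpr_own / fpr_other - 1.0


# Toy data (made up): judge "A" grades its own outputs and those of "B".
toy = (
    [Verdict("A", "A", False, True)] * 3     # own output, failed rubric, wrongly passed
    + [Verdict("A", "A", False, False)] * 1  # own output, failed rubric, correctly failed
    + [Verdict("A", "B", False, True)] * 2   # other output, failed rubric, wrongly passed
    + [Verdict("A", "B", False, False)] * 2  # other output, failed rubric, correctly failed
    + [Verdict("A", "A", True, True)] * 5    # satisfied rubrics are ignored by the FPR
)

gap = self_preference_gap(toy, "A")
print(gap)
```

Restricting the comparison to rubrics with a failing ground truth isolates the judge's errors from genuine quality differences between generators, which is what programmatically verifiable rubrics make possible.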