[2603.02029] Rich Insights from Cheap Signals: Efficient Evaluations via Tensor Factorization
Computer Science > Artificial Intelligence
arXiv:2603.02029 (cs)
[Submitted on 2 Mar 2026]
Title: Rich Insights from Cheap Signals: Efficient Evaluations via Tensor Factorization
Authors: Felipe Maia Polo, Aida Nematzadeh, Virginia Aglietti, Adam Fisch, Isabela Albuquerque
Abstract: Moving beyond evaluations that collapse performance across heterogeneous prompts toward fine-grained evaluation at the prompt level, or within relatively homogeneous subsets, is necessary to diagnose generative models' strengths and weaknesses. Such fine-grained evaluations, however, suffer from a data bottleneck: human gold-standard labels are too costly at this scale, while automated ratings are often misaligned with human judgment. To resolve this challenge, we propose a novel statistical model based on tensor factorization that merges cheap autorater data with a limited set of human gold-standard labels. Specifically, our approach uses autorater scores to pretrain latent representations of prompts and generative models, and then aligns those pretrained representations to human preferences using a small calibration set. This sample-efficient methodology is robust to autorater quality, more accurately predicts human preferences on a per-prompt basis than standard baselines, and provides tight confidence intervals for...
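
The two-stage idea in the abstract (pretrain latent representations from autorater scores, then align them to human labels with a small calibration set) can be illustrated with a toy NumPy sketch. This is not the authors' model: it assumes a dense model x prompt x autorater score tensor, swaps in a simple SVD-based (HOSVD-style) factorization for the pretraining step, and uses ordinary least squares for the alignment step. All function names, sizes, and the synthetic data are hypothetical.

```python
import numpy as np


def top_left_singular_vectors(matrix, rank):
    """Return the leading `rank` left singular vectors of `matrix`."""
    u, _, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :rank]


def pretrain_factors(auto_scores, rank):
    """Stage 1 (assumed HOSVD-style stand-in): embeddings from autorater scores.

    auto_scores has shape (n_models, n_prompts, n_raters).
    """
    n_models, n_prompts, n_raters = auto_scores.shape
    # Unfold the tensor along the model axis and along the prompt axis.
    model_unfold = auto_scores.reshape(n_models, n_prompts * n_raters)
    prompt_unfold = auto_scores.transpose(1, 0, 2).reshape(
        n_prompts, n_models * n_raters
    )
    model_emb = top_left_singular_vectors(model_unfold, rank)
    prompt_emb = top_left_singular_vectors(prompt_unfold, rank)
    return model_emb, prompt_emb


def pair_features(model_emb, prompt_emb, model_idx, prompt_idx):
    """Flattened outer product of the two embeddings for each (model, prompt) pair."""
    outer = np.einsum("ni,nj->nij", model_emb[model_idx], prompt_emb[prompt_idx])
    return outer.reshape(len(model_idx), -1)


def align_to_humans(model_emb, prompt_emb, model_idx, prompt_idx, human_labels):
    """Stage 2: least-squares fit of a small interaction matrix on the calibration set."""
    features = pair_features(model_emb, prompt_emb, model_idx, prompt_idx)
    coef, *_ = np.linalg.lstsq(features, human_labels, rcond=None)
    rank = model_emb.shape[1]
    return coef.reshape(rank, rank)


def predict_all(model_emb, prompt_emb, interaction):
    """Predicted human score for every (model, prompt) pair."""
    return model_emb @ interaction @ prompt_emb.T


# Synthetic demo: autoraters and humans share latent factors but weight them differently.
rng = np.random.default_rng(0)
n_models, n_prompts, n_raters, rank = 6, 50, 3, 2
true_models = rng.normal(size=(n_models, rank))
true_prompts = rng.normal(size=(n_prompts, rank))
rater_weights = rng.normal(size=(n_raters, rank))
human_weights = rng.normal(size=rank)

auto_scores = np.einsum("mk,pk,ak->mpa", true_models, true_prompts, rater_weights)
auto_scores += 0.05 * rng.normal(size=auto_scores.shape)  # noisy cheap signal
human_scores = (true_models * human_weights) @ true_prompts.T  # costly gold labels

# Only 40 of the 300 (model, prompt) pairs receive human labels.
calib = rng.choice(n_models * n_prompts, size=40, replace=False)
m_idx, p_idx = np.unravel_index(calib, (n_models, n_prompts))

model_emb, prompt_emb = pretrain_factors(auto_scores, rank)
interaction = align_to_humans(
    model_emb, prompt_emb, m_idx, p_idx, human_scores[m_idx, p_idx]
)
predicted = predict_all(model_emb, prompt_emb, interaction)
correlation = np.corrcoef(predicted.ravel(), human_scores.ravel())[0, 1]
print(f"correlation with held-out human scores: {correlation:.3f}")
```

The design choice worth noting is that the calibration step estimates only `rank * rank` parameters, so a handful of human labels suffices once the embeddings have been learned from the cheap signal. The sketch produces point predictions only and does not attempt the confidence intervals mentioned in the abstract.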