[2602.15738] Beyond Labels: Information-Efficient Human-in-the-Loop Learning using Ranking and Selection Queries
Summary
This paper presents a human-in-the-loop framework for learning binary classifiers in which experts answer item-ranking and exemplar-selection queries instead of supplying only labels, so that each interaction conveys more information.
Why It Matters
Current human-in-the-loop systems often reduce experts to labeling oracles, which limits the information exchanged and fails to capture the nuances of human judgment. The proposed framework asks richer queries to increase the information gained per interaction, which in turn reduces the number of expert interactions needed to train a classifier.
Key Takeaways
- Introduces a framework that enhances human-machine interaction in learning.
- Utilizes ranking and selection queries to improve information efficiency.
- Demonstrates significant reductions in sample complexity in experiments.
- Proposes active learning algorithms that balance information gain with cost.
- Reduces learning time by over 57% compared to traditional methods.
Computer Science > Human-Computer Interaction
arXiv:2602.15738 (cs)
[Submitted on 17 Feb 2026]
Title: Beyond Labels: Information-Efficient Human-in-the-Loop Learning using Ranking and Selection Queries
Authors: Belén Martín-Urcelay, Yoonsang Lee, Matthieu R. Bloch, Christopher J. Rozell
Abstract: Integrating human expertise into machine learning systems often reduces the role of experts to labeling oracles, a paradigm that limits the amount of information exchanged and fails to capture the nuances of human judgment. We address this challenge by developing a human-in-the-loop framework to learn binary classifiers with rich query types, consisting of item ranking and exemplar selection. We first introduce probabilistic human response models for these rich queries motivated by the relationship experimentally observed between the perceived implicit score of an item and its distance to the unknown classifier. Using these models, we then design active learning algorithms that leverage the rich queries to increase the information gained per interaction. We provide theoretical bounds on sample complexity and develop a tractable and computationally efficient variational approximation. Through experiments with simulated annotators derived from crowdsourced word-sentiment and image-aesthetic datasets,...
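To make the idea of a probabilistic response model for ranking and selection queries more concrete, here is a minimal Python sketch. It is not the authors' model: the abstract does not specify the functional form, so this example assumes a Plackett-Luce ranking model and a softmax selection model, with each item's implicit score taken to be its signed distance to a linear classifier. The function names, the noise parameter `beta`, and the finite grid of candidate classifiers (standing in for the paper's variational approximation) are all hypothetical.

```python
import math
from itertools import permutations

def score(w, x):
    """Assumed implicit score: signed distance from item x to the hyperplane w.x = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return sum(wi * xi for wi, xi in zip(w, x)) / norm

def ranking_prob(ranking, scores, beta=1.0):
    """Plackett-Luce probability of `ranking` (item indices, highest score first)."""
    prob = 1.0
    remaining = list(ranking)
    while remaining:
        weights = [math.exp(beta * scores[i]) for i in remaining]
        prob *= weights[0] / sum(weights)  # chance the next-ranked item is picked first
        remaining.pop(0)
    return prob

def selection_prob(choice, scores, beta=1.0):
    """Softmax probability that item `choice` is selected as the most positive exemplar."""
    weights = [math.exp(beta * s) for s in scores]
    return weights[choice] / sum(weights)

def posterior_update(prior, candidates, items, ranking, beta=1.0):
    """Bayes update over a finite set of candidate classifiers after one ranking response."""
    unnorm = []
    for p, w in zip(prior, candidates):
        scores = [score(w, x) for x in items]
        unnorm.append(p * ranking_prob(ranking, scores, beta))
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Hypothetical toy data: three 2-D items and four candidate classifier directions.
items = [(1.0, 0.2), (-0.5, 1.0), (0.3, -0.8)]
angles = (0.0, math.pi / 2, math.pi, 3 * math.pi / 2)
candidates = [(math.cos(t), math.sin(t)) for t in angles]
prior = [0.25] * len(candidates)

# The annotator ranks item 0 first, then item 2, then item 1.
# Candidate (1, 0) orders the items the same way by score.
posterior = posterior_update(prior, candidates, items, ranking=(0, 2, 1), beta=2.0)
best = posterior.index(max(posterior))
print("posterior:", [round(p, 3) for p in posterior], "most likely candidate:", best)
```

A single ranking of three items constrains the classifier more than a single label would, which is the intuition behind using richer queries. An active learner built on this sketch would choose the next set of items to maximize expected information gain under the current posterior, but the selection criterion and cost model used in the paper are not given in the abstract.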