[2603.20990] ECI: Effective Contrastive Information to Evaluate Hard-Negatives
Computer Science > Information Retrieval
arXiv:2603.20990 (cs)
[Submitted on 22 Mar 2026]

Title: ECI: Effective Contrastive Information to Evaluate Hard-Negatives
Authors: Aarush Sinha, Rahul Seetharaman, Aman Bansal

Abstract: Hard negatives play a critical role in training and fine-tuning dense retrieval models: they are semantically similar to positive documents yet non-relevant, and distinguishing them correctly is essential for improving retrieval accuracy. However, identifying effective hard negatives typically requires extensive ablation studies involving repeated fine-tuning with different negative sampling strategies and hyperparameters, resulting in substantial computational cost. In this paper, we introduce ECI (Effective Contrastive Information), a theoretically grounded metric based on information theory and information retrieval principles that enables practitioners to assess the quality of hard negatives prior to model fine-tuning. ECI evaluates negatives by optimizing the trade-off between Information Capacity, the logarithmic bound on mutual information determined by set size, and Discriminative Efficiency, a harmonic balance of Signal Magnitude (Hardness) and Safety (Max-Margin). Unlike heuristic approaches, ECI strictly penalizes unsafe, false-positive negatives prevalent in generative methods....
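The abstract names the ingredients of the metric but does not give its formula, so the following is only a toy sketch of how such a trade-off could be combined, not the paper's definition of ECI. Everything in it is an assumption made for illustration: the function names `harmonic_mean` and `toy_contrastive_score`, the choice of `log(1 + K)` as the capacity term for `K` negatives, the mean negative similarity as "hardness", the gap between the positive and the hardest negative as "safety", and the final product of capacity and efficiency.

```python
import math
from typing import Sequence


def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two scores; returns 0 if either is non-positive."""
    if a <= 0.0 or b <= 0.0:
        return 0.0
    return 2.0 * a * b / (a + b)


def toy_contrastive_score(pos_sim: float, neg_sims: Sequence[float]) -> float:
    """Hypothetical score for one query's negative set (NOT the paper's ECI).

    pos_sim:  similarity between the query and its positive document.
    neg_sims: similarities between the query and each candidate negative.
    Similarities are assumed to lie in [0, 1].
    """
    k = len(neg_sims)
    if k == 0:
        return 0.0

    # Assumed capacity term: grows logarithmically with the set size.
    capacity = math.log(1 + k)

    # Assumed hardness: average similarity of negatives to the query.
    hardness = min(max(sum(neg_sims) / k, 0.0), 1.0)

    # Assumed safety: margin between the positive and the hardest negative.
    # A negative scoring above the positive (a likely false negative)
    # drives this to zero.
    safety = min(max(pos_sim - max(neg_sims), 0.0), 1.0)

    # The harmonic mean is zero when either component is zero, so an unsafe
    # set scores zero regardless of how hard or how large it is.
    return capacity * harmonic_mean(hardness, safety)
```

In this sketch, a set containing a negative that outscores the positive receives a score of zero, which mirrors the strict penalty on unsafe negatives described in the abstract. Adding more negatives of the same quality raises the score only logarithmically.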