[2603.21485] Off-Policy Evaluation for Ranking Policies under Deterministic Logging Policies
Computer Science > Machine Learning
arXiv:2603.21485 (cs) [Submitted on 23 Mar 2026]

Title: Off-Policy Evaluation for Ranking Policies under Deterministic Logging Policies
Authors: Koichi Tanaka, Kazuki Kawamura, Takanori Muroi, Yusuke Narita, Yuki Sasamoto, Kei Tateno, Takuma Udagawa, Wei-Wei Du, Yuta Saito

Abstract: Off-Policy Evaluation (OPE) is an important practical problem in algorithmic ranking systems, where the goal is to estimate the expected performance of a new ranking policy using only offline logged data collected under a different logging policy. Existing estimators, such as the ranking-wise and position-wise inverse propensity score (IPS) estimators, require the logging policy to be sufficiently stochastic and suffer from severe bias when it is fully deterministic. In this paper, we propose novel estimators, Click-based Inverse Propensity Score (CIPS), that exploit the intrinsic stochasticity of user click behavior to address this challenge. Unlike existing methods that rely on the stochasticity of the logging policy, our approach uses click probability as a new form of importance weighting, enabling low-bias OPE even under deterministic logging policies where existing methods incur substantial bias. We provide theoretical analyses of the bias and variance proper...
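To make the idea of click-probability importance weighting concrete, here is a minimal, self-contained sketch. All model details below (item attractiveness, position bias, the reversed target ranking, and the particular weight formula) are illustrative assumptions for a toy click model, not the paper's exact CIPS estimator: each logged click is reweighted by the ratio of the click probability the target policy would induce to the one under the logging policy, which stays well defined even though the logging policy itself is deterministic.

```python
import numpy as np

rng = np.random.default_rng(0)
n_logs, n_items = 10_000, 5

# Toy click model (illustrative assumptions): P(click on item i at position k)
# factorizes as attract[i] * pos_bias[k].
attract = np.array([0.8, 0.6, 0.4, 0.3, 0.2])
pos_bias = np.array([1.0, 0.7, 0.5, 0.3, 0.2])

# Deterministic logging policy: item i is always shown at position i,
# so ranking-level propensities are 0/1 and ranking-wise IPS is undefined.
logged_pos = np.arange(n_items)
# Hypothetical target policy: reverse the ranking.
target_pos = logged_pos[::-1]

# Simulate logged clicks under the logging policy.
click_prob_log = attract * pos_bias[logged_pos]
clicks = rng.random((n_logs, n_items)) < click_prob_log

# Click-based importance weight: ratio of target-policy to logging-policy
# click probability for each item (the attractiveness term cancels).
weights = pos_bias[target_pos] / pos_bias[logged_pos]
v_hat = (clicks * weights).sum(axis=1).mean()

# Ground-truth value of the target policy under the same click model.
v_true = (attract * pos_bias[target_pos]).sum()
print(round(v_hat, 3), round(v_true, 3))
```

In expectation, each weighted click contributes attract[i] * pos_bias[logged] * (pos_bias[target] / pos_bias[logged]) = attract[i] * pos_bias[target], so the estimator is unbiased for the target policy's expected click count in this toy model, despite the logging policy being fully deterministic.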