[2411.01685] Reducing Biases in Record Matching Through Scores Calibration

arXiv - Machine Learning · 4 min read

Summary

This paper explores methods to reduce biases in record matching through score calibration, proposing two model-agnostic post-processing techniques that align score distributions to enhance fairness without retraining models.

Why It Matters

Bias in record matching can lead to unfair outcomes in various applications, including hiring and credit scoring. This research provides innovative solutions to mitigate score bias, promoting fairness in machine learning systems and ensuring equitable treatment across different demographic groups.

Key Takeaways

  • Introduces a threshold-independent metric for assessing score bias in record matching.
  • Proposes two calibration methods (Calib and C-Calib) to reduce score bias without retraining models.
  • Demonstrates substantial bias reduction with minimal accuracy loss across various benchmarks.
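The threshold-independent bias metric in the first takeaway can be sketched for the demographic-parity (DP) case: compute each group's positive-prediction rate at every threshold and integrate the absolute gap over all thresholds. This is a minimal illustration under assumed details (a uniform threshold grid on [0, 1] and a Riemann-mean approximation of the integral); the function name and numerical scheme are not from the paper.

```python
import numpy as np

def dp_score_bias(scores_a, scores_b, n_thresholds=1000):
    """Threshold-independent DP score bias (illustrative sketch).

    Integrates |P(s_A > t) - P(s_B > t)| over thresholds t in [0, 1],
    approximated as the mean gap over a uniform threshold grid.
    A matcher that looks fair at one threshold can still score high here
    if the gap is large elsewhere in the score distribution.
    """
    scores_a = np.asarray(scores_a, dtype=float)
    scores_b = np.asarray(scores_b, dtype=float)
    ts = np.linspace(0.0, 1.0, n_thresholds)
    rate_a = np.array([(scores_a > t).mean() for t in ts])  # group A match rate per threshold
    rate_b = np.array([(scores_b > t).mean() for t in ts])  # group B match rate per threshold
    return np.abs(rate_a - rate_b).mean()  # ~ integral of the gap over [0, 1]
```

The EO and EOD variants would restrict the rates to true matches (and, for EOD, also true non-matches) before integrating.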

Computer Science > Machine Learning

arXiv:2411.01685 (cs) [Submitted on 3 Nov 2024 (v1), last revised 22 Feb 2026 (this version, v3)]

Title: Reducing Biases in Record Matching Through Scores Calibration
Authors: Mohammad Hossein Moslemi, Mostafa Milani

Abstract: Record matching models typically output a real-valued matching score that is later consumed through thresholding, ranking, or human review. While fairness in record matching has mostly been assessed using binary decisions at a fixed threshold, such evaluations can miss systematic disparities in the entire score distribution and can yield conclusions that change with the chosen threshold. We introduce a threshold-independent notion of score bias that extends standard group-fairness criteria, namely demographic parity (DP), equal opportunity (EO), and equalized odds (EOD), from binary outputs to score functions by integrating group-wise metric gaps over all thresholds. Using this metric, we empirically show that several state-of-the-art deep matchers can exhibit substantial score bias even when appearing fair at commonly used thresholds. To mitigate these disparities without retraining the underlying matcher, we propose two model-agnostic post-processing methods that only require score evaluations on an (unlabeled) calibration set. Calib targets DP by aligning ...
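The abstract describes aligning group score distributions as a post-processing step. One common way to do this, shown here as a hedged sketch rather than the paper's actual Calib or C-Calib procedure, is per-group quantile mapping: each group's scores are replaced by the corresponding quantiles of a shared reference distribution (here, the pooled scores — an assumption, as the paper's alignment target may differ).

```python
import numpy as np

def calibrate_scores(scores, group, reference=None):
    """Align group score distributions by quantile mapping (illustrative sketch).

    Each score is ranked within its own group, and that rank's quantile is
    read off a shared reference distribution. After calibration, all groups
    share (approximately) the same score distribution, which drives the
    threshold-integrated DP gap toward zero without retraining the matcher.
    """
    scores = np.asarray(scores, dtype=float)
    group = np.asarray(group)
    if reference is None:
        reference = np.sort(scores)  # pooled scores as the shared target
    out = np.empty_like(scores)
    for g in np.unique(group):
        mask = group == g
        # rank within group, then convert ranks to mid-point quantiles
        ranks = scores[mask].argsort().argsort()
        quantiles = (ranks + 0.5) / mask.sum()
        out[mask] = np.quantile(reference, quantiles)
    return out
```

Because the mapping is monotone within each group, the within-group ranking of candidate pairs is preserved; only cross-group score comparability changes, which is what matters for a shared decision threshold.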
