[2603.01013] Feature-Weighted Maximum Representative Subsampling


Computer Science > Machine Learning
arXiv:2603.01013 (cs) · Submitted on 1 Mar 2026

Title: Feature-Weighted Maximum Representative Subsampling
Authors: Tony Hauptmann, Stefan Kramer

Abstract: In the social sciences, studies and surveys often need to be debiased before valid conclusions can be drawn. Debiasing algorithms remove bias computationally by assigning sample weights. An issue arises, however, when only a subset of features is highly biased while the rest is already representative: to accommodate the few highly biased features, an algorithm must strongly alter the sample distribution, which can in turn introduce bias into variables that were already representative. To address this issue, we developed a method that uses feature weights to limit the influence of highly biased features on the computation of sample weights. Our algorithm is based on Maximum Representative Subsampling (MRS), which debiases a dataset by aligning a non-representative sample with a representative one through iterative removal of elements, producing a representative subsample. The new algorithm, named feature-weighted MRS (FW-MRS), decreases the emphasis on highly biased features, allowing it to retain more instances for downstream tasks. The feature weights are derived from the feature importance of a domain classifier trained to differentiate between t...
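The abstract describes the core FW-MRS idea: down-weight highly biased features when measuring how representative the sample is, then iteratively remove instances to shrink the (weighted) gap to the reference. The following is a minimal illustrative sketch, not the paper's implementation: the weighting scheme, the discrepancy measure, and the use of a standardized mean gap in place of a domain classifier's feature importances are all assumptions made for the example.

```python
import numpy as np

def fw_mrs_sketch(sample, reference, n_remove, alpha=1.0):
    """Greedy feature-weighted subsampling (illustrative only).

    Features on which `sample` and `reference` differ most get low
    weight, so the greedy removal is not dominated by a few highly
    biased features. The paper derives its weights from a domain
    classifier's feature importances; here a standardized mean gap
    stands in as a simple proxy.
    """
    mu_r = reference.mean(axis=0)
    sd = reference.std(axis=0) + 1e-8
    # Per-feature bias: absolute standardized gap between sample and reference means.
    bias = np.abs(sample.mean(axis=0) - mu_r) / sd
    # Feature weights: strongly biased features are down-weighted.
    w = 1.0 / (1.0 + alpha * bias)

    keep = np.ones(len(sample), dtype=bool)
    for _ in range(n_remove):
        idx = np.flatnonzero(keep)
        total = sample[idx].sum(axis=0)
        best, best_d = None, np.inf
        for i in idx:
            # Weighted mean discrepancy if row i were removed.
            mean_wo = (total - sample[i]) / (len(idx) - 1)
            d = np.sum(w * np.abs(mean_wo - mu_r))
            if d < best_d:
                best, best_d = i, d
        keep[best] = False
    return sample[keep]
```

With a sample that is shifted on only one feature, the sketch removes the rows driving that shift while the remaining (already representative) features stay largely untouched, which mirrors the motivation given in the abstract.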

Originally published on March 03, 2026. Curated by AI News.

Related Articles

- [2603.14267] DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization (arXiv - Machine Learning)
- [2601.22440] AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations (arXiv - AI)
- [2601.13622] CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models (arXiv - AI)
- [2512.08777] Fluent Alignment with Disfluent Judges: Post-training for Lower-resource Languages (arXiv - AI)