AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

[2512.21106] Semantic Refinement with LLMs for Graph Representations
LLMs

arXiv - Machine Learning · 4 min
[2511.22294] Structure is Supervision: Multiview Masked Autoencoders for Radiology
Machine Learning

arXiv - Machine Learning · 4 min
[2511.18123] Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-Language Models
LLMs

arXiv - Machine Learning · 4 min

All Content

[2505.08125] Sharp Gaussian approximations for Decentralized Federated Learning
Machine Learning

This paper presents sharp Gaussian approximations for decentralized federated learning, focusing on local SGD's convergence and statistic...

arXiv - Machine Learning · 3 min
[2503.06437] SEED: Towards More Accurate Semantic Evaluation for Visual Brain Decoding
Machine Learning

The SEED metric enhances semantic evaluation in visual brain decoding by integrating multiple metrics, revealing limitations in existing ...

arXiv - Machine Learning · 3 min
[2406.10281] Watermarking Language Models with Error Correcting Codes
LLMs

The paper presents a novel watermarking framework for language models using error correcting codes, ensuring robust detection of machine-...

arXiv - Machine Learning · 3 min
[2510.12402] Cautious Weight Decay
Machine Learning

The paper introduces Cautious Weight Decay (CWD), a novel optimizer-agnostic method that selectively applies weight decay during optimiza...

arXiv - Machine Learning · 3 min
[2508.16815] Uncertainty Propagation Networks for Neural Ordinary Differential Equations
Machine Learning

The paper presents Uncertainty Propagation Networks (UPN), a novel approach to neural ordinary differential equations that integrates unc...

arXiv - Machine Learning · 3 min
[2503.23434] Towards Trustworthy GUI Agents: A Survey
LLMs

This survey explores the challenges of building trustworthy GUI agents, highlighting the execution gap and proposing a taxonomy for under...

arXiv - Machine Learning · 3 min
[2503.12016] A Survey on Federated Fine-tuning of Large Language Models
LLMs

This survey explores the integration of Federated Learning with Large Language Models (LLMs), addressing challenges and methodologies for...

arXiv - Machine Learning · 4 min
[2602.21160] Not Just How Much, But Where: Decomposing Epistemic Uncertainty into Per-Class Contributions
Machine Learning

The paper presents a novel method for decomposing epistemic uncertainty in machine learning models into per-class contributions, enhancin...

arXiv - Machine Learning · 4 min
[2602.21036] Empirically Calibrated Conditional Independence Tests
Machine Learning

The paper presents Empirically Calibrated Conditional Independence Tests (ECCIT), a method designed to enhance the reliability of conditi...

arXiv - Machine Learning · 3 min
[2602.20805] Assessing the Impact of Speaker Identity in Speech Spoofing Detection
Machine Learning

This paper investigates the influence of speaker identity on speech spoofing detection systems, proposing a framework that integrates spe...

arXiv - Machine Learning · 3 min
[2602.20585] Characterizing Online and Private Learnability under Distributional Constraints via Generalized Smoothness
Machine Learning

This paper explores the conditions under which learning is achievable in online and private settings, focusing on generalized smoothness ...

arXiv - Machine Learning · 4 min
[2602.20465] Prior-Agnostic Incentive-Compatible Exploration
AI Safety

The paper discusses a novel approach to incentive-compatible exploration in bandit settings, addressing the misalignment between principa...

arXiv - Machine Learning · 3 min
[2602.20450] Heterogeneity-Aware Client Selection Methodology For Efficient Federated Learning
Machine Learning

The paper presents Terraform, a novel client selection methodology for federated learning that addresses client heterogeneity, achieving ...

arXiv - Machine Learning · 3 min
[2602.20383] Detecting and Mitigating Group Bias in Heterogeneous Treatment Effects
Machine Learning

The paper discusses the detection and mitigation of group bias in heterogeneous treatment effects (HTEs) using a unified statistical fram...

arXiv - Machine Learning · 4 min
[2602.21078] ProxyFL: A Proxy-Guided Framework for Federated Semi-Supervised Learning
Machine Learning

The article presents ProxyFL, a novel framework for Federated Semi-Supervised Learning (FSSL) that addresses data heterogeneity issues by...

arXiv - Machine Learning · 4 min
[2602.20729] Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty
Machine Learning

The paper presents Fuz-RL, a fuzzy-guided framework for safe reinforcement learning that addresses uncertainties in real-world applicatio...

arXiv - Machine Learning · 3 min
[2602.20629] QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs
LLMs

The paper presents QEDBench, a benchmark for evaluating the alignment of automated systems in assessing university-level mathematical pro...

arXiv - Machine Learning · 4 min
[2602.20593] Is the Trigger Essential? A Feature-Based Triggerless Backdoor Attack in Vertical Federated Learning
Machine Learning

This paper presents a novel feature-based triggerless backdoor attack in vertical federated learning, demonstrating that triggers are not...

arXiv - Machine Learning · 4 min
[2602.20567] Stability and Generalization of Push-Sum Based Decentralized Optimization over Directed Graphs
AI Safety

This paper explores the stability and generalization of Push-Sum based decentralized optimization methods over directed graphs, addressin...

arXiv - Machine Learning · 4 min
[2602.20549] Sample-efficient evidence estimation of score based priors for model selection
Machine Learning

The paper presents a novel estimator for model evidence in Bayesian inverse problems, particularly using diffusion models, enhancing samp...

arXiv - Machine Learning · 4 min