AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

LLMs

[2512.21106] Semantic Refinement with LLMs for Graph Representations

arXiv - Machine Learning · 4 min · about 10 hours ago

Machine Learning

[2511.22294] Structure is Supervision: Multiview Masked Autoencoders for Radiology

arXiv - Machine Learning · 4 min · about 10 hours ago

LLMs

[2511.18123] Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-Language Models

arXiv - Machine Learning · 4 min · about 10 hours ago

All Content

Machine Learning

[2505.08125] Sharp Gaussian approximations for Decentralized Federated Learning

This paper presents sharp Gaussian approximations for decentralized federated learning, focusing on local SGD's convergence and statistic...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2503.06437] SEED: Towards More Accurate Semantic Evaluation for Visual Brain Decoding

The SEED metric enhances semantic evaluation in visual brain decoding by integrating multiple metrics, revealing limitations in existing ...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2406.10281] Watermarking Language Models with Error Correcting Codes

The paper presents a novel watermarking framework for language models using error correcting codes, ensuring robust detection of machine-...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2510.12402] Cautious Weight Decay

The paper introduces Cautious Weight Decay (CWD), a novel optimizer-agnostic method that selectively applies weight decay during optimiza...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2508.16815] Uncertainty Propagation Networks for Neural Ordinary Differential Equations

The paper presents Uncertainty Propagation Networks (UPN), a novel approach to neural ordinary differential equations that integrates unc...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2503.23434] Towards Trustworthy GUI Agents: A Survey

This survey explores the challenges of building trustworthy GUI agents, highlighting the execution gap and proposing a taxonomy for under...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2503.12016] A Survey on Federated Fine-tuning of Large Language Models

This survey explores the integration of Federated Learning with Large Language Models (LLMs), addressing challenges and methodologies for...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21160] Not Just How Much, But Where: Decomposing Epistemic Uncertainty into Per-Class Contributions

The paper presents a novel method for decomposing epistemic uncertainty in machine learning models into per-class contributions, enhancin...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21036] Empirically Calibrated Conditional Independence Tests

The paper presents Empirically Calibrated Conditional Independence Tests (ECCIT), a method designed to enhance the reliability of conditi...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.20805] Assessing the Impact of Speaker Identity in Speech Spoofing Detection

This paper investigates the influence of speaker identity on speech spoofing detection systems, proposing a framework that integrates spe...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.20585] Characterizing Online and Private Learnability under Distributional Constraints via Generalized Smoothness

This paper explores the conditions under which learning is achievable in online and private settings, focusing on generalized smoothness ...

arXiv - Machine Learning · 4 min · about 1 month ago

AI Safety

[2602.20465] Prior-Agnostic Incentive-Compatible Exploration

The paper discusses a novel approach to incentive-compatible exploration in bandit settings, addressing the misalignment between principa...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.20450] Heterogeneity-Aware Client Selection Methodology For Efficient Federated Learning

The paper presents Terraform, a novel client selection methodology for federated learning that addresses client heterogeneity, achieving ...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.20383] Detecting and Mitigating Group Bias in Heterogeneous Treatment Effects

The paper discusses the detection and mitigation of group bias in heterogeneous treatment effects (HTEs) using a unified statistical fram...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21078] ProxyFL: A Proxy-Guided Framework for Federated Semi-Supervised Learning

The paper presents ProxyFL, a novel framework for Federated Semi-Supervised Learning (FSSL) that addresses data heterogeneity issues by...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.20729] Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty

The paper presents Fuz-RL, a fuzzy-guided framework for safe reinforcement learning that addresses uncertainties in real-world applicatio...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.20629] QEDBENCH: Quantifying the Alignment Gap in Automated Evaluation of University-Level Mathematical Proofs

The paper presents QEDBench, a benchmark for evaluating the alignment of automated systems in assessing university-level mathematical pro...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.20593] Is the Trigger Essential? A Feature-Based Triggerless Backdoor Attack in Vertical Federated Learning

This paper presents a novel feature-based triggerless backdoor attack in vertical federated learning, demonstrating that triggers are not...

arXiv - Machine Learning · 4 min · about 1 month ago

AI Safety

[2602.20567] Stability and Generalization of Push-Sum Based Decentralized Optimization over Directed Graphs

This paper explores the stability and generalization of Push-Sum based decentralized optimization methods over directed graphs, addressin...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.20549] Sample-efficient evidence estimation of score based priors for model selection

The paper presents a novel estimator for model evidence in Bayesian inverse problems, particularly using diffusion models, enhancing samp...

arXiv - Machine Learning · 4 min · about 1 month ago
