AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

[2512.21106] Semantic Refinement with LLMs for Graph Representations
LLMs

arXiv - Machine Learning · 4 min

[2511.22294] Structure is Supervision: Multiview Masked Autoencoders for Radiology
Machine Learning

arXiv - Machine Learning · 4 min

[2511.18123] Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-Language Models
LLMs

arXiv - Machine Learning · 4 min

All Content

[2502.17028] Distributional Vision-Language Alignment by Cauchy-Schwarz Divergence
NLP

The paper presents CS-Aligner, a novel framework for vision-language alignment that integrates Cauchy-Schwarz divergence with mutual info...

arXiv - Machine Learning · 4 min

[2502.12108] Using the Path of Least Resistance to Explain Deep Networks
Machine Learning

The paper introduces Geodesic Integrated Gradients (GIG), a new method for attributing importance scores in deep networks, addressing fla...

arXiv - Machine Learning · 4 min

[2501.16613] Safe Reinforcement Learning for Real-World Engine Control
Machine Learning

This article presents a novel toolchain for implementing safe reinforcement learning in real-world engine control, specifically for trans...

arXiv - Machine Learning · 4 min

[2309.13411] Towards Attributions of Input Variables in a Coalition
Machine Learning

This paper addresses the challenge of partitioning input variables in attribution methods for Explainable AI, proposing new metrics to re...

arXiv - Machine Learning · 3 min

[2512.03005] From Moderation to Mediation: Can LLMs Serve as Mediators in Online Flame Wars?
LLMs

This article explores the potential of large language models (LLMs) to act as mediators in online conflicts, moving beyond moderation to ...

arXiv - AI · 4 min

[2510.12462] Evaluating and Mitigating LLM-as-a-judge Bias in Communication Systems
LLMs

This article evaluates biases in Large Language Models (LLMs) used as judges in communication systems, assessing their reliability and pr...

arXiv - AI · 4 min

[2509.25609] A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments
LLMs

This article presents a framework for evaluating AI agent behavior through consumer choice experiments, highlighting biases in decision-m...

arXiv - AI · 4 min

[2508.09639] UbiQTree: Uncertainty Quantification in XAI with Tree Ensembles
Machine Learning

The paper presents UbiQTree, a method for decomposing uncertainty in SHAP values used in explainable AI, focusing on aleatoric and episte...

arXiv - AI · 4 min

[2506.04500] "Don't Do That!": Guiding Embodied Systems through Large Language Model-based Constraint Generation
LLMs

This paper presents STPR, a framework that utilizes large language models to convert complex natural language constraints into executable...

arXiv - AI · 4 min

[2602.21198] Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs
LLMs

This article presents a novel approach called Reflective Test-Time Planning for embodied LLMs, enabling robots to learn from mistakes thr...

arXiv - Machine Learning · 4 min

[2602.21178] XMorph: Explainable Brain Tumor Analysis Via LLM-Assisted Hybrid Deep Intelligence
LLMs

XMorph presents a novel framework for explainable brain tumor analysis, achieving 96% accuracy while addressing interpretability and comp...

arXiv - AI · 3 min

[2602.21127] "Are You Sure?": An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems
LLMs

This study investigates human vulnerability to deception by large language model (LLM) agents, revealing significant trust issues in high...

arXiv - AI · 4 min

[2602.21054] VAUQ: Vision-Aware Uncertainty Quantification for LVLM Self-Evaluation
LLMs

The paper introduces VAUQ, a framework for vision-aware uncertainty quantification in large vision-language models (LVLMs), enhancing sel...

arXiv - AI · 3 min

[2602.20981] Echoes Over Time: Unlocking Length Generalization in Video-to-Audio Generation Models
Machine Learning

This paper presents MMHNet, a novel multimodal hierarchical network that enhances video-to-audio generation by enabling models to general...

arXiv - AI · 4 min

[2602.20971] Does Order Matter : Connecting The Law of Robustness to Robust Generalization
Machine Learning

This paper explores the relationship between the law of robustness and robust generalization in machine learning, providing a framework t...

arXiv - Machine Learning · 4 min

[2602.20958] EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations
Machine Learning

This paper presents a novel system that integrates depth camera measurements and deep learning for accurate distance estimation in UAV-as...

arXiv - AI · 4 min

[2602.20946] Some Simple Economics of AGI
AI Agents

This article explores the economic implications of Artificial General Intelligence (AGI), focusing on the transition from human cognition...

arXiv - Machine Learning · 4 min

[2602.20867] SoK: Agentic Skills -- Beyond Tool Use in LLM Agents
LLMs

This paper explores agentic skills in LLM agents, focusing on reusable procedural capabilities that enhance long-horizon workflows. It pr...

arXiv - AI · 4 min

[2602.20720] AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs
LLMs

The paper presents AdapTools, a novel framework for adaptive indirect prompt injection attacks on agentic large language models (LLMs), h...

arXiv - AI · 4 min

[2602.20709] Onboard-Targeted Segmentation of Straylight in Space Camera Sensors
Machine Learning

This paper presents an AI-driven methodology for segmenting straylight effects in space camera sensors, enhancing image analysis in resou...

arXiv - AI · 3 min