AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

LLMs

[2512.21106] Semantic Refinement with LLMs for Graph Representations

arXiv - Machine Learning · 4 min · about 13 hours ago

Machine Learning

[2511.22294] Structure is Supervision: Multiview Masked Autoencoders for Radiology

arXiv - Machine Learning · 4 min · about 13 hours ago

LLMs

[2511.18123] Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-Language Models

arXiv - Machine Learning · 4 min · about 13 hours ago

All Content

NLP

[2502.17028] Distributional Vision-Language Alignment by Cauchy-Schwarz Divergence

The paper presents CS-Aligner, a novel framework for vision-language alignment that integrates Cauchy-Schwarz divergence with mutual info...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2502.12108] Using the Path of Least Resistance to Explain Deep Networks

The paper introduces Geodesic Integrated Gradients (GIG), a new method for attributing importance scores in deep networks, addressing fla...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2501.16613] Safe Reinforcement Learning for Real-World Engine Control

This article presents a novel toolchain for implementing safe reinforcement learning in real-world engine control, specifically for trans...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2309.13411] Towards Attributions of Input Variables in a Coalition

This paper addresses the challenge of partitioning input variables in attribution methods for Explainable AI, proposing new metrics to re...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2512.03005] From Moderation to Mediation: Can LLMs Serve as Mediators in Online Flame Wars?

This article explores the potential of large language models (LLMs) to act as mediators in online conflicts, moving beyond moderation to ...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2510.12462] Evaluating and Mitigating LLM-as-a-judge Bias in Communication Systems

This article evaluates biases in Large Language Models (LLMs) used as judges in communication systems, assessing their reliability and pr...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2509.25609] A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments

This article presents a framework for evaluating AI agent behavior through consumer choice experiments, highlighting biases in decision-m...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2508.09639] UbiQTree: Uncertainty Quantification in XAI with Tree Ensembles

The paper presents UbiQTree, a method for decomposing uncertainty in SHAP values used in explainable AI, focusing on aleatoric and episte...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2506.04500] "Don't Do That!": Guiding Embodied Systems through Large Language Model-based Constraint Generation

This paper presents STPR, a framework that utilizes large language models to convert complex natural language constraints into executable...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.21198] Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

This article presents a novel approach called Reflective Test-Time Planning for embodied LLMs, enabling robots to learn from mistakes thr...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.21178] XMorph: Explainable Brain Tumor Analysis Via LLM-Assisted Hybrid Deep Intelligence

XMorph presents a novel framework for explainable brain tumor analysis, achieving 96% accuracy while addressing interpretability and comp...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.21127] "Are You Sure?": An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems

This study investigates human vulnerability to deception by large language model (LLM) agents, revealing significant trust issues in high...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.21054] VAUQ: Vision-Aware Uncertainty Quantification for LVLM Self-Evaluation

The paper introduces VAUQ, a framework for vision-aware uncertainty quantification in large vision-language models (LVLMs), enhancing sel...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.20981] Echoes Over Time: Unlocking Length Generalization in Video-to-Audio Generation Models

This paper presents MMHNet, a novel multimodal hierarchical network that enhances video-to-audio generation by enabling models to general...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.20971] Does Order Matter : Connecting The Law of Robustness to Robust Generalization

This paper explores the relationship between the law of robustness and robust generalization in machine learning, providing a framework t...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.20958] EKF-Based Depth Camera and Deep Learning Fusion for UAV-Person Distance Estimation and Following in SAR Operations

This paper presents a novel system that integrates depth camera measurements and deep learning for accurate distance estimation in UAV-as...

arXiv - AI · 4 min · about 1 month ago

AI Agents

[2602.20946] Some Simple Economics of AGI

This article explores the economic implications of Artificial General Intelligence (AGI), focusing on the transition from human cognition...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.20867] SoK: Agentic Skills -- Beyond Tool Use in LLM Agents

This paper explores agentic skills in LLM agents, focusing on reusable procedural capabilities that enhance long-horizon workflows. It pr...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.20720] AdapTools: Adaptive Tool-based Indirect Prompt Injection Attacks on Agentic LLMs

The paper presents AdapTools, a novel framework for adaptive indirect prompt injection attacks on agentic large language models (LLMs), h...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.20709] Onboard-Targeted Segmentation of Straylight in Space Camera Sensors

This paper presents an AI-driven methodology for segmenting straylight effects in space camera sensors, enhancing image analysis in resou...

arXiv - AI · 3 min · about 1 month ago
