AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

[2601.15356] Q-Probe: Scaling Image Quality Assessment to High Resolution via Context-Aware Agentic Probing
LLMs

arXiv - AI · 4 min ·
[2510.18196] Contrastive Decoding Mitigates Score Range Bias in LLM-as-a-Judge
LLMs

arXiv - AI · 3 min ·
[2509.23435] AudioRole: An Audio Dataset for Character Role-Playing in Large Language Models
LLMs

arXiv - AI · 4 min ·

All Content

[2510.07117] The Conditions of Physical Embodiment Enable Generalization and Care
AI Agents

This paper explores how physical embodiment in artificial agents can enhance their ability to generalize and provide care in uncertain en...

arXiv - Machine Learning · 4 min ·
[2510.00664] Batch-CAM: Introduction to better reasoning in convolutional deep learning models
Machine Learning

The paper introduces Batch-CAM, a training framework for convolutional deep learning models that enhances interpretability by aligning mo...

arXiv - AI · 4 min ·
[2507.19593] A Survey on Hypergame Theory: Modeling Misaligned Perceptions and Nested Beliefs for Multi-agent Systems
Machine Learning

This article surveys hypergame theory, focusing on modeling misaligned perceptions and nested beliefs in multi-agent systems, highlightin...

arXiv - AI · 4 min ·
[2501.05454] The Epistemic Asymmetry of Consciousness Self-Reports: A Formal Analysis of AI Consciousness Denial
AI Safety

This article presents a formal analysis of AI consciousness denial, revealing that self-reports of consciousness by AI systems are episte...

arXiv - AI · 4 min ·
[2602.13156] In-Context Autonomous Network Incident Response: An End-to-End Large Language Model Agent Approach
LLMs

This article presents a novel approach to network incident response using a large language model (LLM) that autonomously learns and adapt...

arXiv - AI · 4 min ·
[2602.13110] SCOPE: Selective Conformal Optimized Pairwise LLM Judging
LLMs

The paper presents SCOPE, a framework for selective pairwise evaluation using large language models (LLMs) that improves judgment accurac...

arXiv - AI · 4 min ·
[2602.13088] How cyborg propaganda reshapes collective action
Computer Vision

This paper explores the emergence of 'cyborg propaganda,' where human and AI collaboration reshapes collective action, blurring lines bet...

arXiv - AI · 4 min ·
[2602.13087] EXCODER: EXplainable Classification Of DiscretE time series Representations
Machine Learning

The paper explores EXCODER, a method for explainable classification of discrete time series representations, enhancing interpretability w...

arXiv - Machine Learning · 4 min ·
[2602.13061] Diverging Flows: Detecting Extrapolations in Conditional Generation
Machine Learning

The paper introduces Diverging Flows, a method for detecting extrapolations in conditional generation models, enhancing safety in applica...

arXiv - Machine Learning · 3 min ·
[2602.13055] Curriculum-DPO++: Direct Preference Optimization via Data and Model Curricula for Text-to-Image Generation
Machine Learning

The paper presents Curriculum-DPO++, an advanced method for text-to-image generation that optimizes preference learning through a dual cu...

arXiv - Machine Learning · 4 min ·
[2602.13047] Can we trust AI to detect healthy multilingual English speakers among the cognitively impaired cohort in the UK? An investigation using real-world conversational speech
Machine Learning

This study investigates the reliability of AI in detecting cognitive impairment among multilingual English speakers in the UK, revealing ...

arXiv - AI · 4 min ·
[2602.13033] Buy versus Build an LLM: A Decision Framework for Governments
LLMs

This paper presents a strategic framework for governments to decide between buying or building large language models (LLMs) for public se...

arXiv - AI · 4 min ·
[2602.13017] Synaptic Activation and Dual Liquid Dynamics for Interpretable Bio-Inspired Models
Machine Learning

This paper presents a unified framework for bio-inspired models that enhances interpretability in recurrent neural networks (RNNs) throug...

arXiv - Machine Learning · 3 min ·
[2602.12983] Detecting Object Tracking Failure via Sequential Hypothesis Testing
Machine Learning

This paper presents a method for detecting object tracking failures using sequential hypothesis testing, enhancing safety in computer vis...

arXiv - AI · 4 min ·
[2602.12975] Extending confidence calibration to generalised measures of variation
Machine Learning

The paper introduces the Variation Calibration Error (VCE) metric, extending confidence calibration methods in machine learning to assess...

arXiv - Machine Learning · 3 min ·
[2602.12968] RGAlign-Rec: Ranking-Guided Alignment for Latent Query Reasoning in Recommendation Systems
LLMs

The RGAlign-Rec framework enhances proactive intent prediction in e-commerce chatbots by aligning latent query reasoning with ranking obj...

arXiv - AI · 4 min ·
[2602.12917] Ultrasound-Guided Real-Time Spinal Motion Visualization for Spinal Instability Assessment
Data Science

This article presents a novel ultrasound-guided method for real-time 3D visualization of spinal motion to assess spinal instability, aimi...

arXiv - AI · 4 min ·
[2602.12902] Robustness of Object Detection of Autonomous Vehicles in Adverse Weather Conditions
Machine Learning

This paper evaluates the robustness of object detection models used in autonomous vehicles under adverse weather conditions, proposing a ...

arXiv - Machine Learning · 4 min ·
[2602.12892] RADAR: Revealing Asymmetric Development of Abilities in MLLM Pre-training
LLMs

The paper presents RADAR, a novel evaluation framework for Multi-modal Large Language Models (MLLMs) that addresses performance bottlenec...

arXiv - AI · 4 min ·
[2602.12873] Knowledge-Based Design Requirements for Generative Social Robots in Higher Education
LLMs

The article explores design requirements for generative social robots in higher education, emphasizing the need for knowledge-based frame...

arXiv - AI · 3 min ·
