AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

Machine Learning

[P] If you're building AI agents, logs aren't enough. You need evidence.

I have built a programmable governance layer for AI agents and am considering open-sourcing it completely. Looking for feedback. Agent demos...

Reddit - Machine Learning · 1 min ·
[2510.14628] RLAIF-SPA: Structured AI Feedback for Semantic-Prosodic Alignment in Speech Synthesis
AI Safety

arXiv - AI · 4 min ·
[2504.05995] NativQA Framework: Enabling LLMs and VLMs with Native, Local, and Everyday Knowledge
LLMs

arXiv - AI · 4 min ·

All Content

[2602.14635] Alignment Adapter to Improve the Performance of Compressed Deep Learning Models
Machine Learning

The paper introduces the Alignment Adapter (AlAd), a method to enhance the performance of compressed deep learning models by aligning the...

arXiv - Machine Learning · 3 min ·
[2602.14602] OPBench: A Graph Benchmark to Combat the Opioid Crisis
Machine Learning

OPBench introduces a comprehensive benchmark for evaluating graph learning methods aimed at addressing the opioid crisis, featuring five ...

arXiv - AI · 4 min ·
[2602.13324] Synthesizing the Kill Chain: A Zero-Shot Framework for Target Verification and Tactical Reasoning on the Edge
LLMs

This paper presents a zero-shot framework for target verification and tactical reasoning in autonomous edge robotics, addressing challeng...

arXiv - AI · 4 min ·
[2602.14553] Governing AI Forgetting: Auditing for Machine Unlearning Compliance
Machine Learning

The paper discusses the challenges of ensuring compliance with data deletion requests in AI systems, proposing a novel economic framework...

arXiv - AI · 4 min ·
[2602.13308] Learning to Select Like Humans: Explainable Active Learning for Medical Imaging
Machine Learning

This paper presents an explainable active learning framework for medical imaging that enhances data efficiency and interpretability by in...

arXiv - AI · 4 min ·
[2602.13305] WildfireVLM: AI-powered Analysis for Early Wildfire Detection and Risk Assessment Using Satellite Imagery
Computer Vision

WildfireVLM introduces an AI framework for early wildfire detection and risk assessment using satellite imagery, enhancing disaster manag...

arXiv - AI · 4 min ·
[2602.13304] Progressive Contrast Registration for High-Fidelity Bidirectional Photoacoustic Microscopy Alignment
AI Safety

This article presents PCReg-Net, a novel framework for high-fidelity alignment in bidirectional photoacoustic microscopy, significantly i...

arXiv - AI · 3 min ·
[2602.14462] Silent Inconsistency in Data-Parallel Full Fine-Tuning: Diagnosing Worker-Level Optimization Misalignment
LLMs

This paper explores 'silent inconsistency' in data-parallel fine-tuning of large language models, identifying optimization misalignments ...

arXiv - Machine Learning · 4 min ·
[2602.13303] Spectral Collapse in Diffusion Inversion
Generative AI

The paper discusses 'spectral collapse' in diffusion inversion, highlighting failures in standard deterministic methods for image transla...

arXiv - Machine Learning · 3 min ·
[2602.13291] Agent Mars: Multi-Agent Simulation for Multi-Planetary Life Exploration and Settlement
Robotics

Agent Mars presents a multi-agent simulation framework designed for efficient coordination in Mars base operations, addressing challenges...

arXiv - AI · 4 min ·
[2602.14444] Broken Chains: The Cost of Incomplete Reasoning in LLMs
LLMs

The paper explores the impact of incomplete reasoning in large language models (LLMs), revealing how different reasoning modalities affec...

arXiv - AI · 4 min ·
[2602.14430] A unified framework for evaluating the robustness of machine-learning interpretability for prospect risking
Machine Learning

This article presents a unified framework for evaluating the robustness of machine-learning interpretability, specifically in the context...

arXiv - Machine Learning · 4 min ·
[2602.13286] Explanatory Interactive Machine Learning for Bias Mitigation in Visual Gender Classification
Machine Learning

This article explores Explanatory Interactive Machine Learning (XIL) as a method to mitigate bias in visual gender classification, demons...

arXiv - Machine Learning · 4 min ·
[2602.13284] Agents in the Wild: Safety, Society, and the Illusion of Sociality on Moltbook
AI Safety

This article presents a large-scale study of Moltbook, an AI-only social platform, revealing how AI agents create complex social structur...

arXiv - AI · 3 min ·
[2602.14351] WIMLE: Uncertainty-Aware World Models with IMLE for Sample-Efficient Continuous Control
Machine Learning

The paper presents WIMLE, a model-based reinforcement learning method that enhances sample efficiency by addressing model errors and unce...

arXiv - AI · 4 min ·
[2602.14322] Conformal Signal Temporal Logic for Robust Reinforcement Learning Control: A Case Study
Machine Learning

This article explores the integration of Conformal Signal Temporal Logic (CSTL) in reinforcement learning (RL) for enhancing safety and r...

arXiv - Machine Learning · 4 min ·
[2602.13253] Implicit Bias in LLMs for Transgender Populations
LLMs

This article examines implicit biases in large language models (LLMs) against transgender populations, highlighting disparities in health...

arXiv - AI · 4 min ·
[2602.14318] In Transformer We Trust? A Perspective on Transformer Architecture Failure Modes
Machine Learning

The paper examines the trustworthiness of transformer architectures in high-stakes applications, analyzing their reliability, interpretab...

arXiv - Machine Learning · 4 min ·
[2602.13246] Global AI Bias Audit for Technical Governance
LLMs

This article discusses a global audit of Large Language Models (LLMs) focusing on geographic and socioeconomic biases in AI governance, h...

arXiv - AI · 4 min ·
[2602.13244] Responsible AI in Business
Machine Learning

The paper discusses the concept of Responsible AI in business, focusing on its implementation in small and medium-sized enterprises. It c...

arXiv - AI · 4 min ·