AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

Washington needs AI guardrails — now | Opinion
AI Safety

We need legislation that draws clear lines on what AI systems may and may not do on behalf of the United States government.

AI Tools & Products · 3 min
[2601.12910] SciCoQA: Quality Assurance for Scientific Paper–Code Alignment
AI Safety

Abstract page for arXiv paper 2601.12910: SciCoQA: Quality Assurance for Scientific Paper–Code Alignment

arXiv - AI · 3 min
[2509.21385] Debugging Concept Bottleneck Models through Removal and Retraining
Machine Learning

Abstract page for arXiv paper 2509.21385: Debugging Concept Bottleneck Models through Removal and Retraining

arXiv - Machine Learning · 4 min

All Content

Washington needs AI guardrails — now | Opinion
AI Safety

We need legislation that draws clear lines on what AI systems may and may not do on behalf of the United States government.

AI Tools & Products · 3 min
[2601.12910] SciCoQA: Quality Assurance for Scientific Paper–Code Alignment
AI Safety

Abstract page for arXiv paper 2601.12910: SciCoQA: Quality Assurance for Scientific Paper–Code Alignment

arXiv - AI · 3 min
[2509.21385] Debugging Concept Bottleneck Models through Removal and Retraining
Machine Learning

Abstract page for arXiv paper 2509.21385: Debugging Concept Bottleneck Models through Removal and Retraining

arXiv - Machine Learning · 4 min
[2512.00804] Epistemic Bias Injection: Biasing LLMs via Selective Context Retrieval
LLMs

Abstract page for arXiv paper 2512.00804: Epistemic Bias Injection: Biasing LLMs via Selective Context Retrieval

arXiv - AI · 4 min
[2509.24296] DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models
LLMs

Abstract page for arXiv paper 2509.24296: DiffuGuard: How Intrinsic Safety is Lost and Found in Diffusion Large Language Models

arXiv - AI · 4 min
[2410.13874] Chain-Oriented Objective Logic with Neural Network Feedback Control and Cascade Filtering for Dynamic Multi-DSL Regulation
Machine Learning

Abstract page for arXiv paper 2410.13874: Chain-Oriented Objective Logic with Neural Network Feedback Control and Cascade Filtering for Dynamic Multi-DSL Regulation

arXiv - Machine Learning · 4 min
[2404.05290] MindSet: Vision. A toolbox for testing DNNs on key psychological experiments
Machine Learning

Abstract page for arXiv paper 2404.05290: MindSet: Vision. A toolbox for testing DNNs on key psychological experiments

arXiv - AI · 4 min
[2512.07885] ByteStorm: a multi-step data-driven approach for Tropical Cyclones detection and tracking
AI Safety

Abstract page for arXiv paper 2512.07885: ByteStorm: a multi-step data-driven approach for Tropical Cyclones detection and tracking

arXiv - Machine Learning · 4 min
[2511.16992] FIRM: Federated In-client Regularized Multi-objective Alignment for Large Language Models
LLMs

Abstract page for arXiv paper 2511.16992: FIRM: Federated In-client Regularized Multi-objective Alignment for Large Language Models

arXiv - Machine Learning · 4 min
[2509.15199] CausalPre: Scalable and Effective Data Pre-Processing for Causal Fairness
Machine Learning

Abstract page for arXiv paper 2509.15199: CausalPre: Scalable and Effective Data Pre-Processing for Causal Fairness

arXiv - Machine Learning · 4 min
[2603.25613] Demographic Fairness in Multimodal LLMs: A Benchmark of Gender and Ethnicity Bias in Face Verification
LLMs

Abstract page for arXiv paper 2603.25613: Demographic Fairness in Multimodal LLMs: A Benchmark of Gender and Ethnicity Bias in Face Verification

arXiv - AI · 4 min
[2603.25740] Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving
Machine Learning

Abstract page for arXiv paper 2603.25740: Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving

arXiv - Machine Learning · 4 min
[2603.25423] From Manipulation to Mistrust: Explaining Diverse Micro-Video Misinformation for Robust Debunking in the Wild
Machine Learning

Abstract page for arXiv paper 2603.25423: From Manipulation to Mistrust: Explaining Diverse Micro-Video Misinformation for Robust Debunking in the Wild

arXiv - AI · 4 min
[2603.25466] Residual-as-Teacher: Mitigating Bias Propagation in Student–Teacher Estimation
Machine Learning

Abstract page for arXiv paper 2603.25466: Residual-as-Teacher: Mitigating Bias Propagation in Student–Teacher Estimation

arXiv - Machine Learning · 3 min
[2603.25145] Learning to Rank Caption Chains for Video-Text Alignment
LLMs

Abstract page for arXiv paper 2603.25145: Learning to Rank Caption Chains for Video-Text Alignment

arXiv - Machine Learning · 3 min
[2603.25150] Goodness-of-pronunciation without phoneme time alignment
Machine Learning

Abstract page for arXiv paper 2603.25150: Goodness-of-pronunciation without phoneme time alignment

arXiv - Machine Learning · 3 min
[2603.25140] SAVe: Self-Supervised Audio-visual Deepfake Detection Exploiting Visual Artifacts and Audio-visual Misalignment
Robotics

Abstract page for arXiv paper 2603.25140: SAVe: Self-Supervised Audio-visual Deepfake Detection Exploiting Visual Artifacts and Audio-visual Misalignment

arXiv - Machine Learning · 3 min
[2603.24986] Rethinking Health Agents: From Siloed AI to Collaborative Decision Mediators
LLMs

Abstract page for arXiv paper 2603.24986: Rethinking Health Agents: From Siloed AI to Collaborative Decision Mediators

arXiv - AI · 3 min
[2603.24965] Self-Corrected Image Generation with Explainable Latent Rewards
Generative AI

Abstract page for arXiv paper 2603.24965: Self-Corrected Image Generation with Explainable Latent Rewards

arXiv - AI · 3 min
[2603.24914] Shaping the Future of Mathematics in the Age of AI
AI Safety

Abstract page for arXiv paper 2603.24914: Shaping the Future of Mathematics in the Age of AI

arXiv - AI · 3 min