AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

Machine Learning

[2603.14267] DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization

Abstract page for arXiv paper 2603.14267: DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and ...

arXiv - AI · 4 min · about 4 hours ago

LLMs

[2601.22440] AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations

Abstract page for arXiv paper 2601.22440: AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Value...

arXiv - AI · 4 min · about 4 hours ago

LLMs

[2601.13622] CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models

Abstract page for arXiv paper 2601.13622: CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language...

arXiv - AI · 3 min · about 4 hours ago

All Content

Machine Learning

[2603.02475] Large-Scale Dataset and Benchmark for Skin Tone Classification in the Wild

Abstract page for arXiv paper 2603.02475: Large-Scale Dataset and Benchmark for Skin Tone Classification in the Wild

arXiv - Machine Learning · 4 min · 26 days ago

Machine Learning

[2603.02767] ITO: Images and Texts as One via Synergizing Multiple Alignment and Training-Time Fusion

Abstract page for arXiv paper 2603.02767: ITO: Images and Texts as One via Synergizing Multiple Alignment and Training-Time Fusion

arXiv - AI · 3 min · 26 days ago

LLMs

[2603.02676] ITLC at SemEval-2026 Task 11: Normalization and Deterministic Parsing for Formal Reasoning in LLMs

Abstract page for arXiv paper 2603.02676: ITLC at SemEval-2026 Task 11: Normalization and Deterministic Parsing for Formal Reasoning in LLMs

arXiv - AI · 3 min · 26 days ago

Robotics

[2603.02640] Credibility Governance: A Social Mechanism for Collective Self-Correction under Weak Truth Signals

Abstract page for arXiv paper 2603.02640: Credibility Governance: A Social Mechanism for Collective Self-Correction under Weak Truth Signals

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2603.02259] The Alignment Flywheel: A Governance-Centric Hybrid MAS for Architecture-Agnostic Safety

Abstract page for arXiv paper 2603.02259: The Alignment Flywheel: A Governance-Centric Hybrid MAS for Architecture-Agnostic Safety

arXiv - Machine Learning · 4 min · 26 days ago

LLMs

[2603.02557] CAPT: Confusion-Aware Prompt Tuning for Reducing Vision-Language Misalignment

Abstract page for arXiv paper 2603.02557: CAPT: Confusion-Aware Prompt Tuning for Reducing Vision-Language Misalignment

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2603.03226] Adaptive Methods Are Preferable in High Privacy Settings: An SDE Perspective

Abstract page for arXiv paper 2603.03226: Adaptive Methods Are Preferable in High Privacy Settings: An SDE Perspective

arXiv - Machine Learning · 3 min · 26 days ago

LLMs

[2603.03206] Understanding and Mitigating Dataset Corruption in LLM Steering

Abstract page for arXiv paper 2603.03206: Understanding and Mitigating Dataset Corruption in LLM Steering

arXiv - AI · 4 min · 26 days ago

LLMs

[2603.02420] Slurry-as-a-Service: A Modest Proposal on Scalable Pluralistic Alignment for Nutrient Optimization

Abstract page for arXiv paper 2603.02420: Slurry-as-a-Service: A Modest Proposal on Scalable Pluralistic Alignment for Nutrient Optimization

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2603.03106] Multi-Scale Adaptive Neighborhood Awareness Transformer For Graph Fraud Detection

Abstract page for arXiv paper 2603.03106: Multi-Scale Adaptive Neighborhood Awareness Transformer For Graph Fraud Detection

arXiv - AI · 3 min · 26 days ago

Machine Learning

[2603.03022] SEHFS: Structural Entropy-Guided High-Order Correlation Learning for Multi-View Multi-Label Feature Selection

Abstract page for arXiv paper 2603.03022: SEHFS: Structural Entropy-Guided High-Order Correlation Learning for Multi-View Multi-Label Fea...

arXiv - Machine Learning · 4 min · 26 days ago

AI Safety

[2603.03007] Breaking the Prototype Bias Loop: Confidence-Aware Federated Contrastive Learning for Highly Imbalanced Clients

Abstract page for arXiv paper 2603.03007: Breaking the Prototype Bias Loop: Confidence-Aware Federated Contrastive Learning for Highly Im...

arXiv - Machine Learning · 4 min · 26 days ago

Machine Learning

[2603.02957] Leveraging Label Proportion Prior for Class-Imbalanced Semi-Supervised Learning

Abstract page for arXiv paper 2603.02957: Leveraging Label Proportion Prior for Class-Imbalanced Semi-Supervised Learning

arXiv - Machine Learning · 3 min · 26 days ago

LLMs

[2603.02938] Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large Language Models

Abstract page for arXiv paper 2603.02938: Beyond One-Size-Fits-All: Adaptive Subgraph Denoising for Zero-Shot Graph Learning with Large L...

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2603.02934] On the Structural Limitations of Weight-Based Neural Adaptation and the Role of Reversible Behavioral Learning

Abstract page for arXiv paper 2603.02934: On the Structural Limitations of Weight-Based Neural Adaptation and the Role of Reversible Beha...

arXiv - AI · 4 min · 26 days ago

AI Safety

[2603.02846] Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling

Abstract page for arXiv paper 2603.02846: Learning Memory-Enhanced Improvement Heuristics for Flexible Job Shop Scheduling

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2603.02765] Next Embedding Prediction Makes World Models Stronger

Abstract page for arXiv paper 2603.02765: Next Embedding Prediction Makes World Models Stronger

arXiv - AI · 3 min · 26 days ago

AI Safety

[2603.02756] Rethinking Time Series Domain Generalization via Structure-Stratified Calibration

Abstract page for arXiv paper 2603.02756: Rethinking Time Series Domain Generalization via Structure-Stratified Calibration

arXiv - Machine Learning · 3 min · 26 days ago

Machine Learning

[2603.02212] GLEAN: Grounded Lightweight Evaluation Anchors for Contamination-Aware Tabular Reasoning

Abstract page for arXiv paper 2603.02212: GLEAN: Grounded Lightweight Evaluation Anchors for Contamination-Aware Tabular Reasoning

arXiv - AI · 3 min · 26 days ago

LLMs

[2603.02675] From Shallow to Deep: Pinning Semantic Intent via Causal GRPO

Abstract page for arXiv paper 2603.02675: From Shallow to Deep: Pinning Semantic Intent via Causal GRPO

arXiv - Machine Learning · 3 min · 26 days ago
