AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

Machine Learning

[2603.14267] DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization

Abstract page for arXiv paper 2603.14267: DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and ...

arXiv - AI · 4 min · about 5 hours ago

LLMs

[2601.22440] AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations

Abstract page for arXiv paper 2601.22440: AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Value...

arXiv - AI · 4 min · about 5 hours ago

LLMs

[2601.13622] CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models

Abstract page for arXiv paper 2601.13622: CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language...

arXiv - AI · 3 min · about 5 hours ago

All Content

LLMs

[2504.21023] ParamΔ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

Abstract page for arXiv paper 2504.21023: ParamΔ for Direct Weight Mixing: Post-Train Large Language Model at Zero Cost

arXiv - AI · 4 min · 26 days ago

LLMs

[2603.02635] SaFeR-ToolKit: Structured Reasoning via Virtual Tool Calling for Multimodal Safety

Abstract page for arXiv paper 2603.02635: SaFeR-ToolKit: Structured Reasoning via Virtual Tool Calling for Multimodal Safety

arXiv - Machine Learning · 4 min · 26 days ago

LLMs

[2603.03242] Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals

Abstract page for arXiv paper 2603.03242: Density-Guided Response Optimization: Community-Grounded Alignment via Implicit Acceptance Signals

arXiv - AI · 4 min · 26 days ago

AI Safety

[2603.02622] Implicit Bias in Deep Linear Discriminant Analysis

Abstract page for arXiv paper 2603.02622: Implicit Bias in Deep Linear Discriminant Analysis

arXiv - Machine Learning · 3 min · 26 days ago

Machine Learning

[2603.03119] AI Space Physics: Constitutive boundary semantics for open AI institutions

Abstract page for arXiv paper 2603.03119: AI Space Physics: Constitutive boundary semantics for open AI institutions

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2603.02525] Thermodynamic Regulation of Finite-Time Gibbs Training in Energy-Based Models: A Restricted Boltzmann Machine Study

Abstract page for arXiv paper 2603.02525: Thermodynamic Regulation of Finite-Time Gibbs Training in Energy-Based Models: A Restricted Bol...

arXiv - Machine Learning · 4 min · 26 days ago

LLMs

[2603.02482] MUSE: A Run-Centric Platform for Multimodal Unified Safety Evaluation of Large Language Models

Abstract page for arXiv paper 2603.02482: MUSE: A Run-Centric Platform for Multimodal Unified Safety Evaluation of Large Language Models

arXiv - Machine Learning · 4 min · 26 days ago

Machine Learning

[2603.02348] Diffusion-MPC in Discrete Domains: Feasibility Constraints, Horizon Effects, and Critic Alignment: Case study with Tetris

Abstract page for arXiv paper 2603.02348: Diffusion-MPC in Discrete Domains: Feasibility Constraints, Horizon Effects, and Critic Alignme...

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2603.02337] Preconditioned Score and Flow Matching

Abstract page for arXiv paper 2603.02337: Preconditioned Score and Flow Matching

arXiv - AI · 3 min · 26 days ago

Machine Learning

[2603.02711] A Natural Language Agentic Approach to Study Affective Polarization

Abstract page for arXiv paper 2603.02711: A Natural Language Agentic Approach to Study Affective Polarization

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2603.02280] Temporal Imbalance of Positive and Negative Supervision in Class-Incremental Learning

Abstract page for arXiv paper 2603.02280: Temporal Imbalance of Positive and Negative Supervision in Class-Incremental Learning

arXiv - AI · 4 min · 26 days ago

AI Safety

AI companies are spending millions to thwart this former tech exec’s congressional bid | TechCrunch

A tech billionaire-backed super PAC is spending $125 million to undercut candidates pushing for AI regulation. New York's Alex Bores, a f...

TechCrunch - AI · 7 min · 27 days ago

LLMs

[2510.20095] BioCAP: Exploiting Synthetic Captions Beyond Labels in Biological Foundation Models

Abstract page for arXiv paper 2510.20095: BioCAP: Exploiting Synthetic Captions Beyond Labels in Biological Foundation Models

arXiv - Machine Learning · 4 min · 27 days ago

Machine Learning

[2509.12490] SamudrACE: Fast and Accurate Coupled Climate Modeling with 3D Ocean and Atmosphere Emulators

Abstract page for arXiv paper 2509.12490: SamudrACE: Fast and Accurate Coupled Climate Modeling with 3D Ocean and Atmosphere Emulators

arXiv - Machine Learning · 4 min · 27 days ago

AI Safety

[2410.04264] Decoupling Dynamical Richness from Representation Learning: Towards Practical Measurement

Abstract page for arXiv paper 2410.04264: Decoupling Dynamical Richness from Representation Learning: Towards Practical Measurement

arXiv - Machine Learning · 3 min · 27 days ago

LLMs

[2602.00428] When Agents "Misremember" Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems

Abstract page for arXiv paper 2602.00428: When Agents "Misremember" Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent S...

arXiv - AI · 4 min · 27 days ago

LLMs

[2602.02742] Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding

Abstract page for arXiv paper 2602.02742: Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding

arXiv - Machine Learning · 3 min · 27 days ago

LLMs

[2601.20838] Reward Models Inherit Value Biases from Pretraining

Abstract page for arXiv paper 2601.20838: Reward Models Inherit Value Biases from Pretraining

arXiv - Machine Learning · 4 min · 27 days ago

Machine Learning

[2601.04378] Aligned explanations in neural networks

Abstract page for arXiv paper 2601.04378: Aligned explanations in neural networks

arXiv - Machine Learning · 3 min · 27 days ago

Generative AI

[2510.26818] GACA-DiT: Diffusion-based Dance-to-Music Generation with Genre-Adaptive Rhythm and Context-Aware Alignment

Abstract page for arXiv paper 2510.26818: GACA-DiT: Diffusion-based Dance-to-Music Generation with Genre-Adaptive Rhythm and Context-Awar...

arXiv - AI · 4 min · 27 days ago
