AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

[2603.14267] DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization
Machine Learning

Abstract page for arXiv paper 2603.14267: DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and ...

arXiv - AI · 4 min ·
[2601.22440] AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations
LLMs

Abstract page for arXiv paper 2601.22440: AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Value...

arXiv - AI · 4 min ·
[2601.13622] CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models
LLMs

Abstract page for arXiv paper 2601.13622: CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language...

arXiv - AI · 3 min ·

All Content

Anthropic CEO Dario Amodei calls OpenAI's messaging around military deal 'straight up lies,' report says | TechCrunch
AI Safety

Anthropic gave up its contract with the Pentagon over AI safety disagreements; then OpenAI swooped in.

TechCrunch - AI · 5 min ·
NLP

Using AI With Deep Knowledge From 37 Academic Books Using Graph RAG to Make 9, Well-Informed Predictions About Our Future. The Analysis is...Bleak.

I'm using this specialized canvas app that lets me build the neurological brain of a chatbot based on connected notes. I added and connec...

Reddit - Artificial Intelligence · 1 min ·
Anthropic’s Break With the Pentagon Ignites AI Ethics Debate
AI Safety

AI Tools & Products · 12 min ·
[2602.04288] Contextual Drag: How Errors in the Context Affect LLM Reasoning
LLMs

Abstract page for arXiv paper 2602.04288: Contextual Drag: How Errors in the Context Affect LLM Reasoning

arXiv - Machine Learning · 3 min ·
[2511.12832] From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation
LLMs

Abstract page for arXiv paper 2511.12832: From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation

arXiv - AI · 3 min ·
[2510.13900] Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences
LLMs

Abstract page for arXiv paper 2510.13900: Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences

arXiv - AI · 4 min ·
[2512.05116] Value Gradient Guidance for Flow Matching Alignment
Machine Learning

Abstract page for arXiv paper 2512.05116: Value Gradient Guidance for Flow Matching Alignment

arXiv - Machine Learning · 3 min ·
[2507.01352] Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy
Machine Learning

Abstract page for arXiv paper 2507.01352: Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

arXiv - Machine Learning · 4 min ·
[2506.17871] LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
LLMs

Abstract page for arXiv paper 2506.17871: LLM Probability Concentration: How Alignment Shrinks the Generative Horizon

arXiv - Machine Learning · 4 min ·
[2510.08646] Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy
LLMs

Abstract page for arXiv paper 2510.08646: Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

arXiv - Machine Learning · 4 min ·
[2509.23265] CREPE: Controlling Diffusion with Replica Exchange
Machine Learning

Abstract page for arXiv paper 2509.23265: CREPE: Controlling Diffusion with Replica Exchange

arXiv - Machine Learning · 3 min ·
[2506.01153] Weight-Space Linear Recurrent Neural Networks
Machine Learning

Abstract page for arXiv paper 2506.01153: Weight-Space Linear Recurrent Neural Networks

arXiv - Machine Learning · 4 min ·
[2505.18996] Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs
Machine Learning

Abstract page for arXiv paper 2505.18996: Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs

arXiv - Machine Learning · 3 min ·
[2505.12506] Unsupervised Representation Learning -- an Invariant Risk Minimization Perspective
AI Safety

Abstract page for arXiv paper 2505.12506: Unsupervised Representation Learning -- an Invariant Risk Minimization Perspective

arXiv - AI · 3 min ·
[2505.00940] StablePCA: Distributionally Robust Learning of Representations from Multi-Source Data
AI Safety

Abstract page for arXiv paper 2505.00940: StablePCA: Distributionally Robust Learning of Representations from Multi-Source Data

arXiv - Machine Learning · 4 min ·
[2603.03281] CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance
Machine Learning

Abstract page for arXiv paper 2603.03281: CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance

arXiv - Machine Learning · 4 min ·
[2603.02984] Variance reduction in lattice QCD observables via normalizing flows
AI Safety

Abstract page for arXiv paper 2603.02984: Variance reduction in lattice QCD observables via normalizing flows

arXiv - Machine Learning · 3 min ·
[2603.02937] Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection
AI Safety

Abstract page for arXiv paper 2603.02937: Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection

arXiv - Machine Learning · 4 min ·
[2603.03094] Proactive Guiding Strategy for Item-side Fairness in Interactive Recommendation
AI Safety

Abstract page for arXiv paper 2603.03094: Proactive Guiding Strategy for Item-side Fairness in Interactive Recommendation

arXiv - AI · 3 min ·
[2603.02639] Convex and Non-convex Federated Learning with Stale Stochastic Gradients: Diminishing Step Size is All You Need
Machine Learning

Abstract page for arXiv paper 2603.02639: Convex and Non-convex Federated Learning with Stale Stochastic Gradients: Diminishing Step Size...

arXiv - Machine Learning · 3 min ·