AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

[2603.14267] DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and Synchronization
Machine Learning

Abstract page for arXiv paper 2603.14267: DiFlowDubber: Discrete Flow Matching for Automated Video Dubbing via Cross-Modal Alignment and ...

arXiv - AI · 4 min ·
[2601.22440] AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations
LLMs

Abstract page for arXiv paper 2601.22440: AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Value...

arXiv - AI · 4 min ·
[2601.13622] CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language Models
LLMs

Abstract page for arXiv paper 2601.13622: CARPE: Context-Aware Image Representation Prioritization via Ensemble for Large Vision-Language...

arXiv - AI · 3 min ·

All Content

Anthropic CEO Dario Amodei calls OpenAI's messaging around military deal 'straight up lies,' report says | TechCrunch
AI Safety

Anthropic gave up its contract with the Pentagon over AI safety disagreements; then OpenAI swooped in.

TechCrunch - AI · 5 min ·
NLP

Using AI With Deep Knowledge From 37 Academic Books Using Graph RAG to Make 9, Well-Informed Predictions About Our Future. The Analysis is...Bleak.

I'm using this specialized canvas app that lets me build the neurological brain of a chatbot based on connected notes. I added and connec...

Reddit - Artificial Intelligence · 1 min ·
Anthropic’s Break With the Pentagon Ignites AI Ethics Debate
AI Safety

AI Tools & Products · 12 min ·
[2602.04288] Contextual Drag: How Errors in the Context Affect LLM Reasoning
LLMs

Abstract page for arXiv paper 2602.04288: Contextual Drag: How Errors in the Context Affect LLM Reasoning

arXiv - Machine Learning · 3 min ·
[2511.12832] From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation
LLMs

Abstract page for arXiv paper 2511.12832: From Passive to Persuasive: Steering Emotional Nuance in Human-AI Negotiation

arXiv - AI · 3 min ·
[2510.13900] Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences
LLMs

Abstract page for arXiv paper 2510.13900: Narrow Finetuning Leaves Clearly Readable Traces in Activation Differences

arXiv - AI · 4 min ·
[2512.05116] Value Gradient Guidance for Flow Matching Alignment
Machine Learning

Abstract page for arXiv paper 2512.05116: Value Gradient Guidance for Flow Matching Alignment

arXiv - Machine Learning · 3 min ·
[2507.01352] Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy
Machine Learning

Abstract page for arXiv paper 2507.01352: Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

arXiv - Machine Learning · 4 min ·
[2506.17871] LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
LLMs

Abstract page for arXiv paper 2506.17871: LLM Probability Concentration: How Alignment Shrinks the Generative Horizon

arXiv - Machine Learning · 4 min ·
[2510.08646] Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy
LLMs

Abstract page for arXiv paper 2510.08646: Mitigating Over-Refusal in Aligned Large Language Models via Inference-Time Activation Energy

arXiv - Machine Learning · 4 min ·
[2509.23265] CREPE: Controlling Diffusion with Replica Exchange
Machine Learning

Abstract page for arXiv paper 2509.23265: CREPE: Controlling Diffusion with Replica Exchange

arXiv - Machine Learning · 3 min ·
[2506.01153] Weight-Space Linear Recurrent Neural Networks
Machine Learning

Abstract page for arXiv paper 2506.01153: Weight-Space Linear Recurrent Neural Networks

arXiv - Machine Learning · 4 min ·
[2505.18996] Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs
Machine Learning

Abstract page for arXiv paper 2505.18996: Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs

arXiv - Machine Learning · 3 min ·
[2505.12506] Unsupervised Representation Learning -- an Invariant Risk Minimization Perspective
AI Safety

Abstract page for arXiv paper 2505.12506: Unsupervised Representation Learning -- an Invariant Risk Minimization Perspective

arXiv - AI · 3 min ·
[2505.00940] StablePCA: Distributionally Robust Learning of Representations from Multi-Source Data
AI Safety

Abstract page for arXiv paper 2505.00940: StablePCA: Distributionally Robust Learning of Representations from Multi-Source Data

arXiv - Machine Learning · 4 min ·
[2603.03281] CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance
Machine Learning

Abstract page for arXiv paper 2603.03281: CFG-Ctrl: Control-Based Classifier-Free Diffusion Guidance

arXiv - Machine Learning · 4 min ·
[2603.02984] Variance reduction in lattice QCD observables via normalizing flows
AI Safety

Abstract page for arXiv paper 2603.02984: Variance reduction in lattice QCD observables via normalizing flows

arXiv - Machine Learning · 3 min ·
[2603.02937] Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection
AI Safety

Abstract page for arXiv paper 2603.02937: Bias and Fairness in Self-Supervised Acoustic Representations for Cognitive Impairment Detection

arXiv - Machine Learning · 4 min ·
[2603.03094] Proactive Guiding Strategy for Item-side Fairness in Interactive Recommendation
AI Safety

Abstract page for arXiv paper 2603.03094: Proactive Guiding Strategy for Item-side Fairness in Interactive Recommendation

arXiv - AI · 3 min ·
[2603.02639] Convex and Non-convex Federated Learning with Stale Stochastic Gradients: Diminishing Step Size is All You Need
Machine Learning

Abstract page for arXiv paper 2603.02639: Convex and Non-convex Federated Learning with Stale Stochastic Gradients: Diminishing Step Size...

arXiv - Machine Learning · 3 min ·