AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

Machine Learning

[2511.21331] The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment

arXiv - AI · 4 min · about 5 hours ago

LLMs

[2509.22367] What Is The Political Content in LLMs' Pre- and Post-Training Data?

arXiv - AI · 4 min · about 5 hours ago

Machine Learning

[2507.22264] SmartCLIP: Modular Vision-language Alignment with Identification Guarantees

arXiv - AI · 4 min · about 5 hours ago

All Content

LLMs

Super Intelligence is a Lie

The article argues that while superintelligence may not be a total myth, current AI technologies are far from achieving it, questioning t...

Reddit - Artificial Intelligence · 1 min · about 2 months ago

AI Safety

How the FCC stopped CBS and Stephen Colbert

The Vergecast discusses the FCC's chilling impact on late-night TV, focusing on CBS's decision to prevent Stephen Colbert from airing an ...

The Verge - AI · 5 min · about 2 months ago

AI Startups

Money no longer matters to AI’s top talent

The article discusses the current dynamics in the AI job market, highlighting how top talent is increasingly motivated by ideological mis...

The Verge - AI · 5 min · about 2 months ago

Machine Learning

[P] SoftDTW-CUDA for PyTorch package: fast + memory-efficient Soft Dynamic Time Warping with CUDA support

The SoftDTW-CUDA for PyTorch package offers a fast and memory-efficient implementation of Soft Dynamic Time Warping, optimized for GPU us...

Reddit - Machine Learning · 1 min · about 2 months ago

AI Startups

Altman and Amodei share a moment of awkwardness at India's big AI summit

At the India AI Impact Summit, OpenAI's Sam Altman and Anthropic's Dario Amodei notably refrained from joining a unity gesture led by Pri...

TechCrunch - AI · 4 min · about 2 months ago

Open Source AI

For open-source programs, AI coding tools are a mixed blessing

The article discusses the dual impact of AI coding tools on open-source software, highlighting both the ease of feature development and t...

TechCrunch - AI · 7 min · about 2 months ago

AI Agents

Open-source benchmark EVMbench tests how well AI agents handle smart contract exploits

EVMbench is an open-source benchmark developed by OpenAI and Paradigm to evaluate AI agents' capabilities in handling smart contract secu...

Reddit - Artificial Intelligence · 1 min · about 2 months ago

Robotics

The Download: Autonomous narco submarines, and virtue signaling chatbots

This edition of The Download covers advancements in autonomous narco submarines, ethical concerns surrounding AI chatbots, and the evolvi...

MIT Technology Review · 7 min · about 2 months ago

Machine Learning

[2602.10531] From Collapse to Improvement: Statistical Perspectives on the Evolutionary Dynamics of Iterative Training on Contaminated Sources

This paper explores the dynamics of iterative training on contaminated data sources, demonstrating that model performance can improve des...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2512.03310] Randomized Masked Finetuning: An Efficient Way to Mitigate Memorization of PIIs in LLMs

The paper introduces Randomized Masked Finetuning (RMFT), a technique designed to reduce the memorization of personally identifiable info...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.12281] Scaling Verification Can Be More Effective than Scaling Policy Learning for Vision-Language-Action Alignment

This paper explores the effectiveness of test-time verification over policy learning in enhancing Vision-Language-Action (VLA) alignment,...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.07680] Vision and Language: Novel Representations and Artificial intelligence for Driving Scene Safety Assessment and Autonomous Vehicle Planning

This paper explores the integration of vision-language models in autonomous driving, focusing on safety assessment and decision-making th...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.05023] Do Vision-Language Models Respect Contextual Integrity in Location Disclosure?

This article examines whether vision-language models (VLMs) respect contextual integrity when disclosing location information, highlighti...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2412.10537] VerifiableFL: Verifiable Claims for Federated Learning using Exclaves

The paper presents VerifiableFL, a system for federated learning that ensures verifiable claims about model training using exclaves, enha...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2512.00036] Refined Bayesian Optimization for Efficient Beam Alignment in Intelligent Indoor Wireless Environments

This article presents a refined Bayesian optimization framework for efficient beam alignment in intelligent indoor wireless environments,...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2511.04694] Reasoning Up the Instruction Ladder for Controllable Language Models

This paper explores the importance of instruction hierarchy in large language models (LLMs) for enhancing their controllability and relia...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.10956] Stochastic Parroting in Temporal Attention -- Regulating the Diagonal Sink

The paper explores the challenges of spatio-temporal models in machine learning, focusing on biases in temporal attention mechanisms and ...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2602.10067] Features as Rewards: Scalable Supervision for Open-Ended Tasks via Interpretability

The paper introduces a novel approach to using features as rewards in reinforcement learning for open-ended tasks, focusing on reducing h...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.05139] Adaptive Exploration for Latent-State Bandits

The paper presents adaptive exploration strategies for latent-state bandits, addressing challenges in reward estimation and action select...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2509.19680] PolicyPad: Collaborative Prototyping of LLM Policies

The article presents PolicyPad, an interactive system designed for collaborative prototyping of policies governing large language models ...

arXiv - AI · 3 min · about 2 months ago

