AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

[2511.21331] The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment
Machine Learning

arXiv - AI · 4 min
[2509.22367] What Is The Political Content in LLMs' Pre- and Post-Training Data?
LLMs

arXiv - AI · 4 min
[2507.22264] SmartCLIP: Modular Vision-language Alignment with Identification Guarantees
Machine Learning

arXiv - AI · 4 min

All Content

LLMs

Super Intelligence is a Lie

The article argues that while superintelligence may not be a total myth, current AI technologies are far from achieving it, questioning t...

Reddit - Artificial Intelligence · 1 min
How the FCC stopped CBS and Stephen Colbert | The Verge
AI Safety

The Vergecast discusses the FCC's chilling impact on late-night TV, focusing on CBS's decision to prevent Stephen Colbert from airing an ...

The Verge - AI · 5 min
Money no longer matters to AI’s top talent | The Verge
AI Startups

The article discusses the current dynamics in the AI job market, highlighting how top talent is increasingly motivated by ideological mis...

The Verge - AI · 5 min
Machine Learning

[P] SoftDTW-CUDA for PyTorch package: fast + memory-efficient Soft Dynamic Time Warping with CUDA support

The SoftDTW-CUDA for PyTorch package offers a fast and memory-efficient implementation of Soft Dynamic Time Warping, optimized for GPU us...

Reddit - Machine Learning · 1 min
Altman and Amodei share a moment of awkwardness at India's big AI summit | TechCrunch
AI Startups

At the India AI Impact Summit, OpenAI's Sam Altman and Anthropic's Dario Amodei notably refrained from joining a unity gesture led by Pri...

TechCrunch - AI · 4 min
For open-source programs, AI coding tools are a mixed blessing | TechCrunch
Open Source AI

The article discusses the dual impact of AI coding tools on open-source software, highlighting both the ease of feature development and t...

TechCrunch - AI · 7 min
AI Agents

Open-source benchmark EVMbench tests how well AI agents handle smart contract exploits

EVMbench is an open-source benchmark developed by OpenAI and Paradigm to evaluate AI agents' capabilities in handling smart contract secu...

Reddit - Artificial Intelligence · 1 min
The Download: Autonomous narco submarines, and virtue signaling chatbots | MIT Technology Review
Robotics

This edition of The Download covers advancements in autonomous narco submarines, ethical concerns surrounding AI chatbots, and the evolvi...

MIT Technology Review · 7 min
[2602.10531] From Collapse to Improvement: Statistical Perspectives on the Evolutionary Dynamics of Iterative Training on Contaminated Sources
Machine Learning

This paper explores the dynamics of iterative training on contaminated data sources, demonstrating that model performance can improve des...

arXiv - Machine Learning · 4 min
[2512.03310] Randomized Masked Finetuning: An Efficient Way to Mitigate Memorization of PIIs in LLMs
LLMs

The paper introduces Randomized Masked Finetuning (RMFT), a technique designed to reduce the memorization of personally identifiable info...

arXiv - Machine Learning · 3 min
[2602.12281] Scaling Verification Can Be More Effective than Scaling Policy Learning for Vision-Language-Action Alignment
Machine Learning

This paper explores the effectiveness of test-time verification over policy learning in enhancing Vision-Language-Action (VLA) alignment,...

arXiv - AI · 4 min
[2602.07680] Vision and Language: Novel Representations and Artificial intelligence for Driving Scene Safety Assessment and Autonomous Vehicle Planning
LLMs

This paper explores the integration of vision-language models in autonomous driving, focusing on safety assessment and decision-making th...

arXiv - Machine Learning · 4 min
[2602.05023] Do Vision-Language Models Respect Contextual Integrity in Location Disclosure?
LLMs

This article examines whether vision-language models (VLMs) respect contextual integrity when disclosing location information, highlighti...

arXiv - AI · 4 min
[2412.10537] VerifiableFL: Verifiable Claims for Federated Learning using Exclaves
Machine Learning

The paper presents VerifiableFL, a system for federated learning that ensures verifiable claims about model training using exclaves, enha...

arXiv - Machine Learning · 4 min
[2512.00036] Refined Bayesian Optimization for Efficient Beam Alignment in Intelligent Indoor Wireless Environments
Machine Learning

This article presents a refined Bayesian optimization framework for efficient beam alignment in intelligent indoor wireless environments,...

arXiv - AI · 4 min
[2511.04694] Reasoning Up the Instruction Ladder for Controllable Language Models
LLMs

This paper explores the importance of instruction hierarchy in large language models (LLMs) for enhancing their controllability and relia...

arXiv - AI · 4 min
[2602.10956] Stochastic Parroting in Temporal Attention -- Regulating the Diagonal Sink
Machine Learning

The paper explores the challenges of spatio-temporal models in machine learning, focusing on biases in temporal attention mechanisms and ...

arXiv - Machine Learning · 3 min
[2602.10067] Features as Rewards: Scalable Supervision for Open-Ended Tasks via Interpretability
LLMs

The paper introduces a novel approach to using features as rewards in reinforcement learning for open-ended tasks, focusing on reducing h...

arXiv - Machine Learning · 4 min
[2602.05139] Adaptive Exploration for Latent-State Bandits
Machine Learning

The paper presents adaptive exploration strategies for latent-state bandits, addressing challenges in reward estimation and action select...

arXiv - Machine Learning · 3 min
[2509.19680] PolicyPad: Collaborative Prototyping of LLM Policies
LLMs

The article presents PolicyPad, an interactive system designed for collaborative prototyping of policies governing large language models ...

arXiv - AI · 3 min