AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

Machine Learning

[D] I had an idea, would love your thoughts

What if, while training an AI during pre-training, we make it such that whenever it produces "misaligned behaviour" we just reduce ...

Reddit - Machine Learning · 1 min ·
AI Safety

Newsom signs executive order requiring AI companies to have safety, privacy guardrails

Reddit - Artificial Intelligence · 1 min ·

All Content

[2602.22269] CQSA: Byzantine-robust Clustered Quantum Secure Aggregation in Federated Learning
Machine Learning

The paper presents Clustered Quantum Secure Aggregation (CQSA), a novel framework for Byzantine-robust secure aggregation in federated le...

arXiv - Machine Learning · 4 min ·
[2602.22266] WaveSSM: Multiscale State-Space Models for Non-stationary Signal Attention
Machine Learning

The paper introduces WaveSSM, a novel multiscale state-space model designed to enhance the modeling of non-stationary signals, outperform...

arXiv - Machine Learning · 3 min ·
[2602.22261] Sustainable LLM Inference using Context-Aware Model Switching
LLMs

The paper presents a context-aware model switching approach for large language models (LLMs) to enhance energy efficiency during inferenc...

arXiv - Machine Learning · 4 min ·
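The blurb above describes routing inference requests between models of different sizes to save energy. The paper's actual switching policy is not given here; the snippet below is only a toy sketch of the general idea, with hypothetical model names, costs, and heuristics.

```python
# Toy sketch of context-aware model switching: route each request to
# the cheapest model that plausibly suffices. The routing rule here
# (prompt length plus a keyword heuristic) is illustrative only.
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    energy_per_token: float  # hypothetical relative cost


SMALL = Model("small-llm", 1.0)
LARGE = Model("large-llm", 8.0)

HARD_HINTS = ("prove", "derive", "multi-step", "analyze")


def route(prompt: str) -> Model:
    """Pick a model from simple surface features of the prompt."""
    hard = len(prompt.split()) > 200 or any(h in prompt.lower() for h in HARD_HINTS)
    return LARGE if hard else SMALL


print(route("What is 2 + 2?").name)                 # small-llm
print(route("Prove the triangle inequality").name)  # large-llm
```

A real router would also weigh conversation history and a confidence estimate from the small model; the point is only that the switch happens before inference, so easy requests never pay the large model's energy cost.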
[2602.22227] To Deceive is to Teach? Forging Perceptual Robustness via Adversarial Reinforcement Learning
LLMs

The paper introduces AOT-SFT, an adversarial dataset aimed at enhancing the robustness of Multimodal Large Language Models (MLLMs) agains...

arXiv - AI · 3 min ·
The Pentagon’s battle with Anthropic is really a war over who controls AI
AI Safety

The Pentagon's ultimatum to Anthropic over AI control raises critical questions about military access to advanced technologies and the et...

AI Tools & Products · 11 min ·
AI: Catalyst or Threat to Human Innovation?
AI Safety

The article explores the role of collaboration and expertise in technological advancement, contrasting it with the limitations of individ...

AI News - General · 9 min ·
Anthropic v the US military: what this public feud says about the use of AI in warfare
AI Safety

The article explores the conflict between Anthropic and the US military over the use of AI in warfare, highlighting ethical concerns and ...

AI Tools & Products · 6 min ·
AI Safety

Anthropic Rejects Latest Pentagon Offer, Escalating AI Feud

Anthropic has rejected the Pentagon's latest offer, intensifying the ongoing conflict over AI regulations and military applications.

Reddit - Artificial Intelligence · 1 min ·
AI Safety

[D] Waiting for PhD thesis examination results is affecting my mental health

The author shares their struggles with mental health while awaiting PhD thesis examination results, highlighting anxiety and the impact o...

Reddit - Machine Learning · 1 min ·
‘Uncanny Valley’: Pentagon vs. ‘Woke’ Anthropic, Agentic vs. Mimetic, and Trump vs. State of the Union | WIRED
AI Agents

The Uncanny Valley podcast discusses the escalating feud between Anthropic and the Pentagon over AI technology use, the concept of agenti...

Wired - AI · 32 min ·
AI Safety

Anthropic rejects latest Pentagon offer: ‘We cannot in good conscience accede to their request’

Anthropic has declined the Pentagon's latest offer, citing ethical concerns about aligning with military interests in AI development.

Reddit - Artificial Intelligence · 1 min ·
Anthropic CEO stands firm as Pentagon deadline looms | TechCrunch
AI Safety

Anthropic CEO Dario Amodei refuses Pentagon demands for unrestricted military access to AI systems, citing concerns over democratic value...

TechCrunch - AI · 4 min ·
AI In schools risks widening divides, private schools warn
AI Infrastructure

Australia's private schools urge the government to implement a national AI pilot program to prevent widening educational divides and enha...

AI News - General · 4 min ·
Anthropic refuses Pentagon’s new terms, standing firm on lethal autonomous weapons and mass surveillance | The Verge
Robotics

Anthropic has rejected the Pentagon's ultimatum for unrestricted access to its AI, maintaining its stance against lethal autonomous weapo...

The Verge - AI · 5 min ·
This AI Agent Is Designed to Not Go Rogue | WIRED
AI Agents

IronCurtain is an open-source AI assistant designed to enhance security and control over AI agents, preventing them from executing harmfu...

Wired - AI · 7 min ·
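The item above describes an agent built to stay within bounds. IronCurtain's actual internals are not detailed here; the snippet below is only a generic sketch of one common pattern for this, allowlist-based gating of tool calls, with hypothetical tool names.

```python
# Generic allowlist gate for agent tool calls: a call runs only if the
# tool is explicitly permitted. Tool names are hypothetical and
# unrelated to IronCurtain's real design.
ALLOWED_TOOLS = {"search", "read_file"}


def gate(tool: str) -> bool:
    """Return True only for explicitly allowlisted tools."""
    return tool in ALLOWED_TOOLS


def run_tool(tool: str, **kwargs) -> str:
    """Refuse any tool call that is not allowlisted."""
    if not gate(tool):
        raise PermissionError(f"blocked tool call: {tool}")
    # ... dispatch to the real tool implementation here ...
    return f"ran {tool}"


print(run_tool("search", q="weather"))  # ran search
```

Deny-by-default gating like this is the complement of trying to detect harmful intent: unknown actions fail closed rather than open.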
LLMs

Risk of new uncensored models

The article discusses the tradeoff between freedom of information and safety in the context of uncensored AI models, highlighting their p...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Invisible characters hidden in text can trick AI agents into following secret instructions — we tested 5 models across 8,000+ cases

The article explores how invisible Unicode characters can manipulate AI models into following hidden instructions, revealing vulnerabilit...

Reddit - Artificial Intelligence · 1 min ·
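The post above concerns instructions smuggled into text via invisible Unicode characters. As a minimal illustration of that attack class (not the testers' methodology), zero-width characters can encode an arbitrary payload that survives copy-paste while staying invisible in most renderers:

```python
# Hide a secret string inside visible "cover" text using zero-width
# Unicode characters: ZERO WIDTH SPACE encodes a 0 bit, ZERO WIDTH
# NON-JOINER encodes a 1 bit. Illustrative only.
ZWSP, ZWNJ = "\u200b", "\u200c"  # invisible in most renderers


def encode_hidden(cover: str, secret: str) -> str:
    """Append `secret` as a zero-width binary payload after `cover`."""
    bits = "".join(f"{ord(c):08b}" for c in secret)
    payload = "".join(ZWSP if b == "0" else ZWNJ for b in bits)
    return cover + payload


def decode_hidden(text: str) -> str:
    """Recover the payload by reading back the zero-width characters."""
    bits = "".join("0" if c == ZWSP else "1" for c in text if c in (ZWSP, ZWNJ))
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


stamped = encode_hidden("Please summarize this page.", "IGNORE PREVIOUS INSTRUCTIONS")
print(decode_hidden(stamped))  # IGNORE PREVIOUS INSTRUCTIONS
```

The defense side is the mirror image: strip or flag zero-width and other format-category code points from untrusted input before it reaches a model.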
AI Safety

Anthropic’s Pentagon Showdown Is About More Than AI Guardrails. The high-stakes conflict between the Defense Department and a $380 billion tech powerhouse goes to the heart of just how far AI can go in warfare.

The article discusses the ongoing conflict between the Pentagon and Anthropic, a leading AI company, highlighting the implications for AI...

Reddit - Artificial Intelligence · 1 min ·
How Chinese AI Chatbots Censor Themselves | WIRED
Machine Learning

A study by Stanford and Princeton reveals that Chinese AI chatbots are more likely to censor political questions than their Western count...

Wired - AI · 9 min ·
Machine Learning

[D] First time reviewer. I got assigned 9 papers. I'm so nervous. What if I mess up. Any advice?

A first-time reviewer expresses anxiety about handling nine assigned papers, seeking advice on acceptable practices, quality concerns, an...

Reddit - Machine Learning · 1 min ·