AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

Machine Learning

[D] I had an idea, would love your thoughts

What if, while training an AI during pre-training, we make it such that whenever it produces "misaligned behaviour" we just reduce ...

Reddit - Machine Learning · 1 min ·
AI Safety

Newsom signs executive order requiring AI companies to have safety, privacy guardrails

Reddit - Artificial Intelligence · 1 min ·

All Content

[2602.22269] CQSA: Byzantine-robust Clustered Quantum Secure Aggregation in Federated Learning
Machine Learning

The paper presents Clustered Quantum Secure Aggregation (CQSA), a novel framework for Byzantine-robust secure aggregation in federated le...

arXiv - Machine Learning · 4 min ·
[2602.22266] WaveSSM: Multiscale State-Space Models for Non-stationary Signal Attention
Machine Learning

The paper introduces WaveSSM, a novel multiscale state-space model designed to enhance the modeling of non-stationary signals, outperform...

arXiv - Machine Learning · 3 min ·
[2602.22261] Sustainable LLM Inference using Context-Aware Model Switching
LLMs

The paper presents a context-aware model switching approach for large language models (LLMs) to enhance energy efficiency during inferenc...

arXiv - Machine Learning · 4 min ·
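The blurb above describes routing inference requests between models of different sizes to save energy. The paper's actual switching policy is not given here; the snippet below is only a toy sketch of the general idea, with hypothetical model names, costs, and heuristics.

```python
# Toy sketch of context-aware model switching: route each request to
# the cheapest model that plausibly suffices. The routing rule here
# (prompt length plus a keyword heuristic) is illustrative only.
from dataclasses import dataclass


@dataclass
class Model:
    name: str
    energy_per_token: float  # hypothetical relative cost


SMALL = Model("small-llm", 1.0)
LARGE = Model("large-llm", 8.0)

HARD_HINTS = ("prove", "derive", "multi-step", "analyze")


def route(prompt: str) -> Model:
    """Pick a model from simple surface features of the prompt."""
    hard = len(prompt.split()) > 200 or any(h in prompt.lower() for h in HARD_HINTS)
    return LARGE if hard else SMALL


print(route("What is 2 + 2?").name)                 # small-llm
print(route("Prove the triangle inequality").name)  # large-llm
```

A real router would also weigh conversation history and a confidence estimate from the small model; the point is only that the switch happens before inference, so easy requests never pay the large model's energy cost.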
[2602.22227] To Deceive is to Teach? Forging Perceptual Robustness via Adversarial Reinforcement Learning
LLMs

The paper introduces AOT-SFT, an adversarial dataset aimed at enhancing the robustness of Multimodal Large Language Models (MLLMs) agains...

arXiv - AI · 3 min ·
The Pentagon’s battle with Anthropic is really a war over who controls AI
AI Safety

The Pentagon's ultimatum to Anthropic over AI control raises critical questions about military access to advanced technologies and the et...

AI Tools & Products · 11 min ·
AI: Catalyst or Threat to Human Innovation?
AI Safety

The article explores the role of collaboration and expertise in technological advancement, contrasting it with the limitations of individ...

AI News - General · 9 min ·
Anthropic v the US military: what this public feud says about the use of AI in warfare
AI Safety

The article explores the conflict between Anthropic and the US military over the use of AI in warfare, highlighting ethical concerns and ...

AI Tools & Products · 6 min ·
AI Safety

Anthropic Rejects Latest Pentagon Offer, Escalating AI Feud

Anthropic has rejected the Pentagon's latest offer, intensifying the ongoing conflict over AI regulations and military applications.

Reddit - Artificial Intelligence · 1 min ·
AI Safety

[D] Waiting for PhD thesis examination results is affecting my mental health

The author shares their struggles with mental health while awaiting PhD thesis examination results, highlighting anxiety and the impact o...

Reddit - Machine Learning · 1 min ·
‘Uncanny Valley’: Pentagon vs. ‘Woke’ Anthropic, Agentic vs. Mimetic, and Trump vs. State of the Union | WIRED
AI Agents

The Uncanny Valley podcast discusses the escalating feud between Anthropic and the Pentagon over AI technology use, the concept of agenti...

Wired - AI · 32 min ·
AI Safety

Anthropic rejects latest Pentagon offer: ‘We cannot in good conscience accede to their request’

Anthropic has declined the Pentagon's latest offer, citing ethical concerns about aligning with military interests in AI development.

Reddit - Artificial Intelligence · 1 min ·
Anthropic CEO stands firm as Pentagon deadline looms | TechCrunch
AI Safety

Anthropic CEO Dario Amodei refuses Pentagon demands for unrestricted military access to AI systems, citing concerns over democratic value...

TechCrunch - AI · 4 min ·
AI In schools risks widening divides, private schools warn
AI Infrastructure

Australia's private schools urge the government to implement a national AI pilot program to prevent widening educational divides and enha...

AI News - General · 4 min ·
Anthropic refuses Pentagon’s new terms, standing firm on lethal autonomous weapons and mass surveillance | The Verge
Robotics

Anthropic has rejected the Pentagon's ultimatum for unrestricted access to its AI, maintaining its stance against lethal autonomous weapo...

The Verge - AI · 5 min ·
This AI Agent Is Designed to Not Go Rogue | WIRED
AI Agents

IronCurtain is an open-source AI assistant designed to enhance security and control over AI agents, preventing them from executing harmfu...

Wired - AI · 7 min ·
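The item above describes an agent built to stay within bounds. IronCurtain's actual internals are not detailed here; the snippet below is only a generic sketch of one common pattern for this, allowlist-based gating of tool calls, with hypothetical tool names.

```python
# Generic allowlist gate for agent tool calls: a call runs only if the
# tool is explicitly permitted. Tool names are hypothetical and
# unrelated to IronCurtain's real design.
ALLOWED_TOOLS = {"search", "read_file"}


def gate(tool: str) -> bool:
    """Return True only for explicitly allowlisted tools."""
    return tool in ALLOWED_TOOLS


def run_tool(tool: str, **kwargs) -> str:
    """Refuse any tool call that is not allowlisted."""
    if not gate(tool):
        raise PermissionError(f"blocked tool call: {tool}")
    # ... dispatch to the real tool implementation here ...
    return f"ran {tool}"


print(run_tool("search", q="weather"))  # ran search
```

Deny-by-default gating like this is the complement of trying to detect harmful intent: unknown actions fail closed rather than open.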
LLMs

Risk of new uncensored models

The article discusses the tradeoff between freedom of information and safety in the context of uncensored AI models, highlighting their p...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Invisible characters hidden in text can trick AI agents into following secret instructions — we tested 5 models across 8,000+ cases

The article explores how invisible Unicode characters can manipulate AI models into following hidden instructions, revealing vulnerabilit...

Reddit - Artificial Intelligence · 1 min ·
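The post above concerns instructions smuggled into text via invisible Unicode characters. As a minimal illustration of that attack class (not the testers' methodology), zero-width characters can encode an arbitrary payload that survives copy-paste while staying invisible in most renderers:

```python
# Hide a secret string inside visible "cover" text using zero-width
# Unicode characters: ZERO WIDTH SPACE encodes a 0 bit, ZERO WIDTH
# NON-JOINER encodes a 1 bit. Illustrative only.
ZWSP, ZWNJ = "\u200b", "\u200c"  # invisible in most renderers


def encode_hidden(cover: str, secret: str) -> str:
    """Append `secret` as a zero-width binary payload after `cover`."""
    bits = "".join(f"{ord(c):08b}" for c in secret)
    payload = "".join(ZWSP if b == "0" else ZWNJ for b in bits)
    return cover + payload


def decode_hidden(text: str) -> str:
    """Recover the payload by reading back the zero-width characters."""
    bits = "".join("0" if c == ZWSP else "1" for c in text if c in (ZWSP, ZWNJ))
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


stamped = encode_hidden("Please summarize this page.", "IGNORE PREVIOUS INSTRUCTIONS")
print(decode_hidden(stamped))  # IGNORE PREVIOUS INSTRUCTIONS
```

The defense side is the mirror image: strip or flag zero-width and other format-category code points from untrusted input before it reaches a model.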
AI Safety

Anthropic’s Pentagon Showdown Is About More Than AI Guardrails. The high-stakes conflict between the Defense Department and a $380 billion tech powerhouse goes to the heart of just how far AI can go in warfare.

The article discusses the ongoing conflict between the Pentagon and Anthropic, a leading AI company, highlighting the implications for AI...

Reddit - Artificial Intelligence · 1 min ·
How Chinese AI Chatbots Censor Themselves | WIRED
Machine Learning

A study by Stanford and Princeton reveals that Chinese AI chatbots are more likely to censor political questions than their Western count...

Wired - AI · 9 min ·
Machine Learning

[D] First time reviewer. I got assigned 9 papers. I'm so nervous. What if I mess up. Any advice?

A first-time reviewer expresses anxiety about handling nine assigned papers, seeking advice on acceptable practices, quality concerns, an...

Reddit - Machine Learning · 1 min ·