AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

AI Safety

NHS staff resist using Palantir software. Staff reportedly cite ethics concerns, privacy worries, and doubt the platform adds much

submitted by /u/esporx

Reddit - Artificial Intelligence · 1 min · about 19 hours ago

Machine Learning

AI assistants are optimized to seem helpful. That is not the same thing as being helpful.

RLHF trains models on human feedback. Humans rate responses they like. And it turns out humans consistently rate confident, fluent, agree...

Reddit - Artificial Intelligence · 1 min · about 22 hours ago

Computer Vision

House Democrat Questions Anthropic on AI Safety After Source Code Leak

Rep. Josh Gottheimer, who is generally tough on China, just sent a letter to Anthropic questioning their decision to reduce certain safet...

Reddit - Artificial Intelligence · 1 min · about 22 hours ago

All Content

AI Safety

8 ways new AI rules will change how you bank in the UAE

The article outlines new AI regulations by the Central Bank of the UAE aimed at ensuring consumer protection, transparency, and responsib...

AI News - General · 4 min · about 1 month ago

LLMs

[D] New Research Discord - Computational Psycholinguistics

A Reddit user is forming a research-focused Discord community for those interested in computational psycholinguistics, aiming to facilita...

Reddit - Machine Learning · 1 min · about 1 month ago

AI Startups

With AI, investor loyalty is (almost) dead: At least a dozen OpenAI VCs now also back Anthropic | TechCrunch

The article discusses the decline of investor loyalty in the AI sector, highlighting how several VCs are backing both OpenAI and Anthropi...

TechCrunch - AI · 6 min · about 1 month ago

Machine Learning

The left is missing out on AI | Transformer News

The article discusses how the political left has largely overlooked the implications of artificial intelligence, despite its societal sig...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

AI Infrastructure

AI has used a total of 670 billion litres of water

The article discusses the staggering water consumption of AI, totaling 670 billion liters, which surpasses the volume of Sydney Harbour, ...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

LLMs

[R] Concept Influence: Training Data Attribution via Interpretability (Same performance and 20× faster than influence functions)

The article discusses a novel approach to training data attribution in machine learning, utilizing interpretable vectors for faster and m...

Reddit - Machine Learning · 1 min · about 1 month ago

AI Agents

Uncanny Valley: AI Researchers’ Resignations, Bots Hiring Humans, Evie Magazine’s Party | WIRED

This episode of 'Uncanny Valley' discusses AI researchers resigning over safety concerns, the controversial Rent-A-Human service hiring h...

Wired - AI · 31 min · about 1 month ago

LLMs

Anthropic accuses DeepSeek and other Chinese firms of using Claude to train their AI | The Verge

Anthropic accuses DeepSeek and other Chinese firms of misusing its Claude AI model to enhance their own products through illicit distilla...

The Verge - AI · 4 min · about 1 month ago

LLMs

Claude Code creator says these are the 3 principles he shares with every member of his team

The creator of Claude Code outlines three key principles that guide his team, emphasizing the importance of collaboration, innovation, an...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

LLMs

Guide Labs debuts a new kind of interpretable LLM | TechCrunch

Guide Labs introduces Steerling-8B, an open-sourced interpretable LLM designed to enhance understanding of AI model outputs by tracing to...

TechCrunch - AI · 6 min · about 1 month ago

LLMs

[D] Is the move toward Energy-Based Models for reasoning a viable exit from the "hallucination" trap of LLMs?

The article discusses the debate between Yann LeCun and Demis Hassabis regarding the limitations of large language models (LLMs) and the ...

Reddit - Machine Learning · 1 min · about 1 month ago

Robotics

The human work behind humanoid robots is being hidden | MIT Technology Review

The article discusses the hidden human labor behind humanoid robots, highlighting how this lack of transparency leads to misconceptions a...

MIT Technology Review - AI · 5 min · about 1 month ago

AI Safety

If Big Tech cared about fighting AI slop, we wouldn’t be drowning in it | The Verge

The Verge critiques Big Tech's inadequate efforts in combating AI-generated misinformation, highlighting the shortcomings of the C2PA sys...

The Verge - AI · 13 min · about 1 month ago

LLMs

Defense Secretary summons Anthropic’s Amodei over military use of Claude | TechCrunch

Defense Secretary Pete Hegseth has summoned Anthropic CEO Dario Amodei to discuss the military's use of Claude, amid threats of designati...

TechCrunch - AI · 3 min · about 1 month ago

AI Agents

How AI agents could destroy the economy | TechCrunch

Citrini Research's report envisions a future where AI agents lead to mass unemployment and significant economic decline, highlighting a n...

TechCrunch - AI · 3 min · about 1 month ago

AI Safety

The Download: Chicago's surveillance network, and building better bras | MIT Technology Review

This edition of The Download covers Chicago's extensive surveillance network and the evolving field of breast biomechanics, highlighting ...

MIT Technology Review - AI · 6 min · about 1 month ago

AI Safety

UAE Central Bank issues new rules on AI use to protect banking customers

The UAE Central Bank has introduced new guidelines to ensure the responsible use of AI in the financial sector, enhancing consumer protec...

AI News - General · 4 min · about 1 month ago

NLP

Inside Chicago’s surveillance panopticon | MIT Technology Review

The article explores Chicago's extensive surveillance system, highlighting its implications for public safety and civil liberties, partic...

MIT Technology Review - AI · 20 min · about 1 month ago

LLMs

AI Agent Security Without Content Filtering, A Different Architecture

The article discusses Sentinel Gateway, a middleware platform designed to enhance AI agent security by cryptographically separating instr...

Reddit - Artificial Intelligence · 1 min · about 1 month ago

AI Safety

Listen: Are we serious about regulating AI?

The article discusses the challenges of regulating AI, focusing on the EU's efforts and the limitations of self-regulation in addressing ...

AI Tools & Products · 4 min · about 1 month ago
