NHS staff resist using Palantir software. Staff reportedly cite ethics concerns, privacy worries, and doubts that the platform adds much
submitted by /u/esporx
Alignment, bias, regulation, and responsible AI
RLHF trains models on human feedback. Humans rate responses they like. And it turns out humans consistently rate confident, fluent, agree...
Rep. Josh Gottheimer, who is generally tough on China, just sent a letter to Anthropic questioning their decision to reduce certain safet...
This paper explores the memorization phenomena in diffusion models, introducing a geometric framework that identifies risk levels across ...
This paper presents a novel framework for multi-material, multi-physics topology optimization using physics-informed Gaussian processes, ...
This paper presents a framework for ensuring adversarial robustness in in-context learning (ICL) for large language models, addressing th...
This paper presents a framework for certified learning under distribution shifts, focusing on sound verification and identifiable structu...
This article explores the effectiveness of identifying 'safety regions' in large language models (LLMs) by evaluating various methods acr...
The paper presents EXACT, a novel approach for decoding-time personalization in large language models, enhancing user alignment through i...
The paper introduces 'agentic unlearning,' a novel approach to remove sensitive information from both model parameters and memory in AI a...
This paper discusses reducing text bias in synthetically generated multiple-choice question answering (MCQA) for Vision Language Models (...
Anthropic's new AI tool, Claude Code Security, identifies hidden software vulnerabilities, causing significant market shifts in the cyber...
A lawsuit has been filed against OpenAI by AI injury attorneys, claiming that ChatGPT caused severe mental health issues, including psych...
Over 30 cities have terminated contracts with Flock Safety, an AI surveillance company, amid rising concerns over privacy and federal ove...
The article explores the environmental impact of AI chatbots, focusing on their water consumption during operation. It presents varying e...
The article discusses the engineering of a deterministic kill-switch for autonomous agents, emphasizing the importance of safety mechanis...
The author seeks endorsement for their arXiv paper on mechanistic interpretability, focusing on the geometric structure of residual updat...
The EU AI Act establishes the world's first comprehensive framework for regulating artificial intelligence, focusing on safety, transpare...
India is pursuing AI autonomy through a unique three-pillar strategy focused on democratizing AI, public-sector applications, and global ...
The article discusses the environmental impact of generative AI, highlighting its significant electricity and water consumption, and the ...
The Reddit discussion explores concerns about AI potentially replacing jobs in the future, prompting varied opinions on the impact of AI ...
The article discusses the ethical considerations of AI in healthcare, emphasizing the need for responsible implementation to meet patient...
The article discusses a defense company's development of AI agents designed for military applications, raising ethical concerns about aut...