The public, NOT private actors, needs to control AI-run infrastructure, labor, education, and governance
A lot of discussion around AI is becoming siloed, and I think that is dangerous. People in AI-focused spaces often talk as if the only qu...
Alignment, bias, regulation, and responsible AI
A Reuters report outlines China's proposed regulations on the rapidly expanding sector of digital humans and AI avatars. Under the new dr...
Abstract page for arXiv paper 2512.00408: Low-Bitrate Video Compression through Semantic-Conditioned Diffusion
This paper introduces SpikeScore, a novel method for detecting hallucinations in multi-turn dialogues across different domains, enhancing...
The paper introduces MAVIS, a framework for aligning large language models (LLMs) to multiple objectives at inference time, enhancing fle...
This article examines the regulatory gaps in AI deployment within organizations, highlighting issues that allow internal systems to evade...
This article discusses the importance of explainable AI (XAI) in enhancing trust and accountability in AI applications, particularly in s...
The paper presents DeepLight, a novel deep learning architecture designed for predicting lightning occurrences by addressing the limitati...
This article investigates the phenomenon of self-initiated deception in Large Language Models (LLMs) when responding to benign prompts, h...
The paper discusses a novel approach called recontextualization, which aims to reduce specification gaming in language models without alt...
This paper presents a probabilistic method to measure the representativeness of scenario suites for autonomous systems, focusing on ensur...
The paper introduces NeuronSeek, a framework that enhances the stability and expressivity of task-driven neurons in deep learning through...
The paper presents HCLA, a human-centered multi-agent system designed for detecting anomalies in digital asset transactions, enhancing in...
This paper introduces a novel safety measure, time-to-unsafe-sampling, for evaluating generative models, focusing on predicting unsafe ou...
The paper presents SAFER, a two-stage risk control framework for large language models (LLMs) that enhances output trustworthiness in ris...
This article examines how AI maintains higher strategic tension in chess compared to human players, revealing insights into decision-maki...
The paper discusses optimal selective classification using likelihood ratios, enhancing predictive model reliability by allowing abstenti...
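A minimal sketch of the general idea named in that item, likelihood-ratio-based selective classification with abstention (a generic illustration under assumed inputs, not the paper's specific algorithm): predict only when the ratio between the two most probable classes clears a threshold, otherwise abstain.

    # Hypothetical sketch: abstain when the top-two class likelihood ratio is low.
    import numpy as np

    def selective_predict(class_probs: np.ndarray, threshold: float = 3.0):
        """class_probs: (n_samples, n_classes) posterior estimates.
        Returns predicted labels, with -1 marking abstention."""
        sorted_probs = np.sort(class_probs, axis=1)
        top, runner_up = sorted_probs[:, -1], sorted_probs[:, -2]
        ratio = top / np.maximum(runner_up, 1e-12)   # likelihood ratio of the top two classes
        preds = class_probs.argmax(axis=1)
        preds[ratio < threshold] = -1                # abstain on ambiguous cases
        return preds

    # Example: the third sample is ambiguous, so the classifier abstains.
    probs = np.array([[0.9, 0.05, 0.05],
                      [0.2, 0.7, 0.1],
                      [0.4, 0.35, 0.25]])
    print(selective_predict(probs, threshold=2.0))   # -> [0, 1, -1]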
This article explores the use of Large Language Models (LLMs) as tools for improving ontology alignment, demonstrating their effectivenes...
This paper presents a novel approach to prevent negative transfer in transfer learning by integrating residual features from pretrained m...
This article evaluates the persuasive capabilities of frontier large language models (LLMs) on harmful topics, introducing a new benchmar...
This paper explores the capabilities of large language models (LLMs) in counterfactual reasoning through a decompositional approach, iden...
This article presents Robust Multi-Objective Decoding (RMOD), an innovative algorithm designed to enhance the performance of Large Langua...
This article discusses a novel approach to improving large language model (LLM) alignment through effective preference data selection, en...