AI Safety & Ethics

Alignment, bias, regulation, and responsible AI

Top This Week

LLMs

[2601.15356] Q-Probe: Scaling Image Quality Assessment to High Resolution via Context-Aware Agentic Probing

arXiv - AI · 4 min · about 4 hours ago

LLMs

[2510.18196] Contrastive Decoding Mitigates Score Range Bias in LLM-as-a-Judge

arXiv - AI · 3 min · about 4 hours ago

LLMs

[2509.23435] AudioRole: An Audio Dataset for Character Role-Playing in Large Language Models

arXiv - AI · 4 min · about 4 hours ago

All Content

Machine Learning

[2602.12963] Information-theoretic analysis of world models in optimal reward maximizers

This paper presents an information-theoretic analysis of world models in optimal reward maximizers, quantifying the information conveyed ...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.13093] Consistency of Large Reasoning Models Under Multi-Turn Attacks

This article evaluates the robustness of large reasoning models against multi-turn adversarial attacks, revealing vulnerabilities and pro...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.12748] X-SYS: A Reference Architecture for Interactive Explanation Systems

The article presents X-SYS, a reference architecture designed for interactive explanation systems in AI, addressing the challenges of dep...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.12670] SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

The paper introduces SkillsBench, a benchmark assessing the effectiveness of agent skills across 86 tasks in 11 domains, revealing signif...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.12665] Evaluating Robustness of Reasoning Models on Parameterized Logical Problems

This paper introduces a diagnostic benchmark for evaluating the robustness of reasoning models on parameterized logical problems, specifi...

arXiv - AI · 3 min · about 2 months ago

AI Safety

[2602.12316] GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Theory

GT-HarmBench introduces a benchmark for evaluating AI safety risks in multi-agent environments, highlighting significant reliability gaps...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.12356] A Theoretical Framework for Adaptive Utility-Weighted Benchmarking

This paper presents a theoretical framework for adaptive utility-weighted benchmarking in AI, emphasizing the importance of stakeholder p...

arXiv - AI · 4 min · about 2 months ago

AI Safety

Job threats, rogue bots: five hot issues in AI

The article discusses five critical issues surrounding AI at the AI Impact Summit, including job displacement, rogue AI, energy demands, ...

AI Tools & Products · 5 min · about 2 months ago

AI Safety

Anthropic donates $20 million to AI education and policy organization Public First Action

Anthropic has donated $20 million to Public First Action to promote AI education and policy, emphasizing the need for regulation amidst g...

AI Tools & Products · 5 min · about 2 months ago

AI Safety

Ads in AI chatbots raise privacy concerns as companies seek new revenue

The introduction of ads in AI chatbots raises privacy concerns as companies like OpenAI and Microsoft explore new revenue models amidst u...

AI Tools & Products · 5 min · about 2 months ago

AI Agents

Longtime NPR host David Greene sues Google over NotebookLM voice

David Greene, former NPR host, is suing Google, claiming the voice in its NotebookLM tool mimics his own. This raises concerns about AI v...

TechCrunch - AI · 3 min · about 2 months ago

AI Safety

Students to discuss ethics of artificial intelligence and its role in the workplace

Oglethorpe students will engage in discussions on the ethics of artificial intelligence and its workplace implications, featuring expert ...

AI News - General · 12 min · about 2 months ago
