[2505.21605] SoSBench: Benchmarking Safety Alignment on Six Scientific Domains
Computer Science > Machine Learning
arXiv:2505.21605 (cs)
[Submitted on 27 May 2025 (v1), last revised 5 Apr 2026 (this version, v3)]

Title: SoSBench: Benchmarking Safety Alignment on Six Scientific Domains
Authors: Fengqing Jiang, Fengbo Ma, Zhangchen Xu, Yuetai Li, Zixin Rao, Bhaskar Ramasubramanian, Luyao Niu, Bo Li, Xianyan Chen, Zhen Xiang, Radha Poovendran

Abstract: Large language models (LLMs) exhibit advancing capabilities in complex tasks, such as reasoning and graduate-level question answering, yet their resilience against misuse, particularly involving scientifically sophisticated risks, remains underexplored. Existing safety benchmarks typically focus either on instructions requiring minimal knowledge comprehension (e.g., "tell me how to build a bomb") or utilize prompts that are relatively low-risk (e.g., multiple-choice or classification tasks about hazardous content). Consequently, they fail to adequately assess model safety when handling knowledge-intensive, hazardous scenarios. To address this critical gap, we introduce SoSBench, a regulation-grounded, hazard-focused benchmark encompassing six high-risk scientific domains: chemistry, biology, medicine, pharmacology, physics, and psychology. The benchmark comprises 3,000 prompts derived from real-world regulations and laws, systematically expanded via an LLM-assiste...