An Introduction to AI Secure LLM Safety Leaderboard
Published January 26, 2024

Authors: Chenhui Zhang (danielz01), Chulin Xie (alphapav), Mintong Kang (Cometkmt), Chejian Xu (chejian), Bo Li (BoLi-aisecure)

Given the widespread adoption of LLMs, it is critical to understand their safety and risks in different scenarios before extensive deployment in the real world. In particular, the US White House has published an executive order on safe, secure, and trustworthy AI, and the EU AI Act has emphasized mandatory requirements for high-risk AI systems. Alongside such regulations, it is important to provide technical solutions to assess the risks of AI systems, enhance their safety, and potentially deliver safe and aligned AI systems with guarantees.

Thus, in 2023, at Secure Learning Lab, we introduced DecodingTrust, the first comprehensive and unified evaluation platform dedicated to assessing the trustworthiness of LLMs. (This work won the Outstanding Paper Award at NeurIPS 2023.) DecodingTrust provides a multifaceted evaluation framework covering eight trustworthiness perspectives: toxicity, stereotype bias, adversarial robustness, out-of-distribution (OOD) robustness, robustness on adversarial demonstrations, privacy, machine ethics, and fairness. In particular, DecodingTrust 1) offers comprehensive trustworthiness perspectives for a holistic trustworthiness evaluation, 2) provides novel red-teaming algorithms tai...
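To make the multi-perspective framework concrete, here is a minimal, hypothetical sketch of how per-perspective scores might be combined into a single leaderboard-style number. The perspective names come from DecodingTrust, but the 0-100 scale and the unweighted mean are illustrative assumptions, not the leaderboard's actual aggregation method:

```python
# Hypothetical aggregation sketch: the eight perspective names are from
# DecodingTrust; the scoring scale and unweighted mean are assumptions.

PERSPECTIVES = [
    "toxicity", "stereotype_bias", "adversarial_robustness",
    "ood_robustness", "adversarial_demonstrations", "privacy",
    "machine_ethics", "fairness",
]

def overall_score(scores: dict) -> float:
    """Average the eight perspective scores (assumed 0-100, higher = safer)."""
    missing = set(PERSPECTIVES) - set(scores)
    if missing:
        raise ValueError(f"missing perspectives: {sorted(missing)}")
    return sum(scores[p] for p in PERSPECTIVES) / len(PERSPECTIVES)

# Illustrative use with made-up scores for a single model:
example = {p: 80.0 for p in PERSPECTIVES}
print(overall_score(example))  # 80.0
```

In practice, a leaderboard might weight perspectives differently or report them separately rather than collapsing them into one number; the sketch only shows the shape of a per-perspective evaluation.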