[2602.14318] In Transformer We Trust? A Perspective on Transformer Architecture Failure Modes
Summary
The paper examines the trustworthiness of transformer architectures in high-stakes applications, analyzing their reliability, interpretability, and risks associated with their deployment across various domains.
Why It Matters
As transformer models become integral to critical fields such as healthcare and autonomous systems, understanding their limitations and vulnerabilities is essential for ensuring safe and effective deployment. This paper addresses significant concerns regarding their reliability and the implications for safety in various applications.
Key Takeaways
- Transformers have revolutionized multiple domains but raise open questions about trust.
- The paper reviews interpretability, robustness, and fairness of transformers.
- The authors identify structural vulnerabilities and domain-specific risks.
- They highlight the need for rigorous evaluation in safety-critical applications.
- They call for further research on open challenges in transformer reliability.
Computer Science > Machine Learning
arXiv:2602.14318 (cs) [Submitted on 15 Feb 2026]
Title: In Transformer We Trust? A Perspective on Transformer Architecture Failure Modes
Authors: Trishit Mondal, Ameya D. Jagtap
Abstract: Transformer architectures have revolutionized machine learning across a wide range of domains, from natural language processing to scientific computing. However, their growing deployment in high-stakes applications, such as computer vision, natural language processing, healthcare, autonomous systems, and critical areas of scientific computing including climate modeling, materials discovery, drug discovery, nuclear science, and robotics, necessitates a deeper and more rigorous understanding of their trustworthiness. In this work, we critically examine the foundational question: How trustworthy are transformer models? We evaluate their reliability through a comprehensive review of interpretability, explainability, robustness against adversarial attacks, fairness, and privacy. We systematically examine the trustworthiness of transformer-based models in safety-critical applications spanning natural language processing, computer vision, and science and engineering domains, including robotics, medicine, earth sciences, materials science, fluid dynamics, nuclear science, and aut...