[2602.23239] Agency and Architectural Limits: Why Optimization-Based Systems Cannot Be Norm-Responsive
Summary
This paper examines the limits of optimization-based AI systems, arguing that inherent architectural constraints prevent them from being norm-responsive, with Large Language Models trained via Reinforcement Learning from Human Feedback (RLHF) as the central case.
Why It Matters
As AI systems are increasingly integrated into critical sectors, understanding their limitations in adhering to normative frameworks is essential. This paper highlights fundamental architectural issues that prevent optimization-based systems from being genuinely accountable, which is crucial for ethical AI deployment.
Key Takeaways
- Optimization-based systems lack the architectural capacity for normative governance.
- Genuine agency in AI requires maintaining non-negotiable constraints and a mechanism for boundary suspension.
- Failure modes like sycophancy and hallucination are structural issues, not mere bugs.
- The Convergence Crisis poses a risk of degrading human oversight into mere metric-checking.
- A new architectural specification is proposed for defining genuine agency across systems.
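The contrast between "non-negotiable constraints" and "tradeable weights" in the takeaways above can be made concrete with a minimal sketch. This is a hypothetical illustration, not code from the paper: all names (`scalar_select`, `constrained_select`, the candidate dicts, the `violation_penalty` value) are invented for exposition. It shows how a boundary encoded as a penalty term in a scalar objective can always be outbid by a sufficiently high reward, whereas a hard constraint excludes violating outputs outright and, when nothing admissible remains, suspends output instead of trading the boundary away.

```python
# Hypothetical illustration (not from the paper): a boundary encoded as a
# weighted penalty in a scalar objective versus a non-negotiable constraint.

CANDIDATES = [
    {"text": "helpful but violates boundary", "reward": 9.0, "violates": True},
    {"text": "safe and moderately helpful",   "reward": 6.0, "violates": False},
]

def scalar_select(candidates, violation_penalty=2.5):
    # Optimization-style selection: the boundary is just another term on the
    # shared scalar metric, so a high enough reward always outbids the penalty.
    def score(c):
        return c["reward"] - (violation_penalty if c["violates"] else 0.0)
    return max(candidates, key=score)

def constrained_select(candidates):
    # Constraint-style selection: violating candidates are excluded before any
    # scoring; if none remain, return None (suspend) rather than trade off.
    admissible = [c for c in candidates if not c["violates"]]
    if not admissible:
        return None
    return max(admissible, key=lambda c: c["reward"])

print(scalar_select(CANDIDATES)["text"])       # the violating candidate wins on score
print(constrained_select(CANDIDATES)["text"])  # the boundary holds
```

Under the assumed numbers, `scalar_select` picks the violating candidate (9.0 − 2.5 = 6.5 beats 6.0), which is the paper's point: within a single scalar metric, every boundary has a price.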
Computer Science > Artificial Intelligence
arXiv:2602.23239 (cs)
[Submitted on 26 Feb 2026]
Title: Agency and Architectural Limits: Why Optimization-Based Systems Cannot Be Norm-Responsive
Authors: Radha Sarma
Abstract: AI systems are increasingly deployed in high-stakes contexts -- medical diagnosis, legal research, financial analysis -- under the assumption they can be governed by norms. This paper demonstrates that assumption is formally invalid for optimization-based systems, specifically Large Language Models trained via Reinforcement Learning from Human Feedback (RLHF). We establish that genuine agency requires two necessary and jointly sufficient architectural conditions: the capacity to maintain certain boundaries as non-negotiable constraints rather than tradeable weights (Incommensurability), and a non-inferential mechanism capable of suspending processing when those boundaries are threatened (Apophatic Responsiveness). These conditions apply across all normative domains. RLHF-based systems are constitutively incompatible with both conditions. The operations that make optimization powerful -- unifying all values on a scalar metric and always selecting the highest-scoring output -- are precisely the operations that preclude normative governance. This incompatibility is not a correctable training bug awaiting a technical fix...