[2603.02259] The Alignment Flywheel: A Governance-Centric Hybrid MAS for Architecture-Agnostic Safety
Computer Science > Multiagent Systems
arXiv:2603.02259 (cs)
[Submitted on 28 Feb 2026]

Title: The Alignment Flywheel: A Governance-Centric Hybrid MAS for Architecture-Agnostic Safety
Authors: Elias Malomgré, Pieter Simoens

Abstract: Multi-agent systems provide mature methodologies for role decomposition, coordination, and normative governance, capabilities that remain essential as increasingly powerful autonomous decision components are embedded within agent-based systems. While learned and generative models substantially expand system capability, their safety behavior is often entangled with training, making it opaque, difficult to audit, and costly to update after deployment. This paper formalizes the Alignment Flywheel as a governance-centric hybrid MAS architecture that decouples decision generation from safety governance. A Proposer, representing any autonomous decision component, generates candidate trajectories, while a Safety Oracle returns raw safety signals through a stable interface. An enforcement layer applies explicit risk policy at runtime, and a governance MAS supervises the Oracle through auditing, uncertainty-driven verification, and versioned refinement. The central engineering principle is patch locality: many newly observed safety failures can be mitigated by updating ...
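
The separation the abstract describes between a Proposer, a Safety Oracle, and an enforcement layer can be illustrated with a minimal Python sketch. This is not the paper's implementation. Every name here (`SafetySignal`, `RiskPolicy`, `propose`, `oracle_v1`, `enforce`), the shape of a trajectory, the two signal fields, and all numeric values are assumptions chosen for illustration. The sketch shows only one idea: the Oracle returns raw signals, and a separate, explicit policy decides what to do with them at runtime.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical representation: a trajectory is a sequence of action names.
Trajectory = Sequence[str]


@dataclass(frozen=True)
class SafetySignal:
    """Raw output of the oracle (fields are illustrative assumptions)."""
    risk: float         # assumed risk score in [0, 1]
    uncertainty: float  # assumed oracle uncertainty in [0, 1]


@dataclass(frozen=True)
class RiskPolicy:
    """Explicit runtime policy, kept separate from the oracle."""
    max_risk: float
    max_uncertainty: float


def propose(state: str) -> List[Trajectory]:
    """Stand-in Proposer: any component that returns candidate trajectories."""
    return [("accelerate", "merge"), ("slow", "merge"), ("stop",)]


def oracle_v1(trajectory: Trajectory) -> SafetySignal:
    """Stand-in Safety Oracle: reports signals and makes no decision itself."""
    risk = 0.9 if "accelerate" in trajectory else 0.2
    uncertainty = 0.6 if tuple(trajectory) == ("stop",) else 0.1
    return SafetySignal(risk=risk, uncertainty=uncertainty)


def enforce(
    candidates: List[Trajectory],
    oracle: Callable[[Trajectory], SafetySignal],
    policy: RiskPolicy,
) -> Tuple[Optional[Trajectory], List[Trajectory]]:
    """Apply the policy to oracle signals.

    Returns the first candidate the policy allows (or None) and the list of
    candidates whose uncertainty was too high and should be verified.
    """
    chosen: Optional[Trajectory] = None
    needs_verification: List[Trajectory] = []
    for candidate in candidates:
        signal = oracle(candidate)
        if signal.uncertainty > policy.max_uncertainty:
            # Too uncertain to act on; hand off for verification instead.
            needs_verification.append(candidate)
        elif signal.risk <= policy.max_risk and chosen is None:
            chosen = candidate
    return chosen, needs_verification


policy = RiskPolicy(max_risk=0.5, max_uncertainty=0.5)
chosen, flagged = enforce(propose("on_ramp"), oracle_v1, policy)
print(chosen, flagged)
```

Because `enforce` takes the oracle and the policy as arguments, either can be swapped in this sketch without touching `propose`. For example, passing `RiskPolicy(max_risk=0.1, max_uncertainty=0.5)` rejects every candidate without any change to the Proposer or the Oracle.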