[2602.11661] Quark Medical Alignment: A Holistic Multi-Dimensional Alignment and Collaborative Optimization Paradigm
Computer Science > Artificial Intelligence
arXiv:2602.11661 (cs)
[Submitted on 12 Feb 2026 (v1), last revised 2 Mar 2026 (this version, v2)]

Title: Quark Medical Alignment: A Holistic Multi-Dimensional Alignment and Collaborative Optimization Paradigm

Authors: Tianxiang Xu, Jiayi Liu, Yixuan Tong, Jialu Xu, Yunqing Wei, Kaiwen Feng, PanPan Hou, Kangping Yin, Jiyuan Hu, Hao Zhou, Zhenxin Ma, Jian Xu, Guanjun Jiang

Abstract: While reinforcement learning for large language model alignment has progressed rapidly in recent years, transferring these paradigms to high-stakes medical question answering reveals a fundamental paradigm mismatch. Reinforcement Learning from Human Feedback relies on preference annotations that are prohibitively expensive and often fail to reflect the absolute correctness of medical facts. Reinforcement Learning from Verifiable Rewards lacks effective automatic verifiers and struggles to handle complex clinical contexts. Meanwhile, medical alignment requires the simultaneous optimization of correctness, safety, and compliance, yet multi-objective heterogeneous reward signals are prone to scale mismatch and optimization conflicts. To address these challenges, we propose a robust medical alignment paradigm. We first construct a holistic multi-dimensional medical alignment matrix ...