[2601.21895] Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text
Computer Science > Computation and Language
arXiv:2601.21895 (cs)
[Submitted on 29 Jan 2026 (v1), last revised 2 Mar 2026 (this version, v2)]
Title: Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text
Authors: Hongyi Zhou, Jin Zhu, Kai Ye, Ying Yang, Erhan Xu, Chengchun Shi
Abstract: Modern large language models (LLMs) such as GPT, Claude, and Gemini have transformed the way we learn, work, and communicate. Yet their ability to produce highly human-like text raises serious concerns about misinformation and academic integrity, creating an urgent need for reliable algorithms to detect LLM-generated content. In this paper, we begin by presenting a geometric approach that demystifies rewrite-based detection algorithms, revealing their underlying rationale and demonstrating their generalization ability. Building on this insight, we introduce a novel rewrite-based detection algorithm that adaptively learns the distance between the original and rewritten text. Theoretically, we show that employing an adaptively learned distance function is more effective for detection than using a fixed distance. Empirically, we conduct extensive experiments across more than 100 settings and find that our approach outperforms baseline algorithms in the majority of scenarios. In particular, it ...
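The core rewrite-based detection idea from the abstract can be sketched in a few lines: rewrite the candidate text with an LLM, measure the distance between original and rewrite, and flag small distances as likely machine-authored (an LLM tends to reproduce LLM-written text nearly verbatim, while human text drifts more under rewriting). This is a minimal illustration only, not the paper's method: the fixed bag-of-words cosine distance and the threshold value below are placeholder assumptions standing in for the paper's adaptively learned distance.

```python
import math
from collections import Counter


def cosine_distance(a: str, b: str) -> float:
    """Bag-of-words cosine distance between two texts.

    A fixed, illustrative stand-in for the paper's learned distance function.
    """
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def detect_llm_generated(original: str, rewrite: str, threshold: float = 0.3) -> bool:
    """Flag text as LLM-generated when it sits close to its own LLM rewrite.

    `rewrite` is assumed to be produced by prompting an LLM to rephrase
    `original`; the threshold is a hypothetical value for illustration.
    """
    return cosine_distance(original, rewrite) < threshold
```

The paper's contribution is to replace the fixed distance above with one learned adaptively from data, which it shows (both theoretically and across 100+ experimental settings) is more effective than any fixed choice.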