[2603.20396] Compression is all you need: Modeling Mathematics
About this article
Abstract page for arXiv paper 2603.20396: Compression is all you need: Modeling Mathematics
Computer Science > Artificial Intelligence arXiv:2603.20396 (cs) [Submitted on 20 Mar 2026] Title:Compression is all you need: Modeling Mathematics Authors:Vitaly Aksenov, Eve Bodnia, Michael H. Freedman, Michael Mulligan View a PDF of the paper titled Compression is all you need: Modeling Mathematics, by Vitaly Aksenov and 3 other authors View PDF HTML (experimental) Abstract:Human mathematics (HM), the mathematics humans discover and value, is a vanishingly small subset of formal mathematics (FM), the totality of all valid deductions. We argue that HM is distinguished by its compressibility through hierarchically nested definitions, lemmas, and theorems. We model this with monoids. A mathematical deduction is a string of primitive symbols; a definition or theorem is a named substring or macro whose use compresses the string. In the free abelian monoid $A_n$, a logarithmically sparse macro set achieves exponential expansion of expressivity. In the free non-abelian monoid $F_n$, even a polynomially-dense macro set only yields linear expansion; superlinear expansion requires near-maximal density. We test these models against MathLib, a large Lean~4 library of mathematics that we take as a proxy for HM. Each element has a depth (layers of definitional nesting), a wrapped length (tokens in its definition), and an unwrapped length (primitive symbols after fully expanding all references). We find unwrapped length grows exponentially with both depth and wrapped length; wrapped l...