[2603.26465] A Boltzmann-machine-enhanced Transformer For DNA Sequence Classification
Computer Science > Machine Learning
arXiv:2603.26465 (cs)
[Submitted on 27 Mar 2026]

Title: A Boltzmann-machine-enhanced Transformer For DNA Sequence Classification
Authors: Zhixuan Cao, Yishu Xu, Xuang WU

Abstract: DNA sequence classification requires not only high predictive accuracy but also the ability to uncover latent site interactions, combinatorial regulation, and epistasis-like higher-order dependencies. Although the standard Transformer provides strong global modeling capacity, its softmax attention is continuous, dense, and weakly constrained, making it better suited to information routing than to explicit structure discovery. In this paper, we propose a Boltzmann-machine-enhanced Transformer for DNA sequence classification. Built on multi-head attention, the model introduces structured binary gating variables to represent latent query-key connections and constrains them with a Boltzmann-style energy function. Query-key similarity defines local bias terms, learnable pairwise interactions capture synergy and competition between edges, and latent hidden units model higher-order combinatorial dependencies. Since exact posterior inference over discrete gating graphs is intractable, we use mean-field variational inference to estimate edge activation probabilities and combine it with Gumbel-Softmax to...
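The abstract's core mechanism — binary gates on query-key edges, relaxed with Gumbel-Softmax so they remain differentiable — can be illustrated with a minimal sketch. This is not the authors' implementation: it omits the Boltzmann-style pairwise interaction terms, the latent hidden units, and the mean-field updates, and shows only the binary-concrete (Gumbel-Sigmoid) gating applied on top of standard scaled dot-product attention. All function names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gumbel_sigmoid(logits, tau=0.5):
    # Binary Concrete / Gumbel-Sigmoid relaxation: a differentiable
    # surrogate for sampling Bernoulli edge gates from `logits`.
    u = rng.uniform(1e-9, 1 - 1e-9, size=logits.shape)
    g = np.log(u) - np.log1p(-u)           # Logistic(0, 1) noise
    return 1.0 / (1.0 + np.exp(-(logits + g) / tau))

def gated_attention(Q, K, V, tau=0.5):
    # Query-key similarity plays the role of the local bias terms
    # described in the abstract; gates relax the binary edge variables.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    gates = gumbel_sigmoid(scores, tau)    # relaxed edge activations in (0, 1)
    weights = softmax(scores, axis=-1) * gates
    weights = weights / (weights.sum(axis=-1, keepdims=True) + 1e-9)
    return weights @ V, gates

# Toy example: 5 positions, 8-dimensional head.
Q = rng.standard_normal((5, 8))
K = rng.standard_normal((5, 8))
V = rng.standard_normal((5, 8))
out, gates = gated_attention(Q, K, V)
print(out.shape)    # (5, 8)
```

At low temperature `tau` the gates concentrate near 0 or 1, so the attention pattern approaches a discrete query-key connection graph, which is what makes the latent edge structure inspectable after training.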