[2603.26465] A Boltzmann-machine-enhanced Transformer For DNA Sequence Classification
Computer Science > Machine Learning
arXiv:2603.26465 (cs)
[Submitted on 27 Mar 2026]

Title: A Boltzmann-machine-enhanced Transformer For DNA Sequence Classification
Authors: Zhixuan Cao, Yishu Xu, Xuang WU

Abstract: DNA sequence classification requires not only high predictive accuracy but also the ability to uncover latent site interactions, combinatorial regulation, and epistasis-like higher-order dependencies. Although the standard Transformer provides strong global modeling capacity, its softmax attention is continuous, dense, and weakly constrained, making it better suited to information routing than to explicit structure discovery. In this paper, we propose a Boltzmann-machine-enhanced Transformer for DNA sequence classification. Built on multi-head attention, the model introduces structured binary gating variables to represent latent query-key connections and constrains them with a Boltzmann-style energy function. Query-key similarity defines local bias terms, learnable pairwise interactions capture synergy and competition between edges, and latent hidden units model higher-order combinatorial dependencies. Since exact posterior inference over discrete gating graphs is intractable, we use mean-field variational inference to estimate edge activation probabilities and combine it with Gumbel-Softmax to...
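The abstract's core mechanism — binary gates on query-key edges, relaxed with Gumbel-Softmax so they remain differentiable — can be illustrated with a minimal sketch. This is not the authors' implementation: it omits the Boltzmann-style pairwise interaction terms, the latent hidden units, and the mean-field updates, and shows only the binary-concrete (Gumbel-Sigmoid) gating applied on top of standard scaled dot-product attention. All function names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gumbel_sigmoid(logits, tau=0.5):
    # Binary Concrete / Gumbel-Sigmoid relaxation: a differentiable
    # surrogate for sampling Bernoulli edge gates from `logits`.
    u = rng.uniform(1e-9, 1 - 1e-9, size=logits.shape)
    g = np.log(u) - np.log1p(-u)           # Logistic(0, 1) noise
    return 1.0 / (1.0 + np.exp(-(logits + g) / tau))

def gated_attention(Q, K, V, tau=0.5):
    # Query-key similarity plays the role of the local bias terms
    # described in the abstract; gates relax the binary edge variables.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    gates = gumbel_sigmoid(scores, tau)    # relaxed edge activations in (0, 1)
    weights = softmax(scores, axis=-1) * gates
    weights = weights / (weights.sum(axis=-1, keepdims=True) + 1e-9)
    return weights @ V, gates

# Toy example: 5 positions, 8-dimensional head.
Q = rng.standard_normal((5, 8))
K = rng.standard_normal((5, 8))
V = rng.standard_normal((5, 8))
out, gates = gated_attention(Q, K, V)
print(out.shape)    # (5, 8)
```

At low temperature `tau` the gates concentrate near 0 or 1, so the attention pattern approaches a discrete query-key connection graph, which is what makes the latent edge structure inspectable after training.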