[2603.00888] Probabilistic Learning and Generation in Deep Sequence Models




Computer Science > Machine Learning
arXiv:2603.00888 (cs) [Submitted on 1 Mar 2026]

Title: Probabilistic Learning and Generation in Deep Sequence Models
Authors: Wenlong Chen

Abstract: Despite the exceptional predictive performance of deep sequence models (DSMs), the main concern around their deployment is their lack of uncertainty awareness. In contrast, probabilistic models quantify the uncertainty associated with unobserved variables using the rules of probability. Notably, Bayesian methods leverage Bayes' rule to express beliefs about unobserved variables in a principled way. Since exact Bayesian inference is computationally infeasible at scale, approximate inference is required in practice. Two major bottlenecks of Bayesian methods, especially when applied to deep neural networks, are prior specification and approximation quality. In Chapters 3 and 4, we investigate how the architectures of DSMs themselves can inform the design of priors or approximations in probabilistic models. We first develop an approximate Bayesian inference method tailored to the Transformer, based on the similarity between attention and sparse Gaussian processes. Next, we exploit the long-range memory-preservation capability of HiPPOs (High-order Polynomial Projection Operators) to construct an interdomain inducing point for Gaussian processes, which successfully memorizes the hi...
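The structural analogy the abstract alludes to can be made concrete: an attention layer computes outputs as normalized kernel-weighted averages of values, much as a sparse Gaussian process computes its predictive mean as kernel weights over a small set of inducing points. The sketch below is illustrative only and is not the paper's method; the function names, the RBF kernel choice, and all shapes are our own assumptions for the comparison.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: each output row is a convex
    # combination of value rows, weighted by query-key similarity.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    return softmax(scores) @ V

def rbf(A, B, lengthscale=1.0):
    # Squared-exponential kernel between row sets A and B.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def sparse_gp_mean(X, Z, u, lengthscale=1.0, jitter=1e-6):
    # Sparse-GP-style predictive mean at test inputs X given
    # inducing locations Z and inducing values u:
    #   m(X) = K_xz @ K_zz^{-1} @ u
    # Like attention, this weights a small set of "keys" (Z) by
    # kernel similarity to the "queries" (X).
    Kxz = rbf(X, Z, lengthscale)
    Kzz = rbf(Z, Z, lengthscale) + jitter * np.eye(len(Z))
    return Kxz @ np.linalg.solve(Kzz, u)
```

In both functions the output at each input is a similarity-weighted combination over a fixed set of reference points (keys, or inducing locations); the attention weights are softmax-normalized while the GP weights come from the inverse inducing-point Gram matrix.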

Originally published on March 03, 2026. Curated by AI News.

