[2604.04403] MolDA: Molecular Understanding and Generation via Large Language Diffusion Model
Computer Science > Artificial Intelligence
arXiv:2604.04403 (cs)
[Submitted on 6 Apr 2026]

Title: MolDA: Molecular Understanding and Generation via Large Language Diffusion Model
Authors: Seohyeon Shin, HanJun Choi, Jun-Hyung Park, Hongkook Kim, Mansu Kim

Abstract: Large Language Models (LLMs) have significantly advanced molecular discovery, but existing multimodal molecular architectures fundamentally rely on autoregressive (AR) backbones. This strict left-to-right inductive bias is sub-optimal for generating chemically valid molecules, as it struggles to account for non-local global constraints (e.g., ring closures) and often accumulates structural errors during sequential generation. To address these limitations, we propose MolDA (Molecular language model with masked Diffusion with mAsking), a novel multimodal framework that replaces the conventional AR backbone with a discrete Large Language Diffusion Model. MolDA extracts comprehensive structural representations using a hybrid graph encoder, which captures both local and global topologies, and aligns them into the language token space via a Q-Former. Furthermore, we mathematically reformulate Molecular Structure Preference Optimization specifically for the masked diffusion. Through bidirectional iterative denoising, MolDA ensures global structur...
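The "bidirectional iterative denoising" the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the vocabulary, the confidence-based unmasking schedule, and the `toy_denoiser` stand-in for the diffusion language model are all illustrative assumptions. The point is the control flow, where generation starts from a fully masked sequence and each step commits only the highest-confidence tokens while the rest remain masked and are revisited with full bidirectional context.

```python
import random

MASK = "[MASK]"
VOCAB = ["C", "N", "O", "(", ")", "=", "1"]  # toy SMILES-like token set (assumption)

def toy_denoiser(seq):
    # Stand-in for the masked diffusion LM: for every masked position,
    # propose a token and a confidence score, conditioning (in the real
    # model) on the entire sequence in both directions at once.
    return {i: (random.choice(VOCAB), random.random())
            for i, tok in enumerate(seq) if tok == MASK}

def iterative_denoise(length=8, steps=4, seed=0):
    random.seed(seed)
    seq = [MASK] * length  # start from an all-mask sequence
    for step in range(steps):
        proposals = toy_denoiser(seq)
        if not proposals:
            break
        # Commit roughly an equal share of the remaining masks per step,
        # keeping the highest-confidence proposals; low-confidence
        # positions stay masked and can be repaired on a later pass.
        k = max(1, len(proposals) // (steps - step))
        keep = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in keep:
            seq[i] = proposals[i][0]
    return seq

print("".join(iterative_denoise()))
```

Unlike left-to-right AR decoding, every unmasking decision here sees the whole sequence, which is why this family of models can, in principle, respect non-local constraints such as ring closures.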