[2601.03127] Unified Thinker: A General Reasoning Modular Core for Image Generation
Abstract page for arXiv paper 2601.03127: Unified Thinker: A General Reasoning Modular Core for Image Generation
Image, video, audio, and text generation
Abstract page for arXiv paper 2601.08845: No Universal Hyperbola: A Formal Disproof of the Epistemic Trade-Off Between Certainty and Scop...
Abstract page for arXiv paper 2511.12834: SAGA: Source Attribution of Generative AI Videos
This paper presents a novel framework for out-of-distribution (OOD) detection in molecular complexes using diffusion models tailored for ...
The paper presents FreqPolicy, a novel flow-based visuomotor policy that enhances efficiency in robotic manipulation by imposing frequenc...
This paper presents MedVLSynther, a framework for synthesizing high-quality visual question answering (VQA) from medical documents, enhan...
This article presents a novel approach combining Chain-of-Thought (CoT) and Retrieval Augmented Generation (RAG) to improve rare disease ...
This survey reviews advancements in spatiotemporal consistency in video generation, addressing challenges and methodologies in creating c...
This paper explores the dynamics of Classifier-Free Guidance (CFG) in diffusion models, revealing its effects on sampling processes and d...
This article explores the universal properties of activation sparsity in modern large language models (LLMs), highlighting its implicatio...
FairTabGen introduces a novel framework for generating high-fidelity synthetic healthcare data from limited samples, enhancing fairness a...
This article presents a novel technique for extracting safety classifiers from aligned large language models (LLMs) to address vulnerabil...
PromptGuard introduces a novel method for moderating unsafe content in text-to-image models, enhancing safety without sacrificing image q...
The paper presents MC-LLaVA, a multi-concept personalized vision-language model that enhances user experience by integrating multiple con...
The paper presents Ctrl-GenAug, a novel framework for controllable generative augmentation in medical sequence classification, addressing...
This paper introduces a novel method for evaluating language model agency through negotiation games, addressing limitations of existing b...
The paper introduces DiffusionBlocks, a framework for block-wise training of neural networks that reduces memory bottlenecks while mainta...
The article presents VERA-MH, an open-source evaluation tool designed to assess the safety of AI in mental health contexts, focusing on s...
The paper presents SEISMO, a trajectory-aware LLM agent designed to enhance sample efficiency in molecular optimization, achieving signif...
This article explores the phenomenon of emergent capabilities in language models, proposing that performance breakthroughs are influenced...
This paper introduces a method for precise control of attribute intensities in Large Language Models (LLMs) through targeted representati...
The paper presents GDGB, a benchmark for Generative Dynamic Text-Attributed Graph Learning, addressing the limitations of existing datase...
The paper presents a framework called Calibrate-Then-Act (CTA) that enables LLMs to optimize decision-making by balancing cost and uncert...