Generative AI

Image, video, audio, and text generation

Top This Week

[2601.03127] Unified Thinker: A General Reasoning Modular Core for Image Generation
Machine Learning

arXiv - AI · 4 min ·
[2601.08845] No Universal Hyperbola: A Formal Disproof of the Epistemic Trade-Off Between Certainty and Scope in Symbolic and Generative AI
Generative AI

arXiv - AI · 4 min ·
[2511.12834] SAGA: Source Attribution of Generative AI Videos
Machine Learning

arXiv - AI · 4 min ·

All Content

[2512.18454] Out-of-Distribution Detection in Molecular Complexes via Diffusion Models for Irregular Graphs
Machine Learning

This paper presents a novel framework for out-of-distribution (OOD) detection in molecular complexes using diffusion models tailored for ...

arXiv - Machine Learning · 4 min ·
[2506.08822] FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency
Machine Learning

The paper presents FreqPolicy, a novel flow-based visuomotor policy that enhances efficiency in robotic manipulation by imposing frequenc...

arXiv - AI · 4 min ·
[2510.25867] Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs
Machine Learning

This paper presents MedVLSynther, a framework for synthesizing high-quality visual question answering (VQA) from medical documents, enhan...

arXiv - Machine Learning · 4 min ·
[2503.12286] Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes
LLMs

This article presents a novel approach combining Chain-of-Thought (CoT) and Retrieval Augmented Generation (RAG) to improve rare disease ...

arXiv - AI · 4 min ·
[2502.17863] A Survey: Spatiotemporal Consistency in Video Generation
Generative AI

This survey reviews advancements in spatiotemporal consistency in video generation, addressing challenges and methodologies in creating c...

arXiv - AI · 4 min ·
[2509.22007] Stage-wise Dynamics of Classifier-Free Guidance in Diffusion Models
Machine Learning

This paper explores the dynamics of Classifier-Free Guidance (CFG) in diffusion models, revealing its effects on sampling processes and d...

arXiv - Machine Learning · 4 min ·
[2509.00454] Universal Properties of Activation Sparsity in Modern Large Language Models
LLMs

This article explores the universal properties of activation sparsity in modern large language models (LLMs), highlighting its implicatio...

arXiv - Machine Learning · 4 min ·
[2508.11810] FairTabGen: High-Fidelity and Fair Synthetic Health Data Generation from Limited Samples
LLMs

FairTabGen introduces a novel framework for generating high-fidelity synthetic healthcare data from limited samples, enhancing fairness a...

arXiv - Machine Learning · 3 min ·
[2501.16534] Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs
LLMs

This article presents a novel technique for extracting safety classifiers from aligned large language models (LLMs) to address vulnerabil...

arXiv - AI · 4 min ·
[2501.03544] PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models
Machine Learning

PromptGuard introduces a novel method for moderating unsafe content in text-to-image models, enhancing safety without sacrificing image q...

arXiv - AI · 4 min ·
[2411.11706] MC-LLaVA: Multi-Concept Personalized Vision-Language Model
LLMs

The paper presents MC-LLaVA, a multi-concept personalized vision-language model that enhances user experience by integrating multiple con...

arXiv - AI · 4 min ·
[2409.17091] Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification
Machine Learning

The paper presents Ctrl-GenAug, a novel framework for controllable generative augmentation in medical sequence classification, addressing...

arXiv - Machine Learning · 4 min ·
[2401.04536] Evaluating Language Model Agency through Negotiations
LLMs

This paper introduces a novel method for evaluating language model agency through negotiation games, addressing limitations of existing b...

arXiv - Machine Learning · 3 min ·
[2506.14202] DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation
Machine Learning

The paper introduces DiffusionBlocks, a framework for block-wise training of neural networks that reduces memory bottlenecks while mainta...

arXiv - Machine Learning · 4 min ·
[2602.05088] VERA-MH: Reliability and Validity of an Open-Source AI Safety Evaluation in Mental Health
Generative AI

The article presents VERA-MH, an open-source evaluation tool designed to assess the safety of AI in mental health contexts, focusing on s...

arXiv - AI · 4 min ·
[2602.00663] SEISMO: Increasing Sample Efficiency in Molecular Optimization with a Trajectory-Aware LLM Agent
LLMs

The paper presents SEISMO, a trajectory-aware LLM agent designed to enhance sample efficiency in molecular optimization, achieving signif...

arXiv - Machine Learning · 4 min ·
[2502.17356] Random Scaling of Emergent Capabilities
LLMs

This article explores the phenomenon of emergent capabilities in language models, proposing that performance breakthroughs are influenced...

arXiv - Machine Learning · 4 min ·
[2510.12121] Precise Attribute Intensity Control in Large Language Models via Targeted Representation Editing
LLMs

This paper introduces a method for precise control of attribute intensities in Large Language Models (LLMs) through targeted representati...

arXiv - Machine Learning · 4 min ·
[2507.03267] GDGB: A Benchmark for Generative Dynamic Text-Attributed Graph Learning
Machine Learning

The paper presents GDGB, a benchmark for Generative Dynamic Text-Attributed Graph Learning, addressing the limitations of existing datase...

arXiv - AI · 4 min ·
[2602.16699] Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents
LLMs

The paper presents a framework called Calibrate-Then-Act (CTA) that enables LLMs to optimize decision-making by balancing cost and uncert...

arXiv - AI · 4 min ·