Generative AI

Image, video, audio, and text generation

Top This Week

[2603.17677] Adaptive Guidance for Retrieval-Augmented Masked Diffusion Models
LLMs

arXiv - Machine Learning · 3 min

[2601.16933] Reward-Forcing: Autoregressive Video Generation with Reward Feedback
Machine Learning

arXiv - Machine Learning · 3 min

[2505.15263] gen2seg: Generative Models Enable Generalizable Instance Segmentation
Machine Learning

arXiv - Machine Learning · 3 min

All Content

[2509.22237] FeatBench: Towards More Realistic Evaluation of Feature-level Code Generation
LLMs

The paper introduces FeatBench, a new benchmark for evaluating feature-level code generation in Large Language Models (LLMs), addressing ...

arXiv - AI · 4 min

[2602.02958] Quant VideoGen: Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization
Machine Learning

The paper presents Quant VideoGen, a framework for autoregressive long video generation that addresses the limitations of KV cache memory...

arXiv - Machine Learning · 4 min

[2509.19680] PolicyPad: Collaborative Prototyping of LLM Policies
LLMs

The article presents PolicyPad, an interactive system designed for collaborative prototyping of policies governing large language models ...

arXiv - AI · 3 min

[2602.00191] GEPC: Group-Equivariant Posterior Consistency for Out-of-Distribution Detection in Diffusion Models
Machine Learning

The paper introduces Group-Equivariant Posterior Consistency (GEPC), a method for detecting out-of-distribution data in diffusion models ...

arXiv - Machine Learning · 4 min

[2512.18454] Out-of-Distribution Detection in Molecular Complexes via Diffusion Models for Irregular Graphs
Machine Learning

This paper presents a novel framework for out-of-distribution (OOD) detection in molecular complexes using diffusion models tailored for ...

arXiv - Machine Learning · 4 min

[2506.08822] FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency
Machine Learning

The paper presents FreqPolicy, a novel flow-based visuomotor policy that enhances efficiency in robotic manipulation by imposing frequenc...

arXiv - AI · 4 min

[2510.25867] Synthesizing High-Quality Visual Question Answering from Medical Documents with Generator-Verifier LMMs
Machine Learning

This paper presents MedVLSynther, a framework for synthesizing high-quality visual question answering (VQA) from medical documents, enhan...

arXiv - Machine Learning · 4 min

[2503.12286] Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes
LLMs

This article presents a novel approach combining Chain-of-Thought (CoT) and Retrieval Augmented Generation (RAG) to improve rare disease ...

arXiv - AI · 4 min

[2502.17863] A Survey: Spatiotemporal Consistency in Video Generation
Generative AI

This survey reviews advancements in spatiotemporal consistency in video generation, addressing challenges and methodologies in creating c...

arXiv - AI · 4 min

[2509.22007] Stage-wise Dynamics of Classifier-Free Guidance in Diffusion Models
Machine Learning

This paper explores the dynamics of Classifier-Free Guidance (CFG) in diffusion models, revealing its effects on sampling processes and d...

arXiv - Machine Learning · 4 min

[2509.00454] Universal Properties of Activation Sparsity in Modern Large Language Models
LLMs

This article explores the universal properties of activation sparsity in modern large language models (LLMs), highlighting its implicatio...

arXiv - Machine Learning · 4 min

[2508.11810] FairTabGen: High-Fidelity and Fair Synthetic Health Data Generation from Limited Samples
LLMs

FairTabGen introduces a novel framework for generating high-fidelity synthetic healthcare data from limited samples, enhancing fairness a...

arXiv - Machine Learning · 3 min

[2501.16534] Targeting Alignment: Extracting Safety Classifiers of Aligned LLMs
LLMs

This article presents a novel technique for extracting safety classifiers from aligned large language models (LLMs) to address vulnerabil...

arXiv - AI · 4 min

[2501.03544] PromptGuard: Soft Prompt-Guided Unsafe Content Moderation for Text-to-Image Models
Machine Learning

PromptGuard introduces a novel method for moderating unsafe content in text-to-image models, enhancing safety without sacrificing image q...

arXiv - AI · 4 min

[2411.11706] MC-LLaVA: Multi-Concept Personalized Vision-Language Model
LLMs

The paper presents MC-LLaVA, a multi-concept personalized vision-language model that enhances user experience by integrating multiple con...

arXiv - AI · 4 min

[2409.17091] Ctrl-GenAug: Controllable Generative Augmentation for Medical Sequence Classification
Machine Learning

The paper presents Ctrl-GenAug, a novel framework for controllable generative augmentation in medical sequence classification, addressing...

arXiv - Machine Learning · 4 min

[2401.04536] Evaluating Language Model Agency through Negotiations
LLMs

This paper introduces a novel method for evaluating language model agency through negotiation games, addressing limitations of existing b...

arXiv - Machine Learning · 3 min

[2506.14202] DiffusionBlocks: Block-wise Neural Network Training via Diffusion Interpretation
Machine Learning

The paper introduces DiffusionBlocks, a framework for block-wise training of neural networks that reduces memory bottlenecks while mainta...

arXiv - Machine Learning · 4 min

[2602.05088] VERA-MH: Reliability and Validity of an Open-Source AI Safety Evaluation in Mental Health
Generative AI

The article presents VERA-MH, an open-source evaluation tool designed to assess the safety of AI in mental health contexts, focusing on s...

arXiv - AI · 4 min

[2602.00663] SEISMO: Increasing Sample Efficiency in Molecular Optimization with a Trajectory-Aware LLM Agent
LLMs

The paper presents SEISMO, a trajectory-aware LLM agent designed to enhance sample efficiency in molecular optimization, achieving signif...

arXiv - Machine Learning · 4 min