Generative AI

Image, video, audio, and text generation

Top This Week

[2602.08277] PISCO: Precise Video Instance Insertion with Sparse Control
Generative AI
arXiv - AI · 4 min

[2511.18746] Any4D: Open-Prompt 4D Generation from Natural Language and Images
Machine Learning
arXiv - AI · 4 min

[2512.14549] Dual-objective Language Models: Training Efficiency Without Overfitting
LLMs
arXiv - AI · 3 min

All Content

[2603.00589] AlignVAR: Towards Globally Consistent Visual Autoregression for Image Super-Resolution
Machine Learning
arXiv - AI · 4 min

[2603.00576] Efficient Long-Sequence Diffusion Modeling for Symbolic Music Generation
Machine Learning
arXiv - AI · 4 min

[2603.00530] Bridge Matching Sampler: Scalable Sampling via Generalized Fixed-Point Diffusion Matching
Machine Learning
arXiv - Machine Learning · 3 min

[2603.00521] Phys-Diff: A Physics-Inspired Latent Diffusion Model for Tropical Cyclone Forecasting
Machine Learning
arXiv - Machine Learning · 3 min

[2603.00492] ArtiFixer: Enhancing and Extending 3D Reconstruction with Auto-Regressive Diffusion Models
Machine Learning
arXiv - Machine Learning · 4 min

[2603.00483] RAISE: Requirement-Adaptive Evolutionary Refinement for Training-Free Text-to-Image Alignment
Machine Learning
arXiv - AI · 4 min

[2603.00423] An Interpretable Local Editing Model for Counterfactual Medical Image Generation
Machine Learning
arXiv - AI · 3 min

[2603.00205] Efficient Flow Matching for Sparse-View CT Reconstruction
Machine Learning
arXiv - AI · 4 min

[2603.00194] SKeDA: A Generative Watermarking Framework for Text-to-video Diffusion Models
Machine Learning
arXiv - AI · 4 min

[2603.00181] Engineering FAIR Privacy-preserving Applications that Learn Histories of Disease
Machine Learning
arXiv - Machine Learning · 3 min

[2603.00180] NNiT: Width-Agnostic Neural Network Generation with Structurally Aligned Weight Spaces
Machine Learning
arXiv - Machine Learning · 3 min

[2603.00166] Exploring the AI Obedience: Why is Generating a Pure Color Image Harder than CyberPunk?
Machine Learning
arXiv - AI · 3 min

[2603.00159] FlowPortrait: Reinforcement Learning for Audio-Driven Portrait Video Generation
Generative AI
arXiv - AI · 3 min

[2603.00149] Physics-Consistent Diffusion for Efficient Fluid Super-Resolution via Multiscale Residual Correction
Machine Learning
arXiv - AI · 3 min

[2603.00141] From Scale to Speed: Adaptive Test-Time Scaling for Image Editing
Machine Learning
arXiv - AI · 4 min

[2603.00140] Steering Away from Memorization: Reachability-Constrained Reinforcement Learning for Text-to-Image Diffusion
Machine Learning
arXiv - Machine Learning · 3 min

[2603.00133] You Don't Need All That Attention: Surgical Memorization Mitigation in Text-to-Image Diffusion Models
Machine Learning
arXiv - AI · 4 min

[2603.00122] NovaLAD: A Fast, CPU-Optimized Document Extraction Pipeline for Generative AI and Data Intelligence
Machine Learning
arXiv - AI · 4 min

[2603.00059] Stochastic Parrots or Singing in Harmony? Testing Five Leading LLMs for their Ability to Replicate a Human Survey with Synthetic Data
LLMs
arXiv - AI · 4 min

[2603.00057] "Bespoke Bots": Diverse Instructor Needs for Customizing Generative AI Classroom Chatbots
Generative AI
arXiv - AI · 3 min