Generative AI

Image, video, audio, and text generation

Top This Week

Accelerating science with AI and simulations
Machine Learning

MIT Professor Rafael Gómez-Bombarelli discusses the transformative potential of AI in scientific research, emphasizing its role in materi...

AI News - General · 10 min
[2603.12057] Coarse-Guided Visual Generation via Weighted h-Transform Sampling
Machine Learning

arXiv - AI · 4 min
[2603.07455] Image Generation Models: A Technical History
Machine Learning

arXiv - AI · 3 min

All Content

[2509.25369] Generative Value Conflicts Reveal LLM Priorities
LLMs

This paper introduces ConflictScope, a tool for evaluating how large language models (LLMs) prioritize conflicting values, revealing insi...

arXiv - Machine Learning · 4 min
[2509.24597] Inducing Dyslexia in Vision Language Models
LLMs

The paper explores how vision-language models can simulate dyslexia by disrupting word processing mechanisms, providing insights into rea...

arXiv - Machine Learning · 4 min
[2504.12522] Evaluating the Diversity and Quality of LLM Generated Content
LLMs

This article evaluates the diversity and quality of content generated by large language models (LLMs), highlighting the trade-offs betwee...

arXiv - AI · 4 min
[2509.19929] Geometric Autoencoder Priors for Bayesian Inversion: Learn First Observe Later
Machine Learning

The paper presents Geometric Autoencoders for Bayesian Inversion (GABI), a novel framework for uncertainty quantification in engineering,...

arXiv - Machine Learning · 4 min
[2508.12691] Adaptive Hybrid Caching for Efficient Text-to-Video Diffusion Model Acceleration
Machine Learning

This paper presents MixCache, a novel caching framework designed to enhance the efficiency of text-to-video diffusion models, significant...

arXiv - Machine Learning · 4 min
[2508.04228] LayerT2V: A Unified Multi-Layer Video Generation Framework
Machine Learning

LayerT2V presents a novel framework for multi-layer video generation, enabling the creation of editable video layers that enhance profess...

arXiv - Machine Learning · 4 min
[2502.02088] Dual-IPO: Dual-Iterative Preference Optimization for Text-to-Video Generation
Machine Learning

The paper presents Dual-IPO, a novel framework for optimizing text-to-video generation by iteratively improving both the reward and video...

arXiv - AI · 4 min
[2411.08254] Toward Automated Validation of Language Model Synthesized Test Cases using Semantic Entropy
LLMs

The paper presents VALTEST, a framework for validating test cases generated by large language models (LLMs) using semantic entropy, impro...

arXiv - AI · 4 min
[2509.24276] G-reasoner: Foundation Models for Unified Reasoning over Graph-structured Knowledge
LLMs

The G-reasoner paper introduces a unified framework that enhances reasoning over graph-structured knowledge using a new graph foundation ...

arXiv - AI · 4 min
[2509.07706] FHIR-RAG-MEDS: Integrating HL7 FHIR with Retrieval-Augmented Large Language Models for Enhanced Medical Decision Support
LLMs

The paper presents FHIR-RAG-MEDS, a system that integrates HL7 FHIR with Retrieval-Augmented Generation models to enhance personalized me...

arXiv - AI · 3 min
[2504.13359] Cost-of-Pass: An Economic Framework for Evaluating Language Models
LLMs

The paper presents an economic framework for evaluating language models by analyzing the tradeoff between performance and inference costs...

arXiv - AI · 4 min
[2412.17287] LLM4AD: A Platform for Algorithm Design with Large Language Model
LLMs

LLM4AD introduces a unified Python platform for algorithm design using large language models, featuring modular components for various ta...

arXiv - AI · 3 min
[2602.23359] SeeThrough3D: Occlusion Aware 3D Control in Text-to-Image Generation
Machine Learning

The paper introduces SeeThrough3D, a model for occlusion-aware 3D control in text-to-image generation, enhancing the realism of synthesiz...

arXiv - AI · 4 min
[2510.05725] Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies
LLMs

This article presents a novel approach to improving masked diffusion models (MDMs) for language modeling by introducing a learned schedul...

arXiv - Machine Learning · 4 min
[2509.21013] Predicting LLM Reasoning Performance with Small Proxy Model
LLMs

This article presents rBridge, a small proxy model that predicts reasoning performance in large language models (LLMs), demonstrating sig...

arXiv - Machine Learning · 4 min
[2602.23228] MovieTeller: Tool-augmented Movie Synopsis with ID Consistent Progressive Abstraction
LLMs

The paper presents MovieTeller, a novel framework for generating movie synopses using tool-augmented progressive abstraction to enhance c...

arXiv - AI · 4 min
[2508.03587] Zero-Variance Gradients for Variational Autoencoders
Machine Learning

This paper introduces a novel approach called Silent Gradients for training Variational Autoencoders (VAEs), which eliminates gradient es...

arXiv - Machine Learning · 4 min
[2508.01101] Fast and Flexible Probabilistic Forecasting of Dynamical Systems using Flow Matching and Physical Perturbation
Machine Learning

This article presents a novel framework for probabilistic forecasting of dynamical systems, utilizing flow matching and physical perturba...

arXiv - Machine Learning · 4 min
[2602.23225] Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?
LLMs

This paper investigates why Diffusion Language Models (DLMs) often default to autoregressive decoding instead of utilizing their potentia...

arXiv - AI · 4 min
[2602.23203] ColoDiff: Integrating Dynamic Consistency With Content Awareness for Colonoscopy Video Generation
Generative AI

ColoDiff introduces a novel framework for generating colonoscopy videos that ensures dynamic consistency and content awareness, addressin...

arXiv - AI · 4 min