Generative AI

Image, video, audio, and text generation

Top This Week

Machine Learning

AI video generation seems fundamentally more expensive than text, not just less optimized

There’s been a lot of discussion recently about how expensive AI video generation is compared to text, and it feels like this is more tha...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Accelerating science with AI and simulations

MIT Professor Rafael Gómez-Bombarelli discusses the transformative potential of AI in scientific research, emphasizing its role in materi...

AI News - General · 10 min ·
Machine Learning

[2603.10202] Hybrid Hidden Markov Model for Modeling Equity Excess Growth Rate Dynamics: A Discrete-State Approach with Jump-Diffusion

Abstract page for arXiv paper 2603.10202: Hybrid Hidden Markov Model for Modeling Equity Excess Growth Rate Dynamics: A Discrete-State Ap...

arXiv - Machine Learning · 4 min ·

All Content

Machine Learning

[2602.20057] AdaWorldPolicy: World-Model-Driven Diffusion Policy with Online Adaptive Learning for Robotic Manipulation

The paper presents AdaWorldPolicy, a novel framework for robotic manipulation that utilizes world models and online adaptive learning to ...

arXiv - AI · 4 min ·
LLMs

[2602.20040] AgenticSum: An Agentic Inference-Time Framework for Faithful Clinical Text Summarization

AgenticSum presents a novel framework for improving clinical text summarization using large language models, focusing on reducing factual...

arXiv - AI · 3 min ·
LLMs

[2602.20122] NanoKnow: How to Know What Your Language Model Knows

The article discusses NanoKnow, a benchmark dataset designed to understand how large language models (LLMs) acquire knowledge, using the ...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.19948] Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming

This article presents a framework for assessing the risks associated with using large language models (LLMs) in mental health support, hi...

arXiv - AI · 4 min ·
Machine Learning

[2602.19946] When Pretty Isn't Useful: Investigating Why Modern Text-to-Image Models Fail as Reliable Training Data Generators

This paper investigates the limitations of modern text-to-image models as reliable training data generators, revealing a decline in class...

arXiv - AI · 4 min ·
LLMs

[2602.19843] MAS-FIRE: Fault Injection and Reliability Evaluation for LLM-Based Multi-Agent Systems

The paper presents MAS-FIRE, a framework for evaluating the reliability of LLM-based Multi-Agent Systems through fault injection, address...

arXiv - AI · 4 min ·
Machine Learning

[2602.19816] Depth-Structured Music Recurrence: Budgeted Recurrent Attention for Full-Piece Symbolic Music Modeling

The paper presents Depth-Structured Music Recurrence (DSMR), a novel approach for symbolic music modeling that optimizes long-context pro...

arXiv - Machine Learning · 4 min ·
Generative AI

[2602.19718] Carbon-Aware Governance Gates: An Architecture for Sustainable GenAI Development

The paper proposes Carbon-Aware Governance Gates (CAGG) to integrate sustainability into Generative AI development, addressing the increa...

arXiv - AI · 3 min ·
LLMs

[2602.19614] Workflow-Level Design Principles for Trustworthy GenAI in Automotive System Engineering

This article presents workflow-level design principles for integrating trustworthy Generative AI in automotive system engineering, addres...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.19600] Manifold-Aligned Generative Transport

The paper presents Manifold-Aligned Generative Transport (MAGT), a novel generative model that efficiently samples from high-dimensional ...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.19631] Localized Concept Erasure in Text-to-Image Diffusion Models via High-Level Representation Misdirection

This article discusses a novel approach to concept erasure in text-to-image diffusion models, focusing on High-Level Representation Misdi...

arXiv - AI · 4 min ·
Machine Learning

[2602.19623] PedaCo-Gen: Scaffolding Pedagogical Agency in Human-AI Collaborative Video Authoring

PedaCo-Gen is a novel AI system designed to enhance the quality of instructional video creation by integrating pedagogical principles and...

arXiv - AI · 3 min ·
Machine Learning

[2602.19506] Relational Feature Caching for Accelerating Diffusion Transformers

This paper introduces Relational Feature Caching (RFC) to enhance the efficiency of diffusion transformers by improving feature predictio...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.19461] Laplacian Multi-scale Flow Matching for Generative Modeling

The paper presents Laplacian Multi-scale Flow Matching (LapFlow), a new framework for image generative modeling that enhances flow matchi...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.19574] CTC-TTS: LLM-based dual-streaming text-to-speech with CTC alignment

The paper presents CTC-TTS, a novel dual-streaming text-to-speech system that utilizes a CTC-based aligner for improved text-speech align...

arXiv - AI · 3 min ·
Generative AI

[2602.19565] DICArt: Advancing Category-level Articulated Object Pose Estimation in Discrete State-Spaces

DICArt introduces a novel framework for category-level articulated object pose estimation, utilizing a discrete diffusion process to enha...

arXiv - AI · 4 min ·
Generative AI

[2602.19538] Cost-Aware Diffusion Active Search

The paper presents a novel approach to active search using cost-aware diffusion models, improving efficiency in decision-making for auton...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.19534] Large Language Model-Assisted UAV Operations and Communications: A Multifaceted Survey and Tutorial

This article surveys the integration of Large Language Models (LLMs) in Uncrewed Aerial Vehicles (UAVs), exploring their potential to enh...

arXiv - AI · 4 min ·
LLMs

[2602.19467] Can Large Language Models Replace Human Coders? Introducing ContentBench

This article introduces ContentBench, a benchmark suite assessing the ability of low-cost large language models (LLMs) to perform interpr...

arXiv - AI · 4 min ·
LLMs

[2602.19239] Attention Deficits in Language Models: Causal Explanations for Procedural Hallucinations

This article investigates procedural hallucinations in language models, identifying specific attention deficits that lead to errors in ex...

arXiv - Machine Learning · 4 min ·