Generative AI

Image, video, audio, and text generation

Top This Week

Machine Learning

[2512.23994] PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation

arXiv - AI · 4 min · about 11 hours ago

LLMs

[2512.10785] Developing and Evaluating a Large Language Model-Based Automated Feedback System Grounded in Evidence-Centered Design for Supporting Physics Problem Solving

arXiv - AI · 4 min · about 11 hours ago

LLMs

[2510.13870] Unlocking the Potential of Diffusion Language Models through Template Infilling

arXiv - AI · 3 min · about 11 hours ago

All Content

LLMs

[2602.12996] Know More, Know Clearer: A Meta-Cognitive Framework for Knowledge Augmentation in Large Language Models

This article presents a novel meta-cognitive framework aimed at enhancing knowledge augmentation in Large Language Models (LLMs), address...

arXiv - AI · 3 min · about 2 months ago

Robotics

[2602.12924] Never say never: Exploring the effects of available knowledge on agent persuasiveness in controlled physiotherapy motivation dialogues

This article examines how the availability of knowledge influences the persuasiveness of generative social agents (GSAs) in physiotherapy...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.12873] Knowledge-Based Design Requirements for Generative Social Robots in Higher Education

The article explores design requirements for generative social robots in higher education, emphasizing the need for knowledge-based frame...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.12846] Amortized Reasoning Tree Search: Decoupling Proposal and Decision in Large Language Models

The paper presents Amortized Reasoning Tree Search (ARTS), a novel approach to enhance reasoning in Large Language Models by decoupling p...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.12829] FLAC: Maximum Entropy RL via Kinetic Energy Regularized Bridge Matching

The paper presents FLAC, a novel framework for Maximum Entropy Reinforcement Learning that utilizes kinetic energy regularization to opti...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.12763] "Not Human, Funnier": How Machine Identity Shapes Humor Perception in Online AI Stand-up Comedy

This article explores how AI's machine identity influences humor perception in online stand-up comedy, revealing that AI can be perceived...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.12705] MedXIAOHE: A Comprehensive Recipe for Building Medical MLLMs

MedXIAOHE is a medical vision-language foundation model that enhances medical understanding and reasoning in clinical applications, achie...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.12675] SLA2: Sparse-Linear Attention with Learnable Routing and QAT

The paper presents SLA2, an advanced Sparse-Linear Attention model that enhances video generation efficiency by introducing a learnable r...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2602.12642] Beyond Normalization: Rethinking the Partition Function as a Difficulty Scheduler for RLVR

This article presents a novel approach to reinforcement learning by reinterpreting the partition function as a difficulty scheduler, enha...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.12574] Monte Carlo Tree Search with Reasoning Path Refinement for Small Language Models in Conversational Text-to-NoSQL

This paper presents a novel framework, Stage-MCTS, which enhances small language models' ability to generate NoSQL queries through conver...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.12470] Designing RNAs with Language Models

The paper presents a novel approach to RNA design using language models, reframing the task as conditional sequence generation, which sig...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2602.12424] RankLLM: Weighted Ranking of LLMs by Quantifying Question Difficulty

The paper introduces RankLLM, a framework for evaluating large language models (LLMs) by quantifying question difficulty, enhancing model...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.12393] Reproducing DragDiffusion: Interactive Point-Based Editing with Diffusion Models

This article presents a reproducibility study of DragDiffusion, a method for interactive point-based image editing using diffusion models...

arXiv - Machine Learning · 4 min · about 2 months ago

NLP

[2602.12311] Perceptual Self-Reflection in Agentic Physics Simulation Code Generation

This article presents a multi-agent framework for generating physics simulation code from natural language descriptions, introducing a no...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.12304] OmniCustom: Sync Audio-Video Customization Via Joint Audio-Video Generation Model

The paper introduces OmniCustom, a novel framework for synchronizing audio-video customization, enhancing identity and timbre fidelity th...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.13093] Consistency of Large Reasoning Models Under Multi-Turn Attacks

This article evaluates the robustness of large reasoning models against multi-turn adversarial attacks, revealing vulnerabilities and pro...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.12586] Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models

This paper introduces McDiffuSE, a Monte Carlo Tree Search framework aimed at optimizing slot filling orders in Masked Diffusion Models, ...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.12566] To Mix or To Merge: Toward Multi-Domain Reinforcement Learning for Large Language Models

This paper explores the effectiveness of multi-domain reinforcement learning for large language models, comparing mixed multi-task traini...

arXiv - AI · 4 min · about 2 months ago

LLMs

Customizable AI Companions.

The article discusses the potential of customizable AI companions that can engage in real-time video calls, leveraging technologies like ...

Reddit - Artificial Intelligence · 1 min · about 2 months ago

LLMs

Qwen3.5 vs DeepSeek — which matters more?

The discussion compares Qwen3.5 and DeepSeek, two AI models released around the same time, highlighting user excitement and potential app...

Reddit - Artificial Intelligence · 1 min · about 2 months ago
