Generative AI

Image, video, audio, and text generation

Top This Week

[2512.23994] PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation
Machine Learning

arXiv - AI · 4 min ·
[2512.10785] Developing and Evaluating a Large Language Model-Based Automated Feedback System Grounded in Evidence-Centered Design for Supporting Physics Problem Solving
LLMs

arXiv - AI · 4 min ·
[2510.13870] Unlocking the Potential of Diffusion Language Models through Template Infilling
LLMs

arXiv - AI · 3 min ·

All Content

[2602.12526] Constraint-Rectified Training for Efficient Chain-of-Thought
LLMs

The paper presents Constraint-Rectified Training (CRT), a framework designed to enhance the efficiency of Chain-of-Thought reasoning in L...

arXiv - Machine Learning · 4 min ·
[2602.12468] Continuous Diffusion Models Can Obey Formal Syntax
LLMs

The paper introduces a method for guiding continuous diffusion models to adhere to formal syntactic constraints, achieving high constrain...

arXiv - Machine Learning · 3 min ·
[2602.12429] Stabilizing Native Low-Rank LLM Pretraining
LLMs

This paper presents a method for stabilizing the training of low-rank large language models (LLMs), addressing computational challenges w...

arXiv - Machine Learning · 3 min ·
[2602.12394] Synthetic Interaction Data for Scalable Personalization in Large Language Models
LLMs

The paper introduces PersonaGym, a framework for generating synthetic interaction data to enhance personalization in large language model...

arXiv - Machine Learning · 4 min ·
[2602.12318] Abstractive Red-Teaming of Language Model Character
LLMs

This article presents a novel approach to auditing language model behavior through 'abstractive red-teaming,' identifying query types tha...

arXiv - Machine Learning · 4 min ·
[2602.12205] DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing
Machine Learning

DeepGen 1.0 is a lightweight unified multimodal model designed for image generation and editing, achieving competitive performance with o...

arXiv - AI · 4 min ·
[2602.11638] Variation-aware Flexible 3D Gaussian Editing
Computer Vision

The paper presents VF-Editor, a novel approach for flexible 3D Gaussian editing that addresses limitations of indirect editing methods by...

arXiv - AI · 3 min ·
[2602.11287] HiFloat4 Format for Language Model Inference
LLMs

The paper introduces HiFloat4, a block floating-point format designed for deep learning, enhancing efficiency in language model inference...

arXiv - Machine Learning · 3 min ·
[2602.08676] LLaDA2.1: Speeding Up Text Diffusion via Token Editing
Machine Learning

LLaDA2.1 introduces a novel approach to text diffusion by integrating Token-to-Token editing into the Mask-to-Token scheme, enhancing bot...

arXiv - Machine Learning · 4 min ·
[2602.07738] Learnable Chernoff Baselines for Inference-Time Alignment
Machine Learning

The paper introduces Learnable Chernoff Baselines (LCBs) for efficient inference-time reward-guided alignment in generative models, impro...

arXiv - Machine Learning · 3 min ·
[2602.06771] AEGIS: Adversarial Target-Guided Retention-Data-Free Robust Concept Erasure from Diffusion Models
Machine Learning

The paper presents AEGIS, a novel framework for robust concept erasure in diffusion models, addressing the trade-off between robustness a...

arXiv - Machine Learning · 4 min ·
[2602.00737] Pareto-Conditioned Diffusion Models for Offline Multi-Objective Optimization
Machine Learning

This article presents a novel framework called Pareto-Conditioned Diffusion (PCD) for offline multi-objective optimization, addressing ch...

arXiv - Machine Learning · 3 min ·
[2602.00020] Beyond Static Question Banks: Dynamic Knowledge Expansion via LLM-Automated Graph Construction and Adaptive Generation
LLMs

This paper presents a framework for dynamic knowledge expansion in personalized education, utilizing LLMs for automated graph constructio...

arXiv - AI · 4 min ·
[2601.21452] SAGE: Sequence-level Adaptive Gradient Evolution for Generative Recommendation
Machine Learning

The paper presents SAGE, a new optimizer for generative recommendation systems that addresses limitations in existing methods by improvin...

arXiv - Machine Learning · 4 min ·
[2601.15673] Enhancing guidance for missing data in diffusion-based sequential recommendation
Generative AI

This paper presents the Counterfactual Attention Regulation Diffusion model (CARD) to improve sequential recommendation systems by addres...

arXiv - AI · 4 min ·
[2512.18080] From Prompt to Product: A Human-Centered Benchmark of Agentic App Generation Systems
NLP

This paper introduces a human-centered benchmark for evaluating agentic app generation systems, comparing platforms like Replit, Bolt, an...

arXiv - AI · 4 min ·
[2512.15052] SGM: Safety Glasses for Multimodal Large Language Models via Neuron-Level Detoxification
LLMs

The paper presents SGM, a novel approach for detoxifying multimodal large language models (MLLMs) by recalibrating toxic neurons, signifi...

arXiv - AI · 4 min ·
[2511.02083] Watermarking Discrete Diffusion Language Models
LLMs

This article presents a novel watermarking technique for discrete diffusion language models (DDLMs), addressing the need for reliable det...

arXiv - AI · 3 min ·
[2510.22747] Low-Resource Dialect Adaptation of Large Language Models: A French Dialect Case-Study
LLMs

This article explores the adaptation of large language models (LLMs) for low-resource dialects, focusing on the Québec French dialect usi...

arXiv - AI · 4 min ·
[2509.19852] Eliminating Stability Hallucinations in LLM-Based TTS Models via Attention Guidance
LLMs

This paper addresses stability hallucinations in LLM-based TTS models by enhancing attention mechanisms, proposing a new alignment metric...

arXiv - AI · 3 min ·