Generative AI

Image, video, audio, and text generation

Top This Week

[2512.23994] PhyAVBench: A Challenging Audio Physics-Sensitivity Benchmark for Physically Grounded Text-to-Audio-Video Generation
Machine Learning

arXiv - AI · 4 min
[2512.10785] Developing and Evaluating a Large Language Model-Based Automated Feedback System Grounded in Evidence-Centered Design for Supporting Physics Problem Solving
LLMs

arXiv - AI · 4 min
[2510.13870] Unlocking the Potential of Diffusion Language Models through Template Infilling
LLMs

arXiv - AI · 3 min

All Content

[2509.14832] Diffusion-Based Scenario Tree Generation for Multivariate Time Series Prediction and Multistage Stochastic Optimization
Machine Learning

The paper presents a Diffusion Scenario Tree (DST) framework for multivariate time series prediction and multistage stochastic optimizati...

arXiv - Machine Learning · 4 min
[2508.12685] ToolACE-MT: Non-Autoregressive Generation for Agentic Multi-Turn Interaction
LLMs

ToolACE-MT introduces a non-autoregressive framework for generating high-quality multi-turn dialogues in agentic interactions, enhancing ...

arXiv - Machine Learning · 3 min
[2508.05004] R-Zero: Self-Evolving Reasoning LLM from Zero Data
LLMs

The article presents R-Zero, a self-evolving reasoning LLM that autonomously generates training data, improving AI capabilities without h...

arXiv - Machine Learning · 4 min
[2503.22968] Redefining Evaluation Standards: A Unified Framework for Evaluating the Korean Capabilities of Language Models
LLMs

This article introduces the Haerae Evaluation Toolkit (HRET), a unified framework for evaluating the capabilities of Korean language mode...

arXiv - AI · 4 min
[2504.20101] PlanetServe: A Decentralized, Scalable, and Privacy-Preserving Overlay for Democratizing Large Language Model Serving
LLMs

The paper presents PlanetServe, a decentralized overlay for scalable and privacy-preserving serving of large language models (LLMs), addr...

arXiv - AI · 4 min
[2404.08567] CATP: Cross-Attention Token Pruning for Accuracy Preserved Multimodal Model Inference
Machine Learning

The paper introduces Cross-Attention Token Pruning (CATP), a method designed to enhance the accuracy of multimodal models by effectively ...

arXiv - AI · 3 min
[2602.11908] When Should LLMs Be Less Specific? Selective Abstraction for Reliable Long-Form Text Generation
LLMs

This paper introduces Selective Abstraction (SA), a framework for improving the reliability of long-form text generated by LLMs by select...

arXiv - Machine Learning · 4 min
[2602.11807] PuYun-LDM: A Latent Diffusion Model for High-Resolution Ensemble Weather Forecasts
LLMs

The paper presents PuYun-LDM, a novel latent diffusion model designed to enhance high-resolution ensemble weather forecasts, addressing c...

arXiv - AI · 4 min
[2512.12182] TA-KAND: Two-stage Attention Triple Enhancement and U-KAN based Diffusion For Few-shot Knowledge Graph Completion
Generative AI

The paper presents TA-KAND, a novel framework for few-shot knowledge graph completion that employs a two-stage attention mechanism and U-...

arXiv - Machine Learning · 3 min
[2510.19698] RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models
LLMs

The paper presents RLIE, a framework that integrates large language models (LLMs) with probabilistic rule learning to enhance rule genera...

arXiv - AI · 4 min
[2509.11079] Difficulty-Aware Agentic Orchestration for Query-Specific Multi-Agent Workflows
LLMs

The paper presents Difficulty-Aware Agentic Orchestration (DAAO), a novel framework for optimizing multi-agent workflows based on query d...

arXiv - AI · 3 min
[2508.11850] EvoCut: Strengthening Integer Programs via Evolution-Guided Language Models
LLMs

EvoCut automates the generation of acceleration cuts for integer programming, significantly improving solver performance by leveraging ev...

arXiv - AI · 4 min
[2507.04103] How to Train Your LLM Web Agent: A Statistical Diagnosis
LLMs

This article presents a statistical approach to training LLM-based web agents, addressing challenges in multi-step interactions and compu...

arXiv - Machine Learning · 4 min
[2505.23381] AutoGPS: Automated Geometry Problem Solving via Multimodal Formalization and Deductive Reasoning
Machine Learning

AutoGPS introduces a neuro-symbolic framework for solving geometry problems, enhancing reliability and interpretability through multimoda...

arXiv - AI · 3 min
[2505.14381] SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation
LLMs

The paper presents SCAN, a novel approach for Semantic Document Layout Analysis that enhances Retrieval-Augmented Generation (RAG) system...

arXiv - AI · 4 min
[2412.16543] Mathematics and Machine Creativity: A Survey on Bridging Mathematics with AI
LLMs

This paper surveys the intersection of mathematics and AI, highlighting how AI can enhance mathematical research and the need for better ...

arXiv - AI · 4 min
[2602.13071] Bus-Conditioned Zero-Shot Trajectory Generation via Task Arithmetic
Machine Learning

This paper introduces MobTA, a novel approach for generating mobility trajectories without requiring real data from the target city, usin...

arXiv - Machine Learning · 4 min
[2602.13061] Diverging Flows: Detecting Extrapolations in Conditional Generation
Machine Learning

The paper introduces Diverging Flows, a method for detecting extrapolations in conditional generation models, enhancing safety in applica...

arXiv - Machine Learning · 3 min
[2602.13055] Curriculum-DPO++: Direct Preference Optimization via Data and Model Curricula for Text-to-Image Generation
Machine Learning

The paper presents Curriculum-DPO++, an advanced method for text-to-image generation that optimizes preference learning through a dual cu...

arXiv - Machine Learning · 4 min
[2602.13033] Buy versus Build an LLM: A Decision Framework for Governments
LLMs

This paper presents a strategic framework for governments to decide between buying or building large language models (LLMs) for public se...

arXiv - AI · 4 min