Generative AI

Image, video, audio, and text generation

Top This Week

Generative AI

Will Generative AI apps remain a revenue powerhouse in 2026?

AI Tools & Products · 1 min ·
[2601.08565] Rewriting Video: Text-Driven Reauthoring of Video Footage
Machine Learning

arXiv - AI · 3 min ·
[2512.18388] Exploration vs. Fixation: Scaffolding Divergent and Convergent Thinking for Human-AI Co-Creation with Generative Models
Machine Learning

arXiv - AI · 4 min ·

All Content

[2602.14464] CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer
Machine Learning

The paper presents CoCoDiff, a novel framework for fine-grained style transfer in images, emphasizing semantic correspondence and achievi...

arXiv - AI · 3 min ·
[2602.14433] Synthetic Reader Panels: Tournament-Based Ideation with LLM Personas for Autonomous Publishing
LLMs

The paper discusses a novel system for autonomous book ideation using synthetic reader panels composed of LLM personas to evaluate book c...

arXiv - AI · 4 min ·
[2602.14381] Adapting VACE for Real-Time Autoregressive Video Diffusion
Generative AI

This article presents an adaptation of VACE for real-time autoregressive video generation, enhancing video control while addressing laten...

arXiv - AI · 3 min ·
[2602.14374] Differentially Private Retrieval-Augmented Generation
LLMs

The paper presents DP-KSA, a novel algorithm that integrates differential privacy into retrieval-augmented generation (RAG) systems, addr...

arXiv - AI · 4 min ·
[2602.14270] A Rational Analysis of the Effects of Sycophantic AI
LLMs

This article analyzes the impact of sycophantic AI on human belief systems, revealing how overly agreeable AI can distort reality and inf...

arXiv - AI · 3 min ·
[2602.14237] AbracADDbra: Touch-Guided Object Addition by Decoupling Placement and Editing Subtasks
Machine Learning

The paper presents AbracADDbra, a framework that enhances object addition in computer vision by decoupling placement and editing tasks th...

arXiv - AI · 3 min ·
[2602.14211] SkillJect: Automating Stealthy Skill-Based Prompt Injection for Coding Agents with Trace-Driven Closed-Loop Refinement
AI Agents

The paper presents SkillJect, an automated framework for stealthy skill-based prompt injection in coding agents, addressing security vuln...

arXiv - AI · 4 min ·
[2602.14189] Knowing When Not to Answer: Abstention-Aware Scientific Reasoning
LLMs

The paper discusses an abstention-aware framework for scientific reasoning, emphasizing the importance of knowing when to abstain from an...

arXiv - AI · 4 min ·
[2602.14188] GPT-5 vs Other LLMs in Long Short-Context Performance
LLMs

This paper evaluates the performance of GPT-5 and other LLMs on long short-context tasks, revealing significant gaps between theoretical ...

arXiv - AI · 4 min ·
[2602.14178] UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model
LLMs

The paper presents UniWeTok, a unified binary tokenizer with a massive codebook size of 2^128, designed to enhance multimodal large langu...

arXiv - AI · 4 min ·
[2602.14157] When Test-Time Guidance Is Enough: Fast Image and Video Editing with Diffusion Guidance
Machine Learning

The paper explores a novel approach to image and video editing using test-time guidance with diffusion models, achieving performance comp...

arXiv - AI · 3 min ·
[2602.14158] A Multi-Agent Framework for Medical AI: Leveraging Fine-Tuned GPT, LLaMA, and DeepSeek R1 for Evidence-Based and Bias-Aware Clinical Query Processing
LLMs

This article presents a multi-agent framework for medical AI that enhances clinical query processing by leveraging fine-tuned language mo...

arXiv - AI · 4 min ·
[2602.13942] A Theoretical Framework for LLM Fine-tuning Using Early Stopping for Non-random Initialization
LLMs

This article presents a theoretical framework for fine-tuning large language models (LLMs) using early stopping and non-random initializa...

arXiv - Machine Learning · 3 min ·
[2602.14106] Anticipating Adversary Behavior in DevSecOps Scenarios through Large Language Models
LLMs

This paper explores the integration of Large Language Models (LLMs) in anticipating adversary behavior within DevSecOps environments, pro...

arXiv - AI · 4 min ·
[2602.14080] Empty Shelves or Lost Keys? Recall Is the Bottleneck for Parametric Factuality
LLMs

The paper explores the limitations of factuality evaluations in large language models (LLMs), identifying recall as a key bottleneck in a...

arXiv - AI · 4 min ·
[2602.14043] Beyond Static Snapshots: Dynamic Modeling and Forecasting of Group-Level Value Evolution with Large Language Models
LLMs

This article presents a novel framework for dynamic modeling and forecasting of group-level value evolution using large language models (...

arXiv - AI · 4 min ·
[2602.14041] BitDance: Scaling Autoregressive Generative Models with Binary Tokens
Machine Learning

BitDance introduces a novel autoregressive image generator that utilizes binary tokens for enhanced efficiency and performance in generat...

arXiv - AI · 4 min ·
[2602.13818] VAR-3D: View-aware Auto-Regressive Model for Text-to-3D Generation via a 3D Tokenizer
Machine Learning

The VAR-3D model introduces a novel approach to text-to-3D generation, addressing challenges in discrete 3D representation and enhancing ...

arXiv - Machine Learning · 3 min ·
[2602.13954] Eureka-Audio: Triggering Audio Intelligence in Compact Language Models
LLMs

Eureka-Audio presents a compact audio language model that outperforms larger models in various audio understanding tasks, showcasing effi...

arXiv - AI · 4 min ·
[2602.13543] LiveNewsBench: Evaluating LLM Web Search Capabilities with Freshly Curated News
LLMs

The paper introduces LiveNewsBench, a benchmark for evaluating the web search capabilities of Large Language Models (LLMs) using freshly ...

arXiv - Machine Learning · 4 min ·