Generative AI

Image, video, audio, and text generation

Top This Week

Machine Learning

AI video generation seems fundamentally more expensive than text, not just less optimized

There’s been a lot of discussion recently about how expensive AI video generation is compared to text, and it feels like this is more tha...

Reddit - Artificial Intelligence · 1 min ·
Accelerating science with AI and simulations
Machine Learning

MIT Professor Rafael Gómez-Bombarelli discusses the transformative potential of AI in scientific research, emphasizing its role in materi...

AI News - General · 10 min ·
[2603.10202] Hybrid Hidden Markov Model for Modeling Equity Excess Growth Rate Dynamics: A Discrete-State Approach with Jump-Diffusion
Machine Learning

Abstract page for arXiv paper 2603.10202: Hybrid Hidden Markov Model for Modeling Equity Excess Growth Rate Dynamics: A Discrete-State Ap...

arXiv - Machine Learning · 4 min ·

All Content

[2509.26209] Diversity-Incentivized Exploration for Versatile Reasoning
LLMs

The paper presents DIVER, a framework for enhancing reasoning in Large Language Models through diversity-incentivized exploration, addres...

arXiv - AI · 4 min ·
[2509.21896] GenesisGeo: Technical Report
LLMs

GenesisGeo presents a novel approach to geometric reasoning by introducing a large-scale dataset and a multi-task training paradigm that ...

arXiv - AI · 3 min ·
[2506.08604] Physics vs Distributions: Pareto Optimal Flow Matching with Physics Constraints
Machine Learning

This article presents a novel method, Physics-Based Flow Matching (PBFM), which integrates physical constraints into generative modeling,...

arXiv - AI · 4 min ·
[2506.08119] SOP-Bench: Complex Industrial SOPs for Evaluating LLM Agents
LLMs

SOP-Bench introduces a benchmark for evaluating LLM agents on complex industrial SOPs, featuring over 2,000 tasks across various domains,...

arXiv - AI · 4 min ·
[2506.00486] It Takes a Good Model to Train a Good Model: Generalized Gaussian Priors for Optimized LLMs
LLMs

This paper presents a novel optimization framework for large language models (LLMs) based on generalized Gaussian distributions, enhancin...

arXiv - AI · 4 min ·
[2505.24183] QiMeng-CodeV-R1: Reasoning-Enhanced Verilog Generation
LLMs

The paper introduces QiMeng-CodeV-R1, a framework for reasoning-enhanced Verilog generation using reinforcement learning with verifiable ...

arXiv - Machine Learning · 4 min ·
[2504.05806] Meta-Continual Learning of Neural Fields
Machine Learning

The paper introduces Meta-Continual Learning of Neural Fields (MCL-NF), a novel approach that enhances the efficiency and quality of neur...

arXiv - AI · 3 min ·
[2505.17517] The Spacetime of Diffusion Models: An Information Geometry Perspective
Machine Learning

This paper presents a novel geometric perspective on diffusion models, revealing flaws in traditional decoding methods and proposing a ne...

arXiv - Machine Learning · 4 min ·
[2502.13069] Ambig-SWE: Interactive Agents to Overcome Underspecificity in Software Engineering
LLMs

The paper introduces Ambig-SWE, a framework for evaluating AI agents' ability to handle underspecified instructions in software engineeri...

arXiv - AI · 4 min ·
[2505.08783] CodePDE: An Inference Framework for LLM-driven PDE Solver Generation
LLMs

The article presents CodePDE, an innovative framework leveraging large language models (LLMs) for generating solvers for partial differen...

arXiv - AI · 4 min ·
[2305.11098] A Simple Generative Model of Logical Reasoning and Statistical Learning
Machine Learning

This paper presents a Bayesian model that unifies logical reasoning and statistical learning, proposing a framework for human-like machin...

arXiv - AI · 4 min ·
[2502.11034] AdaGC: Improving Training Stability for Large Language Model Pretraining
LLMs

The paper presents AdaGC, a novel adaptive gradient clipping method aimed at enhancing training stability in large language model pretrai...

arXiv - Machine Learning · 4 min ·
[2503.11842] Test-Time Training Provably Improves Transformers as In-context Learners
LLMs

The paper explores how Test-Time Training (TTT) enhances transformer models as in-context learners, demonstrating significant efficiency ...

arXiv - Machine Learning · 4 min ·
[2602.20130] To Reason or Not to: Selective Chain-of-Thought in Medical Question Answering
LLMs

The paper presents Selective Chain-of-Thought (Selective CoT), a method to enhance medical question answering efficiency using large lang...

arXiv - AI · 4 min ·
[2502.05795] The Curse of Depth in Large Language Models
LLMs

This paper introduces the 'Curse of Depth' in Large Language Models (LLMs), revealing that many deep layers are ineffective due to Pre-La...

arXiv - AI · 4 min ·
[2602.20119] NovaPlan: Zero-Shot Long-Horizon Manipulation via Closed-Loop Video Language Planning
LLMs

NovaPlan introduces a framework for zero-shot long-horizon manipulation in robotics, integrating video language planning with geometrical...

arXiv - AI · 4 min ·
[2602.20113] StyleStream: Real-Time Zero-Shot Voice Style Conversion
Machine Learning

StyleStream introduces a novel real-time zero-shot voice style conversion system that enhances voice synthesis by disentangling linguisti...

arXiv - AI · 3 min ·
[2502.03771] vCache: Verified Semantic Prompt Caching
LLMs

The paper presents vCache, a verified semantic prompt caching system that enhances LLM inference efficiency by dynamically adjusting simi...

arXiv - Machine Learning · 4 min ·
[2602.20065] Multilingual Large Language Models do not comprehend all natural languages to equal degrees
LLMs

This article examines the performance of multilingual large language models (LLMs) across various languages, revealing that comprehension...

arXiv - AI · 4 min ·
[2602.20064] The LLMbda Calculus: AI Agents, Conversations, and Information Flow
LLMs

The LLMbda Calculus introduces a formal framework for understanding AI agents' conversations, addressing vulnerabilities like prompt inje...

arXiv - AI · 4 min ·
