AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

Anthropic launches Claude Managed Agents — composable APIs for shipping production AI agents 10x faster. Notion, Rakuten, Asana, and Sentry already in production.

Anthropic launches Claude Managed Agents in public beta: composable APIs for shipping production AI agents 10x faster. Handles sandboxing...

Reddit - Artificial Intelligence · 1 min · about 4 hours ago

LLMs

persistent memory system for AI agents — single SQLite file, no external server, no API keys. free and opensource - BrainCTL

Every agent I build forgets everything between sessions. I got tired of it and built brainctl. pip install brainctl, then: from agentmemo...

Reddit - Artificial Intelligence · 1 min · about 9 hours ago

Machine Learning

Why does Multi-Agent RL fail to act like a real society in Spatial Game Theory? [P] [R]

Hey everyone, I’m building a project for my university Machine Learning course called "Social network analysis using iterated game theory...

Reddit - Machine Learning · 1 min · about 17 hours ago

All Content

LLMs

[2602.14968] PhyScensis: Physics-Augmented LLM Agents for Complex Physical Scene Arrangement

The paper introduces PhyScensis, a framework that uses physics-augmented LLM agents to generate complex 3D physical scenes for robotic ma...

arXiv - AI · 4 min · about 2 months ago

Generative AI

[2602.14941] AnchorWeave: World-Consistent Video Generation with Retrieved Local Spatial Memories

AnchorWeave introduces a novel framework for video generation that enhances spatial consistency over long durations by utilizing multiple...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.14917] BFS-PO: Best-First Search for Large Reasoning Models

The paper proposes BFS-PO, a new reinforcement learning algorithm that enhances the performance of Large Reasoning Models by reducing com...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2412.11439] Bayesian Flow Is All You Need to Sample Out-of-Distribution Chemical Spaces

The paper presents a Bayesian flow network, specifically the ChemBFN model, which effectively generates out-of-distribution chemical samp...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.14881] Numerical exploration of the range of shape functionals using neural networks

This article presents a novel numerical framework for exploring shape functionals using neural networks, focusing on Blaschke–Santaló dia...

arXiv - AI · 3 min · about 2 months ago

Data Science

[2602.14879] CT-Bench: A Benchmark for Multimodal Lesion Understanding in Computed Tomography

CT-Bench introduces a benchmark dataset for multimodal lesion understanding in CT scans, featuring 20,335 lesions and a visual question a...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.14834] Debiasing Central Fixation Confounds Reveals a Peripheral "Sweet Spot" for Human-like Scanpaths in Hard-Attention Vision

This paper explores the impact of central fixation bias on evaluating human-like scanpaths in vision models, proposing a new metric to im...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2410.03919] Online Posterior Sampling with a Diffusion Prior

The paper presents algorithms for online posterior sampling in contextual bandits using a diffusion model prior, enhancing the efficiency...

arXiv - Machine Learning · 3 min · about 2 months ago

Generative AI

[2602.14783] What hackers talk about when they talk about AI: Early-stage diffusion of a cybercrime innovation

This article explores how cybercriminals are discussing and utilizing artificial intelligence (AI) to enhance their operations, revealing...

arXiv - AI · 3 min · about 2 months ago

AI Startups

[2408.11438] Benchmarking AI-based data assimilation to advance data-driven global weather forecasting

This article presents DABench, a benchmark for evaluating AI-based data assimilation methods in global weather forecasting, demonstrating...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.14778] A Geometric Analysis of Small-sized Language Model Hallucinations

This paper explores hallucinations in small-sized language models through a geometric lens, demonstrating that genuine responses c...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2407.11907] GraphFM: A generalist graph transformer that learns transferable representations across diverse domains

GraphFM introduces a scalable graph transformer that learns transferable representations across diverse domains, enhancing generalization...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.14770] Multi-Agent Comedy Club: Investigating Community Discussion Effects on LLM Humor Generation

This study investigates how community discussions influence humor generation in large language models (LLMs), demonstrating that feedback...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2404.13895] Optimal Design for Human Preference Elicitation

The paper discusses optimal design strategies for eliciting human preferences, focusing on efficient methods for gathering high-quality f...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2312.02355] When is Offline Policy Selection Sample Efficient for Reinforcement Learning?

This paper explores the efficiency of offline policy selection (OPS) in reinforcement learning, connecting it to off-policy evaluation (O...

arXiv - AI · 4 min · about 2 months ago

Robotics

[2602.14726] ManeuverNet: A Soft Actor-Critic Framework for Precise Maneuvering of Double-Ackermann-Steering Robots with Optimized Reward Functions

ManeuverNet introduces a Soft Actor-Critic framework for enhancing the maneuverability of double-Ackermann-steering robots, addressing li...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.15012] Cold-Start Personalization via Training-Free Priors from Structured World Models

This paper presents Pep, a novel approach for cold-start personalization that utilizes structured world models to improve user preference...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.15006] Distributed Quantum Gaussian Processes for Multi-Agent Systems

This article presents a novel Distributed Quantum Gaussian Process (DQGP) method for multi-agent systems, enhancing modeling capabilities...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.14989] ThermEval: A Structured Benchmark for Evaluation of Vision-Language Models on Thermal Imagery

ThermEval introduces a benchmark for evaluating vision-language models on thermal imagery, highlighting their limitations in temperature-...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.14681] ST-EVO: Towards Generative Spatio-Temporal Evolution of Multi-Agent Communication Topologies

The paper presents ST-EVO, a novel framework for generative spatio-temporal evolution of multi-agent communication topologies, enhancing ...

arXiv - AI · 3 min · about 2 months ago
