AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

Is the Mirage Effect a bug, or is it Geometric Reconstruction in action? A framework for why VLMs perform better "hallucinating" than guessing, and what that may tell us about what's really inside these models

Last week, a team from Stanford and UCSF (Asadi, O'Sullivan, Fei-Fei Li, Euan Ashley et al.) dropped two companion papers. The first, MAR...

Reddit - Artificial Intelligence · 1 min · 36 minutes ago

LLMs

What I learned about multi-agent coordination running 9 specialized Claude agents

I've been experimenting with multi-agent AI systems and ended up building something more ambitious than I originally planned: a fully ope...

Reddit - Artificial Intelligence · 1 min · about 10 hours ago

Robotics

What happens when AI agents can earn and spend real money? I built a small test to find out

I've been sitting with a question for a while: what happens when AI agents aren't just tools to be used, but participants in an economy? ...

Reddit - Artificial Intelligence · 1 min · about 12 hours ago

All Content

Machine Learning

[2602.21340] HiPPO Zoo: Explicit Memory Mechanisms for Interpretable State Space Models

The paper introduces the HiPPO Zoo, a framework enhancing state space models with explicit memory mechanisms for improved interpretabilit...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21328] Efficient Opportunistic Approachability

This paper presents an efficient algorithm for opportunistic approachability, improving upon previous methods by achieving faster approac...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.21320] Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data

The paper presents Tool-R0, a framework for training self-evolving LLM agents capable of tool-learning without prior data, showcasing sig...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.21319] Uncertainty-Aware Diffusion Model for Multimodal Highway Trajectory Prediction via DDIM Sampling

The paper presents cVMDx, an advanced diffusion model for multimodal highway trajectory prediction, enhancing efficiency and accuracy in ...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.21297] Robust AI Evaluation through Maximal Lotteries

The paper proposes a new method for evaluating AI models using maximal lotteries, addressing limitations of traditional pairwise compariso...

arXiv - Machine Learning · 3 min · about 1 month ago

AI Safety

[2602.05066] Bypassing AI Control Protocols via Agent-as-a-Proxy Attacks

The paper discusses vulnerabilities in AI control protocols, specifically how Agent-as-a-Proxy attacks can bypass existing monitoring def...

arXiv - AI · 3 min · about 1 month ago

NLP

[2602.02007] Beyond RAG for Agent Memory: Retrieval by Decoupling and Aggregation

The paper introduces xMemory, a novel approach to agent memory systems that enhances retrieval by decoupling and aggregating semantic com...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.00462] LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs

The paper introduces LatentLens, a method for mapping visual tokens to natural language descriptions in Vision-Language Models (VLMs), en...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.00012] OGD4All: A Framework for Accessible Interaction with Geospatial Open Government Data Based on Large Language Models

The OGD4All framework enhances citizen interaction with geospatial Open Government Data using Large Language Models, achieving high accur...

arXiv - Machine Learning · 3 min · about 1 month ago

AI Agents

[2601.15715] RebuttalAgent: Strategic Persuasion in Academic Rebuttal via Theory of Mind

The paper presents RebuttalAgent, a framework using Theory of Mind for strategic persuasion in academic rebuttals, addressing the complex...

arXiv - AI · 4 min · about 1 month ago

Computer Vision

[2601.08026] FigEx2: Visual-Conditioned Panel Detection and Captioning for Scientific Compound Figures

The paper presents FigEx2, a framework for detecting and captioning panels in scientific compound figures, enhancing understanding and ac...

arXiv - AI · 4 min · about 1 month ago

AI Safety

[2512.17989] The Subject of Emergent Misalignment in Superintelligence: An Anthropological, Cognitive Neuropsychological, Machine-Learning, and Ontological Perspective

This article explores the gaps in understanding superintelligence misalignment, emphasizing the absence of the human subject and the impl...

arXiv - AI · 4 min · about 1 month ago

NLP

[2512.08639] Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning

This article presents a unified framework for Aerial Vision-Language Navigation (VLN), enabling UAVs to interpret natural language and na...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2512.09069] KD-OCT: Efficient Knowledge Distillation for Clinical-Grade Retinal OCT Classification

The paper presents KD-OCT, a novel knowledge distillation framework that enhances the efficiency of deep learning models for classifying ...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2511.20718] Stabilizing Off-Policy Training for Long-Horizon LLM Agent via Turn-Level Importance Sampling and Clipping-Triggered Normalization

This article presents SORL, a novel approach to stabilize off-policy training for long-horizon LLM agents, addressing issues of instabili...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2511.01734] A Proof of Learning Rate Transfer under $μ$P

This paper presents a proof of learning rate transfer in linear multi-layer perceptrons (MLPs) using a new parameterization method called...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2511.00062] World Simulation with Video Foundation Models for Physical AI

The paper presents Cosmos-Predict2.5, an advanced model for world simulation in Physical AI, integrating various generation methods and i...

arXiv - Machine Learning · 5 min · about 1 month ago

Machine Learning

[2510.18060] SPACeR: Self-Play Anchoring with Centralized Reference Models

The paper introduces SPACeR, a framework for enhancing autonomous vehicle behavior through self-play reinforcement learning anchored by a...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2510.10472] FML-bench: Benchmarking Machine Learning Agents for Scientific Research

The paper introduces FML-bench, a new benchmark for evaluating machine learning agents in scientific research, focusing on exploration di...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2510.05077] SLM-MUX: Orchestrating Small Language Models for Reasoning

The paper presents SLM-MUX, a novel architecture for orchestrating small language models (SLMs) to improve reasoning accuracy, achieving ...

arXiv - AI · 4 min · about 1 month ago
