AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

Claude code x n8n

Hi everyone, I’ve been exploring MCP and integrating tools like n8n with Claude Code, and I’m trying to understand how practical this rea...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

AI Agents

Cloudflare just turned Browser Rendering into a lot more powerful MCP infrastructure

Browser Rendering now exposes the Chrome DevTools Protocol, which means MCP clients can access a remote browser directly. That’s a pretty...

Reddit - Artificial Intelligence · 1 min · about 10 hours ago

LLMs

Anthropic launches Claude Managed Agents — composable APIs for shipping production AI agents 10x faster. Notion, Rakuten, Asana, and Sentry already in production.

Anthropic launches Claude Managed Agents in public beta: composable APIs for shipping production AI agents 10x faster. Handles sandboxing...

Reddit - Artificial Intelligence · 1 min · about 17 hours ago

All Content

Machine Learning

[2602.14795] Return of the Schema: Building Complete Datasets for Machine Learning and Reasoning on Knowledge Graphs

This paper presents a novel resource for building complete datasets that integrate schema and ground facts for machine learning and reaso...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.14740] AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises

The paper explores how advanced AI models exhibit complex reasoning in simulated nuclear crises, revealing insights into strategic decisi...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.14050] Position Encoding with Random Float Sampling Enhances Length Generalization of Transformers

This paper introduces a novel position encoding strategy, Random Float Sampling (RFS), which enhances the length generalization capabilit...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.14721] WebWorld: A Large-Scale World Model for Web Agent Training

WebWorld introduces a large-scale simulator for training web agents, utilizing over 1 million open-web interactions to enhance generaliza...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.14697] Evolutionary System Prompt Learning can Facilitate Reinforcement Learning for LLMs

The paper proposes Evolutionary System Prompt Learning (E-SPL) to enhance reinforcement learning in large language models (LLMs) by evolv...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.14017] S2SServiceBench: A Multimodal Benchmark for Last-Mile S2S Climate Services

The paper presents S2SServiceBench, a multimodal benchmark designed to enhance the effectiveness of last-mile subseasonal-to-seasonal (S2...

arXiv - Machine Learning · 4 min · about 2 months ago

Robotics

[2602.14691] Removing Planner Bias in Goal Recognition Through Multi-Plan Dataset Generation

This paper presents a method to eliminate planner bias in goal recognition using multi-plan dataset generation, enhancing the evaluation ...

arXiv - AI · 3 min · about 2 months ago

AI Agents

[2602.14674] From User Preferences to Base Score Extraction Functions in Gradual Argumentation

This paper introduces Base Score Extraction Functions in gradual argumentation, enhancing decision-making and AI transparency by mapping ...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.14643] Arbor: A Framework for Reliable Navigation of Critical Conversation Flows

The paper presents Arbor, a framework designed to enhance the navigation of critical conversation flows in high-stakes environments like ...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.13958] Chemical Language Models for Natural Products: A State-Space Model Approach

This article presents a novel approach to chemical language models specifically for natural products, showcasing the effectiveness of sta...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.13953] QuRL: Efficient Reinforcement Learning with Quantized Rollout

The paper introduces Quantized Reinforcement Learning (QuRL), a method aimed at improving the efficiency of reinforcement learning in lar...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2602.13949] Experiential Reinforcement Learning

The paper introduces Experiential Reinforcement Learning (ERL), a new paradigm that enhances learning efficiency in language models by in...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.14589] MATEO: A Multimodal Benchmark for Temporal Reasoning and Planning in LVLMs

MATEO introduces a benchmark for assessing temporal reasoning in Large Vision Language Models (LVLMs), focusing on multimodal inputs and ...

arXiv - Machine Learning · 3 min · about 2 months ago

LLMs

[2602.13937] A Multi-Agent Framework for Code-Guided, Modular, and Verifiable Automated Machine Learning

The paper presents iML, a multi-agent framework for automated machine learning that enhances transparency and modularity, addressing limi...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.14457] Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report v1.5

This technical report presents a comprehensive risk analysis framework for frontier AI, focusing on emerging threats and mitigation strat...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.14404] Boule or Baguette? A Study on Task Topology, Length Generalization, and the Benefit of Reasoning Traces

This study explores the efficacy of reasoning traces in neural networks, introducing a large dataset to assess how well models generalize...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.13813] Pawsterior: Variational Flow Matching for Structured Simulation-Based Inference

Pawsterior introduces a variational flow-matching framework to enhance simulation-based inference (SBI), addressing constraints in struct...

arXiv - Machine Learning · 3 min · about 2 months ago

Machine Learning

[2602.14296] AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines

The paper presents AutoWebWorld, a framework that synthesizes verifiable web environments using Finite State Machines, enhancing the trai...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.13810] Mean Flow Policy with Instantaneous Velocity Constraint for One-step Action Generation

The paper introduces the Mean Velocity Policy (MVP) for reinforcement learning, which enhances one-step action generation by modeling the...

arXiv - AI · 3 min · about 2 months ago

NLP

[2602.14252] GRAIL: Goal Recognition Alignment through Imitation Learning

The paper introduces GRAIL, a method for recognizing agent goals through imitation learning, enhancing goal recognition accuracy in AI sy...

arXiv - Machine Learning · 3 min · about 2 months ago
