AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

AI Agents

[P] Easily provide Wandb logs as context to agents for analysis and planning.

It is frustrating to use the Wandb CLI and MCP tools with my agents. For one, the MCP tool basically floods the context window and freque...

Reddit - Machine Learning · 1 min ·
AI Agents

DeepMind's 'AI Agent Traps' Paper Maps How Hackers Could Weaponize AI Agents Against Users

AI Tools & Products · 7 min ·
LLMs

Claude, OpenClaw and the new reality: AI agents are here — and so is the chaos

AI Tools & Products ·

All Content

LLMs

[2602.17709] UBio-MolFM: A Universal Molecular Foundation Model for Bio-Systems

UBio-MolFM presents a universal molecular foundation model designed to enhance all-atom molecular simulations, bridging the gap between q...

arXiv - AI · 4 min ·
Machine Learning

[2602.18141] Advection-Diffusion on Graphs: A Bakry-Emery Laplacian for Spectral Graph Neural Networks

The paper introduces a Bakry-Emery Laplacian for Graph Neural Networks (GNNs), enhancing information propagation without altering graph s...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.18117] Flow Matching with Injected Noise for Offline-to-Online Reinforcement Learning

The paper presents Flow Matching with Injected Noise (FINO), a novel method enhancing offline-to-online reinforcement learning by improvi...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.18109] TempoNet: Slack-Quantized Transformer-Guided Reinforcement Scheduler for Adaptive Deadline-Centric Real-Time Dispatchs

TempoNet introduces a novel reinforcement learning scheduler that utilizes a transformer architecture for efficient real-time task dispat...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.18060] Deepmechanics

The paper 'Deepmechanics' benchmarks physics-informed deep learning models for dynamical systems, revealing challenges in stability for c...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.17675] Mind the Boundary: Stabilizing Gemini Enterprise A2A via a Cloud Run Hub Across Projects and Accounts

The article discusses the implementation of a Cloud Run Hub for stabilizing Gemini Enterprise A2A interactions across multiple projects a...

arXiv - AI · 4 min ·
LLMs

[2602.18037] Gradient Regularization Prevents Reward Hacking in Reinforcement Learning from Human Feedback and Verifiable Rewards

This paper presents a novel approach to prevent reward hacking in reinforcement learning by using gradient regularization, enhancing the ...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.18015] Flow Actor-Critic for Offline Reinforcement Learning

The paper introduces Flow Actor-Critic, a novel method for offline reinforcement learning that utilizes flow policies to manage complex, ...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.18008] NIMMGen: Learning Neural-Integrated Mechanistic Digital Twins with LLMs

The paper introduces NIMMGen, a framework for learning neural-integrated mechanistic models using large language models (LLMs), addressin...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.07152] Trojans in Artificial Intelligence (TrojAI) Final Report

The Trojans in Artificial Intelligence (TrojAI) Final Report outlines the findings of a multi-year initiative aimed at addressing vulnera...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.18291] Diffusing to Coordinate: Efficient Online Multi-Agent Diffusion Policies

This paper introduces OMAD, an innovative Online Multi-Agent Reinforcement Learning framework utilizing diffusion policies to enhance coo...

arXiv - AI · 3 min ·
Machine Learning

[2602.17998] PHAST: Port-Hamiltonian Architecture for Structured Temporal Dynamics Forecasting

The paper presents PHAST, a Port-Hamiltonian architecture designed for forecasting dynamics in physical systems using only position data,...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.18095] Neurosymbolic Language Reasoning as Satisfiability Modulo Theory

This article presents Logitext, a neurosymbolic language that enhances natural language understanding by integrating large language model...

arXiv - AI · 3 min ·
Machine Learning

[2602.17993] Turbo Connection: Reasoning as Information Flow from Higher to Lower Layers

The paper introduces Turbo Connection, a novel architecture that enhances reasoning in Transformers by allowing multiple residual connect...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.18025] Cross-Embodiment Offline Reinforcement Learning for Heterogeneous Robot Datasets

This article presents a novel approach to offline reinforcement learning by integrating cross-embodiment learning to enhance robot policy...

arXiv - AI · 3 min ·
Machine Learning

[2602.17978] Learning Optimal and Sample-Efficient Decision Policies with Guarantees

This paper presents a novel approach to learning optimal and sample-efficient decision policies in reinforcement learning, addressing cha...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.17990] WorkflowPerturb: Calibrated Stress Tests for Evaluating Multi-Agent Workflow Metrics

The paper introduces WorkflowPerturb, a benchmark for evaluating multi-agent workflow metrics through calibrated stress tests, addressing...

arXiv - AI · 3 min ·
Machine Learning

[2602.17985] Learning Without Training

This paper explores innovative methods in machine learning, addressing supervised learning, transfer learning, and classification through...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.17976] In-Context Learning for Pure Exploration in Continuous Spaces

The paper presents C-ICPE-TS, a novel algorithm for pure exploration in continuous spaces, enhancing adaptive learning strategies in mach...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.17910] Alignment in Time: Peak-Aware Orchestration for Long-Horizon Agentic Systems

This paper presents APEMO, a novel runtime scheduling layer designed to enhance the reliability of long-horizon agentic systems by optimi...

arXiv - AI · 3 min ·