AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

OpenClaw security checklist: practical safeguards for AI agents

Here is one of the better guides on ensuring safety when deploying OpenClaw: https://chatgptguide.ai/openclaw-security-checkl...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Auto agent - Self improving domain expertise agent

Someone open-sourced an AI agent that autonomously upgraded itself to #1 across multiple domains in < 24 hours… then open-sourced the e...

Reddit - Artificial Intelligence · 1 min ·
NLP

Walmart CEO reportedly brags that company's in-app AI agent is making people spend 35% more money

AI Tools & Products · 4 min ·

All Content

LLMs

[2602.19594] ISO-Bench: Can Coding Agents Optimize Real-World Inference Workloads?

ISO-Bench introduces a benchmark for coding agents to optimize real-world inference workloads, evaluating their performance against exper...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.19582] Advantage-based Temporal Attack in Reinforcement Learning

This article presents the Advantage-based Adversarial Transformer (AAT), a novel method for generating time-correlated adversarial exampl...

arXiv - Machine Learning · 4 min ·
AI Agents

[2602.18571] Debug2Fix: Supercharging Coding Agents with Interactive Debugging Capabilities

The paper introduces Debug2Fix, a framework enhancing coding agents with interactive debugging capabilities, improving bug-fixing perform...

arXiv - AI · 4 min ·
Machine Learning

[2602.19552] The Sample Complexity of Replicable Realizable PAC Learning

This paper explores the sample complexity of replicable realizable PAC learning, establishing a lower bound on sample complexity with nov...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.18551] From Static Spectra to Operando Infrared Dynamics: Physics Informed Flow Modeling and a Benchmark

This paper presents a novel approach to predicting operando infrared dynamics in lithium-ion batteries using a physics-informed flow mode...

arXiv - AI · 4 min ·
Machine Learning

[2602.19533] Grokking Finite-Dimensional Algebra

This paper explores the grokking phenomenon in neural networks, focusing on learning multiplication in finite-dimensional algebras, exten...

arXiv - AI · 4 min ·
Data Science

[2602.18548] 1D-Bench: A Benchmark for Iterative UI Code Generation with Visual Feedback in Real-World

The paper introduces 1D-Bench, a benchmark for evaluating iterative UI code generation with visual feedback, aimed at improving design-to...

arXiv - AI · 4 min ·
LLMs

[2602.18540] Rodent-Bench

Rodent-Bench introduces a benchmark for evaluating Multimodal Large Language Models (MLLMs) in annotating rodent behavior videos, reveali...

arXiv - AI · 3 min ·
LLMs

[2602.19455] SenTSR-Bench: Thinking with Injected Knowledge for Time-Series Reasoning

The paper introduces SenTSR-Bench, a framework that enhances time-series reasoning by integrating insights from specialized time-series l...

arXiv - AI · 4 min ·
LLMs

[2602.18527] JAEGER: Joint 3D Audio-Visual Grounding and Reasoning in Simulated Physical Environments

The paper presents JAEGER, a framework for joint 3D audio-visual grounding and reasoning, addressing limitations of existing 2D models by...

arXiv - AI · 4 min ·
Machine Learning

[2602.19419] RAmmStein: Regime Adaptation in Mean-reverting Markets with Stein Thresholds -- Optimal Impulse Control in Concentrated AMMs

The paper presents RAmmStein, a Deep Reinforcement Learning approach for optimal liquidity management in decentralized exchanges, focusin...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.18520] Sketch2Feedback: Grammar-in-the-Loop Framework for Rubric-Aligned Feedback on Student STEM Diagrams

The paper presents Sketch2Feedback, a framework that enhances feedback on student-drawn STEM diagrams by integrating grammar rules to red...

arXiv - AI · 4 min ·
Machine Learning

[2602.19414] Federated Causal Representation Learning in State-Space Systems for Decentralized Counterfactual Reasoning

This paper presents a federated framework for causal representation learning in state-space systems, enabling decentralized counterfactua...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.19406] LEVDA: Latent Ensemble Variational Data Assimilation via Differentiable Dynamics

The paper presents LEVDA, a novel ensemble-space variational smoother for geophysical forecasting that improves data assimilation by oper...

arXiv - Machine Learning · 3 min ·
Computer Vision

[2602.18504] A Computer Vision Framework for Multi-Class Detection and Tracking in Soccer Broadcast Footage

This paper presents a computer vision framework for detecting and tracking players and the ball in soccer broadcast footage using a singl...

arXiv - AI · 3 min ·
LLMs

[2602.18497] PIPE-RDF: An LLM-Assisted Pipeline for Enterprise RDF Benchmarking

PIPE-RDF presents a novel pipeline for generating schema-specific NL-SPARQL benchmarks, enhancing RDF knowledge graph querying for enterp...

arXiv - AI · 3 min ·
Machine Learning

[2602.19392] Spiking Graph Predictive Coding for Reliable OOD Generalization

The paper introduces Spiking Graph Predictive Coding (SIGHT), a novel approach to enhance out-of-distribution (OOD) generalization in gra...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.19373] Stable Deep Reinforcement Learning via Isotropic Gaussian Representations

This paper presents a method for enhancing stability in deep reinforcement learning by utilizing isotropic Gaussian representations, addr...

arXiv - AI · 3 min ·
LLMs

[2602.19362] LLMs Can Learn to Reason Via Off-Policy RL

The paper presents a novel off-policy reinforcement learning algorithm, OAPL, for Large Language Models (LLMs) that enhances reasoning ca...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.18495] RDBLearn: Simple In-Context Prediction Over Relational Databases

RDBLearn introduces a novel approach for in-context learning (ICL) in relational databases, enabling efficient prediction tasks without e...

arXiv - Machine Learning · 3 min ·