AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

LLMs

OpenClaw security checklist: practical safeguards for AI agents

Here is one of the better-quality guides on ensuring safety when deploying OpenClaw: https://chatgptguide.ai/openclaw-security-checkl...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

Machine Learning

Auto agent - Self improving domain expertise agent

Someone open-sourced an AI agent that autonomously upgraded itself to #1 across multiple domains in under 24 hours, then open-sourced the e...

Reddit - Artificial Intelligence · 1 min · about 10 hours ago

NLP

Walmart CEO reportedly brags that company's in-app AI agent is making people spend 35% more money

AI Tools & Products · 4 min · about 15 hours ago

All Content

LLMs

[2602.19594] ISO-Bench: Can Coding Agents Optimize Real-World Inference Workloads?

ISO-Bench introduces a benchmark for coding agents to optimize real-world inference workloads, evaluating their performance against exper...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.19582] Advantage-based Temporal Attack in Reinforcement Learning

This article presents the Advantage-based Adversarial Transformer (AAT), a novel method for generating time-correlated adversarial exampl...

arXiv - Machine Learning · 4 min · about 1 month ago

AI Agents

[2602.18571] Debug2Fix: Supercharging Coding Agents with Interactive Debugging Capabilities

The paper introduces Debug2Fix, a framework enhancing coding agents with interactive debugging capabilities, improving bug-fixing perform...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.19552] The Sample Complexity of Replicable Realizable PAC Learning

This paper explores the sample complexity of replicable realizable PAC learning, establishing a lower bound on sample complexity with nov...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.18551] From Static Spectra to Operando Infrared Dynamics: Physics Informed Flow Modeling and a Benchmark

This paper presents a novel approach to predicting operando infrared dynamics in lithium-ion batteries using a physics-informed flow mode...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.19533] Grokking Finite-Dimensional Algebra

This paper explores the grokking phenomenon in neural networks, focusing on learning multiplication in finite-dimensional algebras, exten...

arXiv - AI · 4 min · about 1 month ago

Data Science

[2602.18548] 1D-Bench: A Benchmark for Iterative UI Code Generation with Visual Feedback in Real-World

The paper introduces 1D-Bench, a benchmark for evaluating iterative UI code generation with visual feedback, aimed at improving design-to...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.18540] Rodent-Bench

Rodent-Bench introduces a benchmark for evaluating Multimodal Large Language Models (MLLMs) in annotating rodent behavior videos, reveali...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.19455] SenTSR-Bench: Thinking with Injected Knowledge for Time-Series Reasoning

The paper introduces SenTSR-Bench, a framework that enhances time-series reasoning by integrating insights from specialized time-series l...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.18527] JAEGER: Joint 3D Audio-Visual Grounding and Reasoning in Simulated Physical Environments

The paper presents JAEGER, a framework for joint 3D audio-visual grounding and reasoning, addressing limitations of existing 2D models by...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.19419] RAmmStein: Regime Adaptation in Mean-reverting Markets with Stein Thresholds -- Optimal Impulse Control in Concentrated AMMs

The paper presents RAmmStein, a Deep Reinforcement Learning approach for optimal liquidity management in decentralized exchanges, focusin...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.18520] Sketch2Feedback: Grammar-in-the-Loop Framework for Rubric-Aligned Feedback on Student STEM Diagrams

The paper presents Sketch2Feedback, a framework that enhances feedback on student-drawn STEM diagrams by integrating grammar rules to red...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.19414] Federated Causal Representation Learning in State-Space Systems for Decentralized Counterfactual Reasoning

This paper presents a federated framework for causal representation learning in state-space systems, enabling decentralized counterfactua...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.19406] LEVDA: Latent Ensemble Variational Data Assimilation via Differentiable Dynamics

The paper presents LEVDA, a novel ensemble-space variational smoother for geophysical forecasting that improves data assimilation by oper...

arXiv - Machine Learning · 3 min · about 1 month ago

Computer Vision

[2602.18504] A Computer Vision Framework for Multi-Class Detection and Tracking in Soccer Broadcast Footage

This paper presents a computer vision framework for detecting and tracking players and the ball in soccer broadcast footage using a singl...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.18497] PIPE-RDF: An LLM-Assisted Pipeline for Enterprise RDF Benchmarking

PIPE-RDF presents a novel pipeline for generating schema-specific NL-SPARQL benchmarks, enhancing RDF knowledge graph querying for enterp...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.19392] Spiking Graph Predictive Coding for Reliable OOD Generalization

The paper introduces Spiking Graph Predictive Coding (SIGHT), a novel approach to enhance out-of-distribution (OOD) generalization in gra...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.19373] Stable Deep Reinforcement Learning via Isotropic Gaussian Representations

This paper presents a method for enhancing stability in deep reinforcement learning by utilizing isotropic Gaussian representations, addr...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.19362] LLMs Can Learn to Reason Via Off-Policy RL

The paper presents a novel off-policy reinforcement learning algorithm, OAPL, for Large Language Models (LLMs) that enhances reasoning ca...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.18495] RDBLearn: Simple In-Context Prediction Over Relational Databases

RDBLearn introduces a novel approach for in-context learning (ICL) in relational databases, enabling efficient prediction tasks without e...

arXiv - Machine Learning · 3 min · about 1 month ago
