AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

AI Agents

OpenClaw gives users yet another reason to be freaked out about security - Ars Technica

The viral AI agentic tool let attackers silently gain unauthenticated admin access.

Ars Technica - AI · 5 min ·
Robotics

What happens when you let AI agents run a sitcom 24/7 with zero human involvement

Ran an experiment — gave AI agents full control over writing, character creation, and performing a sitcom. Left it running nonstop for ov...

Reddit - Artificial Intelligence · 1 min ·
AI Agents

Microsoft's newest open-source project: Runtime security for AI agents

Reddit - Artificial Intelligence · 1 min ·

All Content

Robotics

[2511.23055] MindPower: Enabling Theory-of-Mind Reasoning in VLM-based Embodied Agents

The paper presents MindPower, a framework that enhances embodied agents' decision-making by integrating Theory of Mind (ToM) reasoning, o...

arXiv - AI · 3 min ·
LLMs

[2510.07172] NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

The article introduces NewtonBench, a new benchmark for evaluating large language models (LLMs) in scientific law discovery, addressing k...

arXiv - AI · 4 min ·
Machine Learning

[2510.02276] BioX-Bridge: Model Bridging for Unsupervised Cross-Modal Knowledge Transfer across Biosignals

The paper introduces BioX-Bridge, a framework for unsupervised cross-modal knowledge transfer in biosignals, enhancing model efficiency w...

arXiv - AI · 4 min ·
LLMs

[2509.25609] A Framework for Studying AI Agent Behavior: Evidence from Consumer Choice Experiments

This article presents a framework for evaluating AI agent behavior through consumer choice experiments, highlighting biases in decision-m...

arXiv - AI · 4 min ·
LLMs

[2509.21825] DS-STAR: Data Science Agent for Solving Diverse Tasks across Heterogeneous Formats and Open-Ended Queries

The paper introduces DS-STAR, a data science agent designed to automate complex workflows by integrating diverse data formats and generat...

arXiv - AI · 3 min ·
Machine Learning

[2508.19113] Hybrid Deep Searcher: Scalable Parallel and Sequential Search Reasoning

The paper presents HybridDeepSearcher, a novel approach that enhances search reasoning by integrating parallel query expansion with struc...

arXiv - AI · 4 min ·
NLP

[2508.13404] TASER: Table Agents for Schema-guided Extraction and Recommendation

The paper presents TASER, a system designed for schema-guided extraction and recommendation from complex financial tables, improving data...

arXiv - Machine Learning · 4 min ·
LLMs

[2508.01012] AutoEDA: Enabling EDA Flow Automation through Microservice-Based LLM Agents

The article presents AutoEDA, a framework that utilizes microservice-based LLM agents to automate Electronic Design Automation (EDA) proc...

arXiv - AI · 4 min ·
LLMs

[2506.04867] Sensory-Motor Control with Large Language Models via Iterative Policy Refinement

This paper presents a novel method for enabling large language models (LLMs) to control embodied agents through iterative policy refineme...

arXiv - Machine Learning · 4 min ·
LLMs

[2503.12434] A Survey on the Optimization of Large Language Model-based Agents

This survey reviews optimization techniques for Large Language Model (LLM)-based agents, categorizing methods into parameter-driven and p...

arXiv - AI · 4 min ·
Machine Learning

[2602.21204] Test-Time Training with KV Binding Is Secretly Linear Attention

This paper explores the concept of Test-Time Training (TTT) with KV binding, revealing that it functions as learned linear attention rath...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.21198] Learning from Trials and Errors: Reflective Test-Time Planning for Embodied LLMs

This article presents a novel approach called Reflective Test-Time Planning for embodied LLMs, enabling robots to learn from mistakes thr...

arXiv - Machine Learning · 4 min ·
Robotics

[2602.21174] Efficient Hierarchical Any-Angle Path Planning on Multi-Resolution 3D Grids

This paper presents an efficient hierarchical approach for any-angle path planning on multi-resolution 3D grids, addressing scalability i...

arXiv - AI · 3 min ·
LLMs

[2602.21136] SparkMe: Adaptive Semi-Structured Interviewing for Qualitative Insight Discovery

The paper presents SparkMe, a multi-agent LLM system designed for adaptive semi-structured interviewing, enhancing qualitative data colle...

arXiv - AI · 4 min ·
LLMs

[2602.21127] "Are You Sure?": An Empirical Study of Human Perception Vulnerability in LLM-Driven Agentic Systems

This study investigates human vulnerability to deception by large language model (LLM) agents, revealing significant trust issues in high...

arXiv - AI · 4 min ·
Machine Learning

[2602.21119] Cooperative-Competitive Team Play of Real-World Craft Robots

The paper explores advancements in multi-agent reinforcement learning for training cooperative and competitive robots, introducing a nove...

arXiv - AI · 3 min ·
Machine Learning

[2602.21072] Localized Dynamics-Aware Domain Adaptation for Off-Dynamics Offline Reinforcement Learning

The paper presents Localized Dynamics-Aware Domain Adaptation (LoDADA) for off-dynamics offline reinforcement learning, enhancing data se...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.21092] Probing Graph Neural Network Activation Patterns Through Graph Topology

This article explores the relationship between graph topology and activation patterns in Graph Neural Networks (GNNs), revealing insights...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.21052] Position-Aware Sequential Attention for Accurate Next Item Recommendations

The paper presents a novel kernelized self-attention mechanism designed to enhance next-item recommendations by improving the representat...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.20980] CrystaL: Spontaneous Emergence of Visual Latents in MLLMs

The paper presents CrystaL, a novel framework for Multimodal Large Language Models (MLLMs) that enhances visual understanding by crystall...

arXiv - AI · 3 min ·