AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

AI Agents

Considering NeurIPS submission [D]

Wondering if it's worth submitting the paper I'm working on to NeurIPS. I have a formal mathematical proof of convergence for a novel agentic sys...

Reddit - Machine Learning · 1 min ·
AI Agents

Agent frameworks waste ~350,000+ tokens per session resending static files. 95% reduction benchmarked.

Measured the actual token waste on a local Qwen 3.5 122B setup. The numbers are unreal. Found a compile-time approach that cuts query con...

Reddit - Artificial Intelligence · 1 min ·
AI Agents

OpenClaw gives users yet another reason to be freaked out about security - Ars Technica

The viral AI agentic tool let attackers silently gain unauthenticated admin access.

Ars Technica - AI · 5 min ·

All Content

LLMs

[2602.20639] Grounding LLMs in Scientific Discovery via Embodied Actions

The paper presents EmbodiedAct, a framework that enhances Large Language Models (LLMs) by grounding them in embodied actions for scientif...

arXiv - AI · 3 min ·
Machine Learning

[2602.20638] Identifying two piecewise linear additive value functions from anonymous preference information

The paper discusses a method for identifying two piecewise linear additive value functions from anonymous preference information, enhanci...

arXiv - AI · 3 min ·
Machine Learning

[2602.20628] When can we trust untrusted monitoring? A safety case sketch across collusion strategies

This paper explores the challenges of ensuring safety in AI systems using untrusted monitoring. It develops a taxonomy of collusion strat...

arXiv - AI · 4 min ·
Machine Learning

[2602.20571] CausalReasoningBenchmark: A Real-World Benchmark for Disentangled Evaluation of Causal Identification and Estimation

The CausalReasoningBenchmark introduces a new framework for evaluating automated causal inference, distinguishing between identification ...

arXiv - AI · 4 min ·
Machine Learning

[2602.20517] Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination

The paper presents MIMIC, a framework that enhances human-AI coordination by using inner speech to guide behavior imitation in artificial...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.20502] ActionEngine: From Reactive to Programmatic GUI Agents via State Machine Memory

The paper presents ActionEngine, a novel framework that enhances GUI agents by transitioning from reactive execution to programmatic plan...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.20494] KairosVL: Orchestrating Time Series and Semantics for Unified Reasoning

The paper introduces KairosVL, a novel framework that enhances time series analysis by integrating semantic reasoning, achieving competit...

arXiv - AI · 3 min ·
Machine Learning

[2602.20459] PreScience: A Benchmark for Forecasting Scientific Contributions

The paper introduces PreScience, a benchmark for forecasting scientific contributions using AI. It evaluates four generative tasks relate...

arXiv - AI · 4 min ·
LLMs

[2602.20426] Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use

This paper presents Trace-Free+, a curriculum learning framework designed to enhance the quality of tool interfaces for LLM-based agents,...

arXiv - AI · 3 min ·
AI Agents

[2602.20424] Implicit Intelligence -- Evaluating Agents on What Users Don't Say

The paper presents an evaluation framework called Implicit Intelligence, which assesses AI agents' ability to understand unstated user re...

arXiv - AI · 3 min ·
Machine Learning

[2602.20422] Diffusion Modulation via Environment Mechanism Modeling for Planning

The paper presents a novel approach, Diffusion Modulation via Environment Mechanism Modeling (DMEMM), to enhance trajectory generation in...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.20333] DMCD: Semantic-Statistical Framework for Causal Discovery

The DMCD framework integrates LLM-based semantic drafting with statistical validation for causal discovery, enhancing performance across ...

arXiv - AI · 3 min ·
Generative AI

Knowledge is the key to unlocking AI's full potential as a creative tool

The article discusses the importance of knowledge in maximizing the creative potential of AI tools, suggesting that informed users will a...

Reddit - Artificial Intelligence · 1 min ·
LLMs

Is your Linux shell too safe? Then add AI!

The article discusses the integration of AI into Linux shell commands, suggesting that AI can enhance user experience by simplifying comp...

Reddit - Artificial Intelligence · 1 min ·
AI Agents

Uber engineers built an AI version of their boss | TechCrunch

Uber engineers have developed an AI chatbot modeled after CEO Dara Khosrowshahi to enhance their presentation preparation, reflecting the...

TechCrunch - AI · 4 min ·
AI Safety

Anthropic believes RSI (recursive self improvement) could arrive “as soon as early 2027”

Anthropic predicts that recursive self-improvement (RSI) in AI could be realized as early as 2027, highlighting significant advancements ...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

AI-designed proteins may help spot cancer | MIT Technology Review

MIT and Microsoft researchers have developed AI-designed molecular sensors that can detect early cancer signs through urine tests, enhanc...

MIT Technology Review · 3 min ·
Data Science

Using big data for good | MIT Technology Review

Charlie Lieu's work with Darwin's Ark leverages big data to enhance pet genetics research, improving understanding of health and behavior...

MIT Technology Review · 13 min ·
AI Safety

Hegseth warns Anthropic to let the military use the company’s AI tech as it sees fit, AP source says

Hegseth urges Anthropic to allow military access to its AI technology, emphasizing the importance of defense applications in AI development.

Reddit - Artificial Intelligence · 1 min ·
AI Agents

Could it understand it faster than human researchers?

The article discusses the potential for artificial intelligence to understand sexual orientation more quickly than human researchers, rai...

Reddit - Artificial Intelligence · 1 min ·
