AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

NLP

Enabling agent-first process redesign | MIT Technology Review

Unlike static, rules-based systems, AI agents can learn, adapt, and optimize processes dynamically. As they interact with data, systems, ...

MIT Technology Review - AI · 4 min · about 4 hours ago

LLMs

Stop Overcomplicating AI Workflows. This Is the Simple Framework

I’ve been working on building an agentic AI workflow system for business use cases and one thing became very clear very quickly. This is ...

Reddit - Artificial Intelligence · 1 min · about 4 hours ago

AI Agents

The "Jarvis on day one" trap: why trying to build one AI agent that does everything costs you months

Something I've been thinking about after spending a few months actually trying to build my own AI agent: the biggest trap in this space i...

Reddit - Artificial Intelligence · 1 min · about 4 hours ago

All Content

LLMs

[2602.16833] VAM: Verbalized Action Masking for Controllable Exploration in RL Post-Training -- A Chess Case Study

The paper presents Verbalized Action Masking (VAM), a novel method for enhancing exploration in reinforcement learning (RL) post-training...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.16826] HiVAE: Hierarchical Latent Variables for Scalable Theory of Mind

The paper presents HiVAE, a hierarchical variational architecture designed to enhance AI's theory of mind capabilities, enabling better i...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.16820] AI-Mediated Feedback Improves Student Revisions: A Randomized Trial with FeedbackWriter in a Large Undergraduate Course

This study investigates the effectiveness of AI-mediated feedback on student revisions in a large undergraduate course, revealing that AI...

arXiv - AI · 3 min · about 2 months ago

Data Science

[2602.16755] PREFER: An Ontology for the PREcision FERmentation Community

The PREFER ontology aims to standardize data in precision fermentation, enhancing interoperability and data accessibility across bioproce...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.16746] Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking

This article presents a geometric analysis of optimization dynamics in transformers, focusing on the phenomenon of grokking, where models...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.16745] PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency

The paper presents PETS, a framework for optimal trajectory allocation aimed at enhancing test-time self-consistency in machine learning ...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.16742] DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning

DeepVision-103K introduces a comprehensive dataset designed to enhance reinforcement learning with verifiable rewards, significantly impr...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.16736] The Compute ICE-AGE: Invariant Compute Envelope under Addressable Graph Evolution

The paper presents a deterministic semantic state substrate for AI, demonstrating a novel compute envelope that maintains performance acr...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.16720] APEX-SQL: Talking to the data via Agentic Exploration for Text-to-SQL

The paper presents APEX-SQL, a novel framework for Text-to-SQL that enhances interaction with complex databases through agentic explorati...

arXiv - AI · 4 min · about 2 months ago

AI Startups

[2602.17663] CLEF HIPE-2026: Evaluating Accurate and Efficient Person-Place Relation Extraction from Multilingual Historical Texts

The paper presents the HIPE-2026 evaluation lab focused on extracting person-place relations from multilingual historical texts, enhancin...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.17607] AutoNumerics: An Autonomous, PDE-Agnostic Multi-Agent Pipeline for Scientific Computing

AutoNumerics is a multi-agent framework that autonomously designs and verifies numerical solvers for PDEs from natural language, outperfo...

arXiv - Machine Learning · 3 min · about 2 months ago

AI Startups

[2602.17594] AI Gamestore: Scalable, Open-Ended Evaluation of Machine General Intelligence with Human Games

The paper introduces the AI Gamestore, a platform for evaluating machine general intelligence through human games, highlighting its poten...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.17547] KLong: Training LLM Agent for Extremely Long-horizon Tasks

The paper presents KLong, an open-source LLM agent designed for solving extremely long-horizon tasks by utilizing trajectory-splitting SF...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.17544] Evaluating Chain-of-Thought Reasoning through Reusability and Verifiability

This paper evaluates Chain-of-Thought (CoT) reasoning in AI through new metrics of reusability and verifiability, revealing limitations o...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.17386] Visual Model Checking: Graph-Based Inference of Visual Routines for Image Retrieval

The paper presents a novel framework integrating formal verification with deep learning for improved image retrieval, addressing the limi...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.17308] MedClarify: An information-seeking AI agent for medical diagnosis with case-specific follow-up questions

MedClarify is an AI agent designed to enhance medical diagnosis by generating case-specific follow-up questions, improving diagnostic acc...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.17245] Web Verbs: Typed Abstractions for Reliable Task Composition on the Agentic Web

The paper introduces 'Web Verbs', a set of typed abstractions designed to improve task composition on the Agentic Web, enhancing reliabil...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.17222] Decoding the Human Factor: High Fidelity Behavioral Prediction for Strategic Foresight

The paper presents the Large Behavioral Model (LBM), a novel AI framework designed to enhance the prediction of human decision-making in ...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.17221] From Labor to Collaboration: A Methodological Experiment Using AI Agents to Augment Research Perspectives in Taiwan's Humanities and Social Sciences

This article explores a methodological experiment using AI agents to enhance research in Taiwan's humanities and social sciences, proposi...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.17217] Continual learning and refinement of causal models through dynamic predicate invention

This paper presents a framework for online construction of symbolic causal world models, enhancing agents' decision-making through contin...

arXiv - AI · 3 min · about 2 months ago
