AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

NLP

Enabling agent-first process redesign | MIT Technology Review

Unlike static, rules-based systems, AI agents can learn, adapt, and optimize processes dynamically. As they interact with data, systems, ...

MIT Technology Review - AI · 4 min
LLMs

Stop Overcomplicating AI Workflows. This Is the Simple Framework

I’ve been working on building an agentic AI workflow system for business use cases and one thing became very clear very quickly. This is ...

Reddit - Artificial Intelligence · 1 min
AI Agents

The "Jarvis on day one" trap: why trying to build one AI agent that does everything costs you months

Something I've been thinking about after spending a few months actually trying to build my own AI agent: the biggest trap in this space i...

Reddit - Artificial Intelligence · 1 min

All Content

LLMs

[2602.16833] VAM: Verbalized Action Masking for Controllable Exploration in RL Post-Training -- A Chess Case Study

The paper presents Verbalized Action Masking (VAM), a novel method for enhancing exploration in reinforcement learning (RL) post-training...

arXiv - AI · 4 min
Machine Learning

[2602.16826] HiVAE: Hierarchical Latent Variables for Scalable Theory of Mind

The paper presents HiVAE, a hierarchical variational architecture designed to enhance AI's theory of mind capabilities, enabling better i...

arXiv - AI · 3 min
LLMs

[2602.16820] AI-Mediated Feedback Improves Student Revisions: A Randomized Trial with FeedbackWriter in a Large Undergraduate Course

This study investigates the effectiveness of AI-mediated feedback on student revisions in a large undergraduate course, revealing that AI...

arXiv - AI · 3 min
Data Science

[2602.16755] PREFER: An Ontology for the PREcision FERmentation Community

The PREFER ontology aims to standardize data in precision fermentation, enhancing interoperability and data accessibility across bioproce...

arXiv - AI · 4 min
Machine Learning

[2602.16746] Low-Dimensional and Transversely Curved Optimization Dynamics in Grokking

This article presents a geometric analysis of optimization dynamics in transformers, focusing on the phenomenon of grokking, where models...

arXiv - AI · 4 min
Machine Learning

[2602.16745] PETS: A Principled Framework Towards Optimal Trajectory Allocation for Efficient Test-Time Self-Consistency

The paper presents PETS, a framework for optimal trajectory allocation aimed at enhancing test-time self-consistency in machine learning ...

arXiv - AI · 4 min
Machine Learning

[2602.16742] DeepVision-103K: A Visually Diverse, Broad-Coverage, and Verifiable Mathematical Dataset for Multimodal Reasoning

DeepVision-103K introduces a comprehensive dataset designed to enhance reinforcement learning with verifiable rewards, significantly impr...

arXiv - AI · 3 min
Machine Learning

[2602.16736] The Compute ICE-AGE: Invariant Compute Envelope under Addressable Graph Evolution

The paper presents a deterministic semantic state substrate for AI, demonstrating a novel compute envelope that maintains performance acr...

arXiv - AI · 4 min
LLMs

[2602.16720] APEX-SQL: Talking to the data via Agentic Exploration for Text-to-SQL

The paper presents APEX-SQL, a novel framework for Text-to-SQL that enhances interaction with complex databases through agentic explorati...

arXiv - AI · 4 min
AI Startups

[2602.17663] CLEF HIPE-2026: Evaluating Accurate and Efficient Person-Place Relation Extraction from Multilingual Historical Texts

The paper presents the HIPE-2026 evaluation lab focused on extracting person-place relations from multilingual historical texts, enhancin...

arXiv - AI · 3 min
Machine Learning

[2602.17607] AutoNumerics: An Autonomous, PDE-Agnostic Multi-Agent Pipeline for Scientific Computing

AutoNumerics is a multi-agent framework that autonomously designs and verifies numerical solvers for PDEs from natural language, outperfo...

arXiv - Machine Learning · 3 min
AI Startups

[2602.17594] AI Gamestore: Scalable, Open-Ended Evaluation of Machine General Intelligence with Human Games

The paper introduces the AI Gamestore, a platform for evaluating machine general intelligence through human games, highlighting its poten...

arXiv - AI · 4 min
LLMs

[2602.17547] KLong: Training LLM Agent for Extremely Long-horizon Tasks

The paper presents KLong, an open-source LLM agent designed for solving extremely long-horizon tasks by utilizing trajectory-splitting SF...

arXiv - AI · 3 min
LLMs

[2602.17544] Evaluating Chain-of-Thought Reasoning through Reusability and Verifiability

This paper evaluates Chain-of-Thought (CoT) reasoning in AI through new metrics of reusability and verifiability, revealing limitations o...

arXiv - AI · 3 min
Machine Learning

[2602.17386] Visual Model Checking: Graph-Based Inference of Visual Routines for Image Retrieval

The paper presents a novel framework integrating formal verification with deep learning for improved image retrieval, addressing the limi...

arXiv - AI · 4 min
LLMs

[2602.17308] MedClarify: An information-seeking AI agent for medical diagnosis with case-specific follow-up questions

MedClarify is an AI agent designed to enhance medical diagnosis by generating case-specific follow-up questions, improving diagnostic acc...

arXiv - Machine Learning · 4 min
LLMs

[2602.17245] Web Verbs: Typed Abstractions for Reliable Task Composition on the Agentic Web

The paper introduces 'Web Verbs', a set of typed abstractions designed to improve task composition on the Agentic Web, enhancing reliabil...

arXiv - AI · 4 min
LLMs

[2602.17222] Decoding the Human Factor: High Fidelity Behavioral Prediction for Strategic Foresight

The paper presents the Large Behavioral Model (LBM), a novel AI framework designed to enhance the prediction of human decision-making in ...

arXiv - AI · 4 min
LLMs

[2602.17221] From Labor to Collaboration: A Methodological Experiment Using AI Agents to Augment Research Perspectives in Taiwan's Humanities and Social Sciences

This article explores a methodological experiment using AI agents to enhance research in Taiwan's humanities and social sciences, proposi...

arXiv - AI · 4 min
Machine Learning

[2602.17217] Continual learning and refinement of causal models through dynamic predicate invention

This paper presents a framework for online construction of symbolic causal world models, enhancing agents' decision-making through contin...

arXiv - AI · 3 min