AI Agents

Autonomous agents, tool use, and agentic systems

Top This Week

Machine Learning

[D] Your Agent, Their Asset: Real-world safety evaluation of OpenClaw agents (CIK poisoning raises attack success to ~64–74%)

Paper: https://arxiv.org/abs/2604.04759 This paper presents a real-world safety evaluation of OpenClaw, a personal AI agent with access t...

Reddit - Machine Learning · 1 min ·
AI Agents

New privacy tool helps detect when AI agents become double agents

RIT cybersecurity researchers have developed AudAgent, a tool that detects when agentic AI collects, processes, or shares highly sensitiv...

AI Tools & Products · 5 min ·
AI Agents

Microsoft’s GitHub Sees Booming Traffic—and Outages—as AI Agents Flood Platform

AI Tools & Products ·

All Content

Machine Learning

[2511.10515] Mastering Olympiad-Level Physics with Artificial Intelligence

This paper presents LOCA, an AI framework designed to tackle Olympiad-level physics problems by breaking down complex reasoning into mana...

arXiv - AI · 4 min ·
LLMs

[2510.20091] CreativityPrism: A Holistic Evaluation Framework for Large Language Model Creativity

The paper presents CreativityPrism, a comprehensive framework for evaluating the creativity of large language models (LLMs) across variou...

arXiv - AI · 4 min ·
Machine Learning

[2510.12768] Uncertainty Matters in Dynamic Gaussian Splatting for Monocular 4D Reconstruction

This paper presents USplat4D, a novel framework for monocular 4D reconstruction that incorporates uncertainty in dynamic Gaussian splatti...

arXiv - AI · 4 min ·
Machine Learning

[2602.10956] Stochastic Parroting in Temporal Attention -- Regulating the Diagonal Sink

The paper explores the challenges of spatio-temporal models in machine learning, focusing on biases in temporal attention mechanisms and ...

arXiv - Machine Learning · 3 min ·
LLMs

[2510.06200] StarEmbed: Benchmarking Time Series Foundation Models on Astronomical Observations of Variable Stars

The paper introduces StarEmbed, a benchmark for evaluating time series foundation models on astronomical observations of variable stars, ...

arXiv - AI · 4 min ·
LLMs

[2510.04694] Multilingual Routing in Mixture-of-Experts

This paper explores multilingual routing in Mixture-of-Experts (MoE) architectures, revealing how these models handle multilingual data a...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.05139] Adaptive Exploration for Latent-State Bandits

The paper presents adaptive exploration strategies for latent-state bandits, addressing challenges in reward estimation and action select...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2509.05249] COGITAO: A Visual Reasoning Framework To Study Compositionality & Generalization

COGITAO introduces a novel framework for studying compositionality and generalization in visual reasoning, offering extensive task genera...

arXiv - AI · 4 min ·
LLMs

[2508.08177] MedReasoner: Reinforcement Learning Drives Reasoning Grounding from Clinical Thought to Pixel-Level Precision

The paper introduces MedReasoner, a framework that utilizes reinforcement learning for precise medical reasoning and pixel-level groundin...

arXiv - AI · 4 min ·
Machine Learning

[2601.11616] Mixture-of-Experts as Soft Clustering: A Dual Jacobian-PCA Spectral Geometry Perspective

This paper explores Mixture-of-Experts (MoE) architectures through a geometric lens, analyzing their impact on function representation an...

arXiv - Machine Learning · 4 min ·
LLMs

[2508.01067] Expressive Power of Graph Transformers via Logic

This paper explores the expressive power of graph transformers, comparing their capabilities under different logical frameworks, particul...

arXiv - AI · 3 min ·
Machine Learning

[2506.08822] FreqPolicy: Efficient Flow-based Visuomotor Policy via Frequency Consistency

The paper presents FreqPolicy, a novel flow-based visuomotor policy that enhances efficiency in robotic manipulation by imposing frequenc...

arXiv - AI · 4 min ·
AI Startups

[2511.22581] High entropy leads to symmetry equivariant policies in Dec-POMDPs

This paper explores how high entropy regularization in Dec-POMDPs leads to symmetry equivariant policies, ensuring convergence to a consi...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2505.03795] Modeling Human Behavior in a Strategic Network Game with Complex Group Dynamics

This article explores modeling human behavior in strategic network games, focusing on the Junior High Game (JHG) and comparing various be...

arXiv - AI · 4 min ·
Machine Learning

[2511.03710] Shrinking the Variance: Shrinkage Baselines for Reinforcement Learning with Verifiable Rewards

This article presents a novel approach to reducing variance in reinforcement learning through shrinkage baselines, enhancing training sta...

arXiv - Machine Learning · 3 min ·
Robotics

[2504.08603] FindAnything: Open-Vocabulary and Object-Centric Mapping for Robot Exploration in Any Environment

The paper presents FindAnything, a framework for open-vocabulary and object-centric mapping that enhances robot exploration in unknown en...

arXiv - AI · 4 min ·
Machine Learning

[2510.24318] Transformers can do Bayesian Clustering

The paper presents Cluster-PFN, a Transformer-based model for unsupervised Bayesian clustering, demonstrating improved accuracy and speed...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2510.19753] Transformers Provably Learn Algorithmic Solutions for Graph Connectivity, But Only with the Right Data

The paper explores how Transformers can learn algorithmic solutions for graph connectivity, demonstrating that success depends on the tra...

arXiv - Machine Learning · 3 min ·
LLMs

[2503.12286] Integrating Chain-of-Thought and Retrieval Augmented Generation Enhances Rare Disease Diagnosis from Clinical Notes

This article presents a novel approach combining Chain-of-Thought (CoT) and Retrieval Augmented Generation (RAG) to improve rare disease ...

arXiv - AI · 4 min ·
Generative AI

[2502.17863] A Survey: Spatiotemporal Consistency in Video Generation

This survey reviews advancements in spatiotemporal consistency in video generation, addressing challenges and methodologies in creating c...

arXiv - AI · 4 min ·