[2603.02680] LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization
Computer Science > Artificial Intelligence
arXiv:2603.02680 (cs) [Submitted on 3 Mar 2026]

Title: LLMs for High-Frequency Decision-Making: Normalized Action Reward-Guided Consistency Policy Optimization
Authors: Yang Zhao, Zihao Li, Zhiyu Jiang, Dandan Ma, Ganchao Liu, Wenzhe Zhao

Abstract: While Large Language Models (LLMs) form the cornerstone of sequential decision-making agent development, they have inherent limitations in high-frequency decision tasks. Existing research focuses mainly on discrete embodied decision scenarios with low-frequency state updates and significant semantic differences in the state space (e.g., household planning). These methods perform poorly on high-frequency decision-making tasks, because the high-precision numerical state information in such tasks is updated frequently with only minimal fluctuations, and because they exhibit policy misalignment between the learned sub-tasks and the composite task. To address these issues, this paper proposes Normalized Action Reward guided Consistency Policy Optimization (NAR-CP). 1) Our method first acquires predefined dense rewards from environmental feedback on candidate actions via reward functions, then completes reward shaping through normalization, and theoretically verifie...
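The reward-shaping step described in the abstract, normalizing dense rewards collected over a set of candidate actions, can be sketched as follows. The function name and the min-max normalization scheme are illustrative assumptions for this sketch, not the authors' implementation.

```python
import numpy as np

def normalize_candidate_rewards(rewards, eps=1e-8):
    """Min-max normalize dense rewards over a set of candidate actions.

    `rewards` holds one raw environment-feedback reward per candidate
    action; the min-max scheme here is an illustrative assumption.
    """
    r = np.asarray(rewards, dtype=float)
    lo, hi = r.min(), r.max()
    if hi - lo < eps:
        # All candidates scored equally: assign a uniform shaped reward.
        return np.full_like(r, 0.5)
    return (r - lo) / (hi - lo)

# Raw per-candidate rewards from environmental feedback (made-up values):
raw = [0.12, 0.35, -0.08, 0.35]
shaped = normalize_candidate_rewards(raw)
print(shaped)  # best candidates map to 1.0, worst to 0.0
```

Rescaling each candidate set to a common [0, 1] range is one way to keep shaped rewards comparable across states whose raw numerical feedback differs only by small fluctuations, which is the regime the abstract highlights for high-frequency tasks.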