[2602.15640] Latency-aware Human-in-the-Loop Reinforcement Learning for Semantic Communications
Summary
The paper presents a framework for latency-aware human-in-the-loop reinforcement learning in semantic communications, addressing the trade-off between semantic fidelity and stringent latency requirements in immersive and safety-critical services.
Why It Matters
As communication systems evolve, ensuring timely and accurate data transmission becomes crucial, especially in safety-critical applications. This research introduces a novel approach that integrates human feedback into reinforcement learning, enhancing the efficiency of semantic communication systems while meeting strict latency requirements.
Key Takeaways
- Introduces a time-constrained human-in-the-loop reinforcement learning framework.
- Balances semantic fidelity with strict latency requirements for immersive services.
- Utilizes a constrained Markov decision process to optimize human feedback integration.
- Demonstrates improved performance over baseline schedulers in simulations.
- Provides a practical blueprint for latency-aware semantic adaptation in communication networks.
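The constrained-MDP takeaway above can be made concrete with a primal-dual sketch. This is an illustrative toy, not the paper's implementation: a latency cost enters the reward as a Lagrangian penalty, and the multiplier is raised by dual ascent whenever the observed cost exceeds its budget. The names (`LATENCY_BUDGET`, `ETA`, the toy episode costs) are hypothetical.

```python
# Toy primal-dual treatment of a latency-constrained MDP (illustrative only).
LATENCY_BUDGET = 10.0   # per-episode latency cost allowed (hypothetical units)
ETA = 0.05              # dual-ascent step size (hypothetical)

def shaped_reward(semantic_reward: float, latency_cost: float, lam: float) -> float:
    """Reward the primal (policy) learner maximizes:
    semantic utility minus the Lagrangian latency penalty."""
    return semantic_reward - lam * latency_cost

def dual_update(lam: float, episode_latency_cost: float) -> float:
    """Projected dual ascent: raise lambda when the latency budget is
    violated, relax it (toward 0) when there is slack."""
    return max(0.0, lam + ETA * (episode_latency_cost - LATENCY_BUDGET))

lam = 0.0
for episode_cost in [14.0, 13.0, 9.0, 8.0]:  # toy per-episode latency costs
    lam = dual_update(lam, episode_cost)
```

As the toy costs fall back under budget, the multiplier decays again, so the penalty only bites while the constraint is being violated.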
Electrical Engineering and Systems Science > Signal Processing
arXiv:2602.15640 (eess) [Submitted on 17 Feb 2026]
Title: Latency-aware Human-in-the-Loop Reinforcement Learning for Semantic Communications
Authors: Peizheng Li, Xinyi Lin, Adnan Aijaz
Abstract: Semantic communication promises task-aligned transmission but must reconcile semantic fidelity with stringent latency guarantees in immersive and safety-critical services. This paper introduces a time-constrained human-in-the-loop reinforcement learning (TC-HITL-RL) framework that embeds human feedback, semantic utility, and latency control within a semantic-aware Open radio access network (RAN) architecture. We formulate semantic adaptation driven by human feedback as a constrained Markov decision process (CMDP) whose state captures semantic quality, human preferences, queue slack, and channel dynamics, and solve it via a primal-dual proximal policy optimization algorithm with action shielding and latency-aware reward shaping. The resulting policy preserves PPO-level semantic rewards while tightening the variability of both air-interface and near-real-time RAN intelligent controller processing budgets. Simulations over point-to-multipoint links with heterogeneous deadlines show that TC-HITL-RL consistently meets per-user timing constraints, out...
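The action shielding mentioned in the abstract can be sketched as a feasibility mask applied before action selection. This is a hedged illustration under assumed interfaces, not the paper's algorithm: actions whose predicted latency would exceed the user's remaining queue slack are masked out, so a deadline-violating action can never be chosen. All names here (`shield_actions`, `pick_action`, the toy scores and latencies) are hypothetical.

```python
# Illustrative action-shielding sketch: mask actions that would miss the deadline.
def shield_actions(predicted_latency: list[float], queue_slack: float) -> list[bool]:
    """Feasibility mask: True where an action's predicted latency
    fits within the remaining slack before the deadline."""
    return [lat <= queue_slack for lat in predicted_latency]

def pick_action(policy_scores: list[float],
                predicted_latency: list[float],
                queue_slack: float) -> int:
    """Greedy pick among feasible actions; if the shield rules out
    everything, fall back to the fastest action as a safe default."""
    mask = shield_actions(predicted_latency, queue_slack)
    feasible = [i for i, ok in enumerate(mask) if ok]
    if not feasible:
        return min(range(len(predicted_latency)), key=predicted_latency.__getitem__)
    return max(feasible, key=policy_scores.__getitem__)
```

For example, with scores `[0.9, 0.5, 0.1]`, predicted latencies `[5.0, 2.0, 1.0]`, and slack `3.0`, the highest-scoring action is infeasible, so the shield steers the choice to the best action that still meets the deadline.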