[2602.15640] Latency-aware Human-in-the-Loop Reinforcement Learning for Semantic Communications

arXiv - Machine Learning · 3 min read

Summary

The paper presents a framework for latency-aware human-in-the-loop reinforcement learning in semantic communications, addressing the trade-off between semantic fidelity and strict latency guarantees in immersive and safety-critical services.

Why It Matters

As communication systems evolve, ensuring timely and accurate data transmission becomes crucial, especially in safety-critical applications. This research introduces a novel approach that integrates human feedback into reinforcement learning, enhancing the efficiency of semantic communication systems while meeting strict latency requirements.

Key Takeaways

  • Introduces a time-constrained human-in-the-loop reinforcement learning framework.
  • Balances semantic fidelity with strict latency requirements for immersive services.
  • Utilizes a constrained Markov decision process to optimize human feedback integration.
  • Demonstrates improved performance over baseline schedulers in simulations.
  • Provides a practical blueprint for latency-aware semantic adaptation in communication networks.
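The constrained Markov decision process mentioned above is typically solved with a primal-dual scheme: the policy maximizes a Lagrangian-shaped reward while a dual multiplier is adjusted to enforce the latency budget. The sketch below is a minimal, hypothetical illustration of that idea; the variable names, learning rate, and latency figures are illustrative and not taken from the paper.

```python
# Minimal sketch of the primal-dual idea behind a latency-constrained MDP:
# maximize semantic reward subject to an average-latency budget.
# All names (latency_budget, eta, etc.) are illustrative assumptions.

def lagrangian_reward(sem_reward, latency_cost, lam):
    """Shaped reward: semantic utility minus latency penalty weighted by lam."""
    return sem_reward - lam * latency_cost

def dual_update(lam, avg_latency, latency_budget, eta=0.05):
    """Dual ascent on the multiplier: lam grows while the budget is violated
    and shrinks (never below zero) once the constraint is satisfied."""
    return max(0.0, lam + eta * (avg_latency - latency_budget))

# Toy run: observed latency above a 10 ms budget pushes lam up,
# tightening the penalty the policy sees on its next update.
lam = 0.0
for avg_latency in [12.0, 11.0, 10.5, 9.5]:   # ms, hypothetical trace
    lam = dual_update(lam, avg_latency, latency_budget=10.0)
print(round(lam, 3))  # -> 0.15
```

In a full implementation the shaped reward would feed a PPO policy update (the primal step) while `dual_update` runs once per evaluation window (the dual step).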

Electrical Engineering and Systems Science > Signal Processing
arXiv:2602.15640 (eess) · Submitted on 17 Feb 2026

Title: Latency-aware Human-in-the-Loop Reinforcement Learning for Semantic Communications
Authors: Peizheng Li, Xinyi Lin, Adnan Aijaz

Abstract: Semantic communication promises task-aligned transmission but must reconcile semantic fidelity with stringent latency guarantees in immersive and safety-critical services. This paper introduces a time-constrained human-in-the-loop reinforcement learning (TC-HITL-RL) framework that embeds human feedback, semantic utility, and latency control within a semantic-aware open radio access network (RAN) architecture. We formulate semantic adaptation driven by human feedback as a constrained Markov decision process (CMDP) whose state captures semantic quality, human preferences, queue slack, and channel dynamics, and solve it via a primal-dual proximal policy optimization algorithm with action shielding and latency-aware reward shaping. The resulting policy preserves PPO-level semantic rewards while tightening the variability of both air-interface and near-real-time RAN intelligent controller processing budgets. Simulations over point-to-multipoint links with heterogeneous deadlines show that TC-HITL-RL consistently meets per-user timing constraints, out...
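The "action shielding" named in the abstract is a safety filter applied before the policy acts: any action whose worst-case latency would exceed the remaining queue slack is masked out. The following is a hypothetical sketch of that mechanism; the action names, latency figures, and fallback rule are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of action shielding for latency-aware scheduling:
# mask out any semantic-compression choice whose worst-case latency
# exceeds the remaining queue slack. Figures below are illustrative only.

ACTIONS = {  # semantic fidelity level -> assumed worst-case latency (ms)
    "full_fidelity": 9.0,
    "medium_fidelity": 5.0,
    "coarse_fidelity": 2.0,
}

def shield(actions, slack_ms):
    """Keep only actions that provably fit the remaining latency slack."""
    safe = {a: t for a, t in actions.items() if t <= slack_ms}
    # Fall back to the single fastest action if nothing fits, so the
    # shield never leaves the policy with an empty action set.
    if not safe:
        fastest = min(actions, key=actions.get)
        safe = {fastest: actions[fastest]}
    return safe

print(sorted(shield(ACTIONS, 6.0)))  # -> ['coarse_fidelity', 'medium_fidelity']
```

The policy then samples only from the shielded set, so hard deadlines are respected regardless of what the learned policy would otherwise prefer.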
