[2604.00076] Learning to Play Blackjack: A Curriculum Learning Perspective

arXiv - Machine Learning · April 02, 2026 · 3 min read

About this article

Abstract page for arXiv paper 2604.00076: Learning to Play Blackjack: A Curriculum Learning Perspective

Computer Science > Machine Learning

arXiv:2604.00076 (cs) [Submitted on 31 Mar 2026]

Title: Learning to Play Blackjack: A Curriculum Learning Perspective

Authors: Amirreza Alasti, Efe Erdal, Yücel Celik, Theresa Eimer

Abstract: Reinforcement Learning (RL) agents often struggle with efficiency and performance in complex environments. We propose a novel framework that uses a Large Language Model (LLM) to dynamically generate a curriculum over available actions, enabling the agent to incorporate each action individually. We apply this framework to the game of Blackjack, where the LLM creates a multi-stage training path that progressively introduces complex actions to a Tabular Q-Learning and a Deep Q-Network (DQN) agent. Our evaluation in a realistic 8-deck simulation over 10 independent runs demonstrates significant performance gains over standard training methods. The curriculum-based approach increases the DQN agent's average win rate from 43.97% to 47.41%, reduces the average bust rate from 32.9% to 28.0%, and accelerates the overall workflow by over 74%, with the agent's full training completing faster than the baseline's evaluation phase alone. These results validate that LLM-guided curricula can build more effective, robust, and efficient RL agents.

Subjects: Machine Learning (cs.LG); Artificial Intelli...
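The idea of a curriculum over actions can be illustrated with a small tabular Q-learning sketch. This is not the authors' implementation: the stage list `STAGES` is hard-coded and hypothetical (in the paper an LLM generates it), cards come from an infinite-deck approximation rather than the 8-deck shoe, and splits, surrender, and natural payouts are left out. All function names are illustrative.

```python
import random
from collections import defaultdict

# Hypothetical curriculum: each stage unlocks one more action.
# The paper has an LLM produce the stages; here they are hard-coded.
STAGES = [("stand", "hit"), ("stand", "hit", "double")]

def draw(rng):
    # Infinite-deck approximation: ranks 2-9, four ten-valued cards, ace as 11.
    return rng.choice([2, 3, 4, 5, 6, 7, 8, 9, 10, 10, 10, 10, 11])

def hand_value(cards):
    # Returns (total, soft), demoting aces from 11 to 1 while the hand busts.
    total, aces = sum(cards), cards.count(11)
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total, aces > 0

def legal(actions, first_move):
    # Doubling is only allowed on the first decision of a hand.
    return [a for a in actions if a != "double" or first_move]

def settle(player_total, dealer, rng):
    # Per-unit payoff: +1 win, 0 push, -1 loss. Dealer draws to 17 or more.
    if player_total > 21:
        return -1.0
    dealer = list(dealer)
    while hand_value(dealer)[0] < 17:
        dealer.append(draw(rng))
    dealer_total = hand_value(dealer)[0]
    if dealer_total > 21 or player_total > dealer_total:
        return 1.0
    return 0.0 if player_total == dealer_total else -1.0

def play_episode(q, actions, rng, epsilon=0.1, alpha=0.05):
    player = [draw(rng), draw(rng)]
    dealer = [draw(rng), draw(rng)]
    first_move = True
    while True:
        total, soft = hand_value(player)
        state = (total, dealer[0], soft, first_move)
        choices = legal(actions, first_move)
        if rng.random() < epsilon:
            action = rng.choice(choices)  # explore
        else:
            action = max(choices, key=lambda a: q[(state, a)])  # exploit
        stake = 2.0 if action == "double" else 1.0
        if action in ("hit", "double"):
            player.append(draw(rng))
        total, soft = hand_value(player)
        done = action != "hit" or total > 21
        if done:
            target = stake * settle(total, dealer, rng)
        else:
            nxt = (total, dealer[0], soft, False)
            target = max(q[(nxt, a)] for a in legal(actions, False))
        # Undiscounted Q-learning update toward the bootstrapped target.
        q[(state, action)] += alpha * (target - q[(state, action)])
        if done:
            return target
        first_move = False

def train(stages=STAGES, episodes_per_stage=20000, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # the Q-table is carried over between stages
    for actions in stages:
        for _ in range(episodes_per_stage):
            play_episode(q, actions, rng)
    return q
```

Calling `train()` runs the two stages in order. Because the Q-table persists across stages, the values learned for `stand` and `hit` are already in place when `double` becomes available, which is the intuition behind introducing actions one at a time.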

Originally published on April 02, 2026. Curated by AI News.

