[2602.14857] World Models for Policy Refinement in StarCraft II

Summary

The paper presents StarWM, a world model for StarCraft II (SC2) that predicts future observations under partial observability and is used to refine the decisions of LLM-based policies. Offline evaluation shows substantial gains over zero-shot baselines, and the integrated decision system improves win rate against SC2's AI.

Why It Matters

Existing LLM-based SC2 agents mostly improve the policy itself and leave a learnable, action-conditioned transition model out of the decision loop. This research addresses that gap in a complex environment with a massive state-action space and partial observability. The findings could influence future AI applications in gaming and other real-world scenarios requiring strategic decision-making under uncertainty.

Key Takeaways

  • StarWM introduces a world model that enhances decision-making in StarCraft II.
  • The model predicts future observations under partial observability, improving policy refinement.
  • In offline evaluation, StarWM improves resource prediction by nearly 60% over zero-shot baselines.
  • The integrated decision system yields win-rate gains against SC2's AI.
  • A structured textual representation that factorizes observations into five semantic modules aids in learning SC2's hybrid dynamics.
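
To make the idea of an action-conditioned world model in a decision loop concrete, here is a minimal sketch. It is not the paper's method: the `Observation` fields, the hand-written `toy_world_model` rules, the `score` function, and the `refine` helper are all hypothetical stand-ins chosen for illustration. A learned model such as StarWM would take the place of the toy transition function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Observation:
    """Toy stand-in for a structured observation (hypothetical fields)."""
    minerals: int
    workers: int
    army: int


def toy_world_model(obs: Observation, action: str) -> Observation:
    """Hypothetical action-conditioned transition: predict the next observation.

    Hand-written rules stand in for a learned model.
    """
    minerals = obs.minerals + obs.workers * 5  # assumed income per step
    if action == "train_worker" and minerals >= 50:
        return Observation(minerals - 50, obs.workers + 1, obs.army)
    if action == "train_marine" and minerals >= 50:
        return Observation(minerals - 50, obs.workers, obs.army + 1)
    # "wait", an unknown action, or an unaffordable action: only income applies.
    return Observation(minerals, obs.workers, obs.army)


def score(obs: Observation) -> float:
    """Hypothetical value estimate of a predicted observation."""
    return obs.minerals * 0.01 + obs.workers * 1.0 + obs.army * 2.0


def refine(
    obs: Observation,
    candidates: List[str],
    world_model: Callable[[Observation, str], Observation],
    value: Callable[[Observation], float],
) -> str:
    """Pick the candidate action whose predicted outcome scores highest."""
    return max(candidates, key=lambda a: value(world_model(obs, a)))


if __name__ == "__main__":
    current = Observation(minerals=50, workers=10, army=0)
    proposals = ["wait", "train_worker", "train_marine"]  # e.g. from a policy
    print(refine(current, proposals, toy_world_model, score))
```

The design choice illustrated here is the separation of roles: the policy proposes candidate actions, the world model predicts what each would lead to, and the final choice is made on the predicted outcomes rather than on the policy's first guess.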

Computer Science > Artificial Intelligence

arXiv:2602.14857 (cs)
[Submitted on 16 Feb 2026]

Title: World Models for Policy Refinement in StarCraft II

Authors: Yixin Zhang, Ziyi Wang, Yiming Rong, Haoxi Wang, Jinling Jiang, Shuang Xu, Haoran Wu, Shiyu Zhou, Bo Xu

Abstract: Large Language Models (LLMs) have recently shown strong reasoning and generalization capabilities, motivating their use as decision-making policies in complex environments. StarCraft II (SC2), with its massive state-action space and partial observability, is a challenging testbed. However, existing LLM-based SC2 agents primarily focus on improving the policy itself and overlook integrating a learnable, action-conditioned transition model into the decision loop. To bridge this gap, we propose StarWM, the first world model for SC2 that predicts future observations under partial observability. To facilitate learning SC2's hybrid dynamics, we introduce a structured textual representation that factorizes observations into five semantic modules, and construct SC2-Dynamics-50k, the first instruction-tuning dataset for SC2 dynamics prediction. We further develop a multi-dimensional offline evaluation framework for predicted structured observations. Offline results show StarWM's substantial gains over zero-shot baselines, including nearly 60% improvements in resource predi...
