[2602.15858] State Design Matters: How Representations Shape Dynamic Reasoning in Large Language Models
Summary
This paper examines how state representations shape the reasoning capabilities of large language models (LLMs) in dynamic environments, systematically varying state granularity, structure, and spatial grounding to identify design choices that improve performance.
Why It Matters
As LLMs transition from static tasks to dynamic environments, understanding how state representation affects their reasoning is crucial for improving AI interactions in real-world applications. This research offers developers and researchers concrete guidance for optimizing LLM performance in sequential decision-making settings.
Key Takeaways
- State representation significantly influences LLM performance in dynamic reasoning tasks.
- Trajectory summarization helps stabilize long-horizon reasoning by reducing noise.
- Natural language representations are the most robust across models; structured encodings (e.g., JSON schemas) help mainly models with strong code or structured-output priors.
- Text-based spatial encodings enhance reasoning by engaging models in spatial construction.
- Current LLMs still struggle with long-term reasoning despite improved state representations.
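To make the design axes above concrete, here is a minimal sketch (hypothetical state layout and function names, not the paper's benchmark code) that encodes the same gridworld state three ways: long-form natural language, a structured JSON encoding, and a text-based spatial map of the kind the paper finds most effective.

```python
import json

# Hypothetical gridworld state; the fields are illustrative.
state = {
    "agent": (1, 1),
    "goal": (3, 2),
    "walls": [(2, 1)],
    "size": (4, 4),
}

def natural_language(s):
    """Long-form natural language description of the state."""
    walls = ", ".join(f"({r}, {c})" for r, c in s["walls"])
    return (
        f"You are at row {s['agent'][0]}, column {s['agent'][1]}. "
        f"The goal is at row {s['goal'][0]}, column {s['goal'][1]}. "
        f"Walls: {walls}."
    )

def structured(s):
    """Symbolic/structured encoding (JSON-schema style)."""
    return json.dumps(s)

def ascii_map(s):
    """Text-based spatial encoding: an explicit 2D character map."""
    rows, cols = s["size"]
    grid = [["." for _ in range(cols)] for _ in range(rows)]
    for r, c in s["walls"]:
        grid[r][c] = "#"  # wall
    gr, gc = s["goal"]
    grid[gr][gc] = "G"    # goal
    ar, ac = s["agent"]
    grid[ar][ac] = "A"    # agent
    return "\n".join("".join(row) for row in grid)

print(natural_language(state))
print(structured(state))
print(ascii_map(state))
```

A prompt built from `ascii_map` forces the model to read off spatial relations directly from the layout, whereas the JSON variant relies on the model reconstructing geometry from coordinates.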
Computer Science > Computation and Language
arXiv:2602.15858 (cs)
[Submitted on 25 Jan 2026]
Title: State Design Matters: How Representations Shape Dynamic Reasoning in Large Language Models
Authors: Annie Wong, Aske Plaat, Thomas Bäck, Niki van Stein, Anna V. Kononova
Abstract: As large language models (LLMs) move from static reasoning tasks toward dynamic environments, their success depends on the ability to navigate and respond to an environment that changes as they interact at inference time. An underexplored factor in these settings is the representation of the state. Holding model parameters fixed, we systematically vary three key aspects: (1) state granularity (long form versus summary), (2) structure (natural language versus symbolic), and (3) spatial grounding (text-only versus images or textual map encodings) across sequential decision-making benchmarks. We find that trajectory summarisation improves performance by reducing noise and stabilising long-horizon reasoning. Second, natural language representations are the most robust across models, whereas structured encodings help mainly for models with strong code or structured output priors, such as JSON schemas. Third, while image inputs show some benefit, text-based spatial encodings prove most effective. This advantage stems not fro...