[2602.19187] Adaptive Problem Generation via Symbolic Representations
Summary
The paper presents a method for generating training data for reinforcement learning with verifiable rewards. Problems are modified in a symbolic space rather than directly as word problems, with the aim of improving small open-weight language models on mathematical tasks.
Why It Matters
Existing data generation pipelines for reinforcement learning are open-loop: they apply fixed modifications that do not adapt to the model's capabilities, and they operate directly on word problems, which limits control over problem structure. By introducing a closed-loop framework that adapts problem difficulty to the model, this work could improve AI mathematical reasoning, with potential benefits for educational tools and other AI applications.
Key Takeaways
- Introduces a method for adaptive problem generation using symbolic representations.
- Enables precise control over problem structure and automatic generation of ground-truth solutions.
- Reports improved mathematical problem-solving in small open-weight language models.
- Utilizes a closed-loop framework for adapting problem difficulty.
- Targets reinforcement learning with verifiable rewards, with possible uses in educational tools.
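To make the idea of a symbolic problem space concrete, here is a minimal sketch rather than the authors' implementation. It assumes a problem can be written as a small system of linear constraints over named unknowns. The names `LinearProblem`, `solve`, `with_rhs`, `render`, and `TEMPLATE` are hypothetical, and exact elimination over `fractions.Fraction` stands in for the SymPy or SMT back ends that the abstract mentions. The point it illustrates is that an edit made on the constraints yields a new ground-truth answer automatically, while the wording is produced in a separate step.

```python
from dataclasses import dataclass, replace
from fractions import Fraction


@dataclass(frozen=True)
class LinearProblem:
    """Hypothetical container: unknowns plus linear constraints.

    Each constraint is (coefficients, rhs), meaning
    sum(coefficients[i] * unknowns[i]) == rhs.
    """
    unknowns: tuple
    constraints: tuple


def solve(problem):
    """Return the exact solution via Gauss-Jordan elimination.

    Assumes as many constraints as unknowns; raises ValueError otherwise
    or when the system has no unique solution.
    """
    n = len(problem.unknowns)
    if len(problem.constraints) != n:
        raise ValueError("need exactly one constraint per unknown")
    rows = [[Fraction(c) for c in coeffs] + [Fraction(rhs)]
            for coeffs, rhs in problem.constraints]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("no unique solution")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return {name: rows[i][n] for i, name in enumerate(problem.unknowns)}


def with_rhs(problem, index, new_rhs):
    """Hypothetical modification operator: change one constraint's constant."""
    coeffs, _ = problem.constraints[index]
    changed = list(problem.constraints)
    changed[index] = (coeffs, new_rhs)
    return replace(problem, constraints=tuple(changed))


def render(problem, template):
    """Linguistic realization, kept separate from the mathematics."""
    values = {f"rhs{i}": rhs for i, (_, rhs) in enumerate(problem.constraints)}
    return template.format(**values)


TEMPLATE = ("Two apples and three pears cost {rhs0} dollars. "
            "One apple and one pear cost {rhs1} dollars. "
            "What does each fruit cost?")

# 2*apple + 3*pear = 12 and apple + pear = 5
base = LinearProblem(("apple", "pear"), (((2, 3), 12), ((1, 1), 5)))
# The same structure with a different total for the first purchase.
variant = with_rhs(base, 0, 13)

for problem in (base, variant):
    print(render(problem, TEMPLATE))
    print({name: str(value) for name, value in solve(problem).items()})
```

Because the answer is recomputed from the constraints, every variant carries its own verifiable label, which is what a reward signal needs. A closed loop like the one the paper describes would additionally choose which modification to apply based on how the model performs; that part is not sketched here.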
Computer Science > Machine Learning
arXiv:2602.19187 (cs)
[Submitted on 22 Feb 2026]
Title: Adaptive Problem Generation via Symbolic Representations
Authors: Teresa Yeo, Myeongho Jeon, Dulaj Weerakoon, Rui Qiao, Alok Prakash, Armando Solar-Lezama, Archan Misra
Abstract: We present a method for generating training data for reinforcement learning with verifiable rewards to improve small open-weights language models on mathematical tasks. Existing data generation approaches rely on open-loop pipelines and fixed modifications that do not adapt to the model's capabilities. Furthermore, they typically operate directly on word problems, limiting control over problem structure. To address this, we perform modifications in a symbolic problem space, representing each problem as a set of symbolic variables and constraints (e.g., via algebraic frameworks such as SymPy or SMT formulations). This representation enables precise control over problem structure, automatic generation of ground-truth solutions, and decouples mathematical reasoning from linguistic realization. We also show that this results in more diverse generations. To adapt the problem difficulty to the model, we introduce a closed-loop framework that learns modification strategies through prompt optimization in symbolic space. Experimental results demonstrate that both adapti...