[2508.04865] Agnostics: Learning to Code in Any Programming Language via Reinforcement with a Universal Learning Environment
Computer Science > Machine Learning
arXiv:2508.04865 (cs)
[Submitted on 6 Aug 2025 (v1), last revised 28 Feb 2026 (this version, v2)]

Title: Agnostics: Learning to Code in Any Programming Language via Reinforcement with a Universal Learning Environment
Authors: Aleksander Boruch-Gruszecki, Yangtian Zi, Zixuan Wu, Tejas Oberoi, Carolyn Jane Anderson, Joydeep Biswas, Arjun Guha

Abstract: Large language models (LLMs) already excel at writing code in high-resource languages such as Python and JavaScript, yet stumble on low-resource languages that remain essential to science and engineering. Besides the obvious shortage of pre-training data, post-training itself is a bottleneck: every new language seems to require new datasets, test harnesses, and reinforcement-learning (RL) infrastructure. We introduce Agnostics, a language-agnostic post-training pipeline that eliminates this per-language engineering. The key idea is to judge code solely by its externally observable behavior, so a single verifier can test solutions written in any language. Concretely, we (i) use an LLM to rewrite existing unit-test datasets into an I/O format, (ii) supply a short configuration that tells the verifier how to compile and run a target language, and (iii) apply reinforcement learn...
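The abstract's central idea, judging code solely by externally observable behavior so that one verifier covers every language, can be illustrated with a minimal sketch. The configuration schema below (`ext`, `compile`, `run` fields) and the function name `judge` are illustrative assumptions, not the paper's actual interface: a short per-language config tells the harness how to compile and run a program, and acceptance depends only on whether the program maps each test's stdin to the expected stdout.

```python
import os
import subprocess
import tempfile

# Hypothetical per-language configuration (field names are assumptions,
# not the paper's schema): file extension, optional compile command,
# and a run command. "{src}"/"{bin}" are substituted with real paths.
LANG_CONFIG = {
    "python": {"ext": ".py", "compile": None, "run": ["python3", "{src}"]},
    "c": {"ext": ".c", "compile": ["gcc", "{src}", "-o", "{bin}"], "run": ["{bin}"]},
}

def judge(code: str, language: str, io_tests: list[tuple[str, str]]) -> bool:
    """Language-agnostic verifier: accept iff the program maps each
    test's stdin to the expected stdout, regardless of source language."""
    cfg = LANG_CONFIG[language]
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "prog" + cfg["ext"])
        binary = os.path.join(d, "prog")
        with open(src, "w") as f:
            f.write(code)
        # Compile only if the language config asks for it.
        if cfg["compile"]:
            cmd = [a.format(src=src, bin=binary) for a in cfg["compile"]]
            if subprocess.run(cmd, capture_output=True).returncode != 0:
                return False
        run_cmd = [a.format(src=src, bin=binary) for a in cfg["run"]]
        for stdin_text, expected in io_tests:
            r = subprocess.run(run_cmd, input=stdin_text,
                               capture_output=True, text=True, timeout=10)
            if r.returncode != 0 or r.stdout.strip() != expected.strip():
                return False
    return True
```

Because the verifier only observes stdin/stdout, the same `io_tests` list (e.g., produced by an LLM rewriting existing unit tests into I/O form, as step (i) describes) rewards a correct solution in any language that has a config entry, which is what makes a single RL learning environment reusable across languages.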