[2602.21307] SymTorch: A Framework for Symbolic Distillation of Deep Neural Networks
Summary
SymTorch is a library that automates symbolic distillation of deep neural networks. It wraps network components, records their input-output behavior, and uses PySR to fit human-readable, closed-form expressions that approximate them.
Why It Matters
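Symbolic regression has seen limited adoption in deep learning, largely because integrating it into existing workflows is an engineering burden. By automating that integration, SymTorch makes it easier to extract interpretable equations from trained models, an approach that has shown promise for discovering physical laws and mathematical relationships. Separately, a proof-of-concept suggests that symbolic surrogates can also speed up large language model (LLM) inference.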
Key Takeaways
- SymTorch automates the symbolic distillation process for deep learning models.
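- It handles GPU-CPU data transfer, input-output caching, model serialization, and switching between neural and symbolic forward passes.
- The library is demonstrated on several architectures, including GNNs, PINNs, and transformer models.
- A proof-of-concept that replaces MLP layers in an LLM with symbolic surrogates shows an 8.3% inference throughput improvement, at some cost in model performance.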
- This approach can lead to more interpretable AI models and facilitate the discovery of mathematical relationships.
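To put the 8.3% figure in latency terms, here is a short worked calculation. It assumes that throughput is measured in tokens per second and that the gain is relative to the unmodified model; neither is stated in the text here. The symbols T (throughput) and t (time per token) are introduced only for this illustration.

```latex
\[
\frac{t_{\text{sym}}}{t_{\text{base}}}
  = \frac{T_{\text{base}}}{T_{\text{sym}}}
  = \frac{1}{1.083}
  \approx 0.923
\]
```

Under those assumptions, time per token falls by about 7.7%, slightly less than the headline throughput gain.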
arXiv:2602.21307 (Computer Science > Machine Learning), submitted on 24 Feb 2026
Authors: Elizabeth S.Z. Tan, Adil Soubki, Miles Cranmer
Abstract
Symbolic distillation replaces neural networks, or components thereof, with interpretable, closed-form mathematical expressions. This approach has shown promise in discovering physical laws and mathematical relationships directly from trained deep learning models, yet adoption remains limited due to the engineering barrier of integrating symbolic regression into deep learning workflows. We introduce SymTorch, a library that automates this distillation by wrapping neural network components, collecting their input-output behavior, and approximating them with human-readable equations via PySR. SymTorch handles the engineering challenges that have hindered adoption: GPU-CPU data transfer, input-output caching, model serialization, and seamless switching between neural and symbolic forward passes. We demonstrate SymTorch across diverse architectures including GNNs, PINNs and transformer models. Finally, we present a proof-of-concept for accelerating LLM inference by replacing MLP layers with symbolic surrogates, achieving an 8.3% throughput improvement with moderate performance degr...
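The wrap, record, fit, and switch workflow described in the abstract can be illustrated with a toy sketch in plain Python. This is not SymTorch's API: the class `SymbolicWrapper`, the helper `fit_affine`, and the candidate `library` are all hypothetical names invented for this example. A brute-force search over a four-entry expression library stands in for PySR, and a plain function stands in for a trained network component. GPU-CPU transfer and serialization are omitted.

```python
import math


def fit_affine(us, ys):
    """Least-squares fit of y ~ a*u + b. Returns (a, b, sse), or None if degenerate."""
    n = len(us)
    if n < 2:
        return None
    mu, my = sum(us) / n, sum(ys) / n
    var = sum((u - mu) ** 2 for u in us)
    if var == 0.0:
        return None
    a = sum((u - mu) * (y - my) for u, y in zip(us, ys)) / var
    b = my - a * mu
    sse = sum((a * u + b - y) ** 2 for u, y in zip(us, ys))
    return a, b, sse


class SymbolicWrapper:
    """Hypothetical wrapper (not SymTorch's API): caches a block's input-output
    pairs and can swap its forward pass for a fitted closed-form surrogate."""

    def __init__(self, block):
        self.block = block      # stand-in for a trained network component
        self.cache = []         # recorded (input, output) pairs
        self.surrogate = None   # (name, f, a, b) once distill() has run
        self.symbolic = False   # selects which forward pass __call__ uses

    def __call__(self, x):
        if self.symbolic:
            if self.surrogate is None:
                raise RuntimeError("call distill() before enabling symbolic mode")
            _, f, a, b = self.surrogate
            return a * f(x) + b
        y = self.block(x)
        self.cache.append((x, y))   # only the neural pass is recorded
        return y

    def distill(self, library):
        """Fit y ~ a*f(x) + b for each candidate f and keep the lowest-error one."""
        xs = [x for x, _ in self.cache]
        ys = [y for _, y in self.cache]
        best = None
        for name, f in library.items():
            fit = fit_affine([f(x) for x in xs], ys)
            if fit is not None and (best is None or fit[2] < best[4]):
                best = (name, f, fit[0], fit[1], fit[2])
        if best is None:
            raise ValueError("no candidate could be fitted; is the cache empty?")
        self.surrogate = best[:4]
        return best[0], best[2], best[3]


# Stand-in for a trained component (assumption made for the demo).
block = lambda x: 2.0 * math.tanh(x) - 1.0
wrapped = SymbolicWrapper(block)

# Run the "neural" forward pass on sample inputs to populate the cache.
for i in range(-20, 21):
    wrapped(i / 10)

library = {
    "x": lambda x: x,
    "x^2": lambda x: x * x,
    "sin(x)": math.sin,
    "tanh(x)": math.tanh,
}
name, a, b = wrapped.distill(library)

# Switch between symbolic and neural forward passes on the same input.
wrapped.symbolic = True
approx = wrapped(0.37)
wrapped.symbolic = False
exact = wrapped(0.37)
print(f"y ~ {a:.3f} * {name} + {b:.3f}")
```

In a real deep learning setting the recording step would typically attach to a module's forward pass and move tensors to the CPU before fitting, and the search over expressions would be delegated to a symbolic regression engine rather than a fixed library.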