[2509.06085] Software Dependencies 2.0: An Empirical Study of Reuse and Integration of Pre-Trained Models in Open-Source Projects
Summary
This paper investigates how pre-trained models (PTMs) are integrated and managed in open-source software projects, and introduces the term Software Dependencies 2.0 for this new class of dependency.
Why It Matters
Understanding how developers reuse and integrate PTMs is crucial as these models become increasingly central to software development. This study sheds light on the potential challenges and best practices in managing these dependencies, which can impact software maintainability and reliability.
Key Takeaways
- PTMs are reshaping software dependencies, necessitating new management strategies.
- The study analyzes 401 GitHub repositories to identify patterns in PTM reuse.
- Developers face challenges in documenting and integrating PTMs effectively.
- The research highlights the importance of understanding PTM interactions within software pipelines.
- Best practices for managing PTMs can enhance software reliability and maintainability.
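To make the idea of a PTM dependency concrete, here is a minimal sketch of how one might list the model identifiers a Python file loads through `from_pretrained` calls. It is an illustration only, not the study's method: the function name `find_ptm_references`, the regular expression, and the sample snippet are assumptions made for this example, and the pattern only catches identifiers written as string literals.

```python
import re

# Hypothetical pattern for illustration: matches calls such as
# AutoModel.from_pretrained("bert-base-uncased") where the model
# identifier is the first argument and is a string literal.
_FROM_PRETRAINED = re.compile(r"""\.from_pretrained\(\s*['"]([^'"]+)['"]""")


def find_ptm_references(source: str) -> list[str]:
    """Return model identifiers found in from_pretrained calls.

    Identifiers are returned in order of first appearance, without duplicates.
    """
    seen: list[str] = []
    for name in _FROM_PRETRAINED.findall(source):
        if name not in seen:
            seen.append(name)
    return seen


# Sample source text (an assumption for this sketch, not taken from the study).
example = '''
from transformers import AutoModel, AutoTokenizer
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
clf = AutoModel.from_pretrained('distilbert-base-uncased', revision="main")
'''

if __name__ == "__main__":
    print(find_ptm_references(example))
```

A scan like this would miss models loaded through variables, configuration files, or other loading APIs, which is one reason PTM dependencies can be harder to inventory than conventional library imports.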
Computer Science > Software Engineering, arXiv:2509.06085 (cs)
Submitted on 7 Sep 2025 (v1), last revised 18 Feb 2026 (this version, v2)
Authors: Jerin Yasmin, Wenxin Jiang, James C. Davis, Yuan Tian
Abstract
Pre-trained models (PTMs) are machine learning models that have been trained in advance, often on large-scale data, and can be reused for new tasks, thereby reducing the need for costly training from scratch. Their widespread adoption introduces a new class of software dependency, which we term Software Dependencies 2.0, extending beyond conventional libraries to learned behaviors embodied in trained models and their associated artifacts. The integration of PTMs as software dependencies in real projects remains unclear, potentially threatening maintainability and reliability of modern software systems that increasingly rely on them.
Objective: In this study, we investigate Software Dependencies 2.0 in open-source software (OSS) projects by examining the reuse of PTMs, with a focus on how developers manage and integrate these models. Specifically, we seek to understand: (1) how OSS projects structure and document their PTM dependencies; (...