[2509.24496] LLM DNA: Tracing Model Evolution via Functional Representations

Summary

The paper 'LLM DNA' traces how large language models (LLMs) descend from one another through fine-tuning, distillation, or adaptation. It defines a low-dimensional mathematical representation of a model's functional behavior and uses it to compare models whose lineage is undocumented.

Why It Matters

As the number of LLMs grows into the millions, the links between them through fine-tuning, distillation, or adaptation are often undocumented, which complicates model management. According to the abstract, existing comparison methods are limited by task specificity, fixed model sets, or strict assumptions about tokenizers or architectures, and this work proposes a framework intended to remove those limits.

Key Takeaways

  • Introduces 'LLM DNA' as a mathematical representation of LLM behavior.
  • Proves that LLM DNA exhibits properties of inheritance and genetic determinism.
  • Develops a scalable, training-free pipeline for DNA extraction from LLMs.
  • Constructs an evolutionary tree of LLMs, revealing undocumented relationships.
  • Reports superior or competitive performance on specific tasks in experiments across 305 LLMs.
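
The tree-building idea in the takeaways can be illustrated with a toy sketch. It assumes each model has already been reduced to a fixed-length vector, and it groups models by single-linkage agglomerative clustering on Euclidean distance. The vectors, the model names, and the `build_tree` helper are all hypothetical, and the clustering method is a generic stand-in rather than the procedure used in the paper.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_tree(dna):
    """Single-linkage agglomerative clustering over named vectors.

    `dna` maps a model name to its vector (hypothetical toy data here).
    Returns a nested tuple in which each pair is one merge step.
    """
    # Each cluster is (subtree, list of member names).
    clusters = [(name, [name]) for name in sorted(dna)]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(
                euclidean(dna[a], dna[b])
                for a in clusters[i][1]
                for b in clusters[j][1]
            )
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        # Drop the two merged clusters and append their union.
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical 2-D vectors; real representations would be higher-dimensional.
toy_dna = {
    "base-a": [0.0, 0.0],
    "base-a-finetuned": [0.1, 0.0],
    "base-b": [5.0, 5.0],
    "base-b-distilled": [5.0, 5.2],
}

tree = build_tree(toy_dna)
print(tree)
```

In this toy data each derived model sits close to its base, so the two families are merged internally before they are joined at the root. Single linkage is only one possible choice; other linkage rules or dedicated phylogenetic methods would give different trees on less clean data.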

Computer Science > Machine Learning
arXiv:2509.24496 (cs)
[Submitted on 29 Sep 2025 (v1), last revised 15 Feb 2026 (this version, v2)]

Title: LLM DNA: Tracing Model Evolution via Functional Representations

Authors: Zhaomin Wu, Haodong Zhao, Ziyang Wang, Jizhou Guo, Qian Wang, Bingsheng He

Abstract: The explosive growth of large language models (LLMs) has created a vast but opaque landscape: millions of models exist, yet their evolutionary relationships through fine-tuning, distillation, or adaptation are often undocumented or unclear, complicating LLM management. Existing methods are limited by task specificity, fixed model sets, or strict assumptions about tokenizers or architectures. Inspired by biological DNA, we address these limitations by mathematically defining LLM DNA as a low-dimensional, bi-Lipschitz representation of functional behavior. We prove that LLM DNA satisfies inheritance and genetic determinism properties and establish the existence of DNA. Building on this theory, we derive a general, scalable, training-free pipeline for DNA extraction. In experiments across 305 LLMs, DNA aligns with prior studies on limited subsets and achieves superior or competitive performance on specific tasks. Beyond these tasks, DNA comparisons uncover previously undocumented relationships among LLMs. We further construct th...
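
For readers unfamiliar with the term, the bi-Lipschitz condition mentioned in the abstract can be written out as follows. This is the standard textbook definition, not a quotation from the paper. The symbols $\mathcal{F}$ (a space of model behaviors), $d_{\mathcal{F}}$ (a distance between behaviors), $\phi$ (the DNA map), and $k$ (the DNA dimension) are notation assumed here for illustration, and the paper's exact formulation may differ.

```latex
\[
\phi : (\mathcal{F}, d_{\mathcal{F}}) \to (\mathbb{R}^k, \|\cdot\|)
\quad\text{is bi-Lipschitz if}\quad
\exists\, 0 < c \le C < \infty :\;
c\, d_{\mathcal{F}}(f, g) \;\le\; \|\phi(f) - \phi(g)\| \;\le\; C\, d_{\mathcal{F}}(f, g)
\quad \forall f, g \in \mathcal{F}.
\]
```

Read informally, the upper bound means that models which behave alike receive nearby representations, and the lower bound means that models which behave differently cannot collapse onto the same point.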
