[2602.18920] DeepInnovator: Triggering the Innovative Capabilities of LLMs
Summary
DeepInnovator is a training framework that enhances the innovative capabilities of Large Language Models (LLMs) for scientific research; models trained with it outperform existing baselines at generating original research ideas.
Why It Matters
As LLMs increasingly contribute to scientific discovery, developing frameworks like DeepInnovator is crucial for enabling these models to autonomously generate significant research ideas. This advancement could accelerate innovation across various fields, making research more efficient and impactful.
Key Takeaways
- DeepInnovator introduces a systematic training framework for LLMs.
- The framework includes an automated data extraction pipeline for structured research knowledge.
- It employs a 'Next Idea Prediction' paradigm to iteratively generate and refine research ideas.
- DeepInnovator-14B shows superior performance compared to untrained baselines and matches leading LLMs.
- The dataset will be open-sourced to support further community work on research-idea generation.
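The iterative predict-evaluate-refine loop behind "Next Idea Prediction" can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`generate_idea`, `evaluate_idea`, `refine_idea`), the scoring heuristic, and the stopping criterion are all hypothetical stand-ins for LLM calls and learned evaluators.

```python
# Hypothetical sketch of a "Next Idea Prediction" loop: propose an idea,
# score it, and refine until it clears a quality threshold. All names and
# heuristics below are illustrative, not the paper's actual API.

def generate_idea(context: list[str]) -> str:
    """Stand-in for an LLM call that proposes the next idea from prior knowledge."""
    return f"idea derived from {len(context)} prior notes"

def evaluate_idea(idea: str) -> float:
    """Stand-in for a novelty/plausibility score in [0, 1]."""
    return min(1.0, 0.2 + 0.15 * len(idea.split()))

def refine_idea(idea: str, score: float) -> str:
    """Stand-in for a refinement step conditioned on evaluator feedback."""
    return idea + " (refined)"

def next_idea_prediction(context: list[str],
                         threshold: float = 0.8,
                         max_rounds: int = 5) -> str:
    """Iteratively predict, evaluate, and refine until the idea is good enough."""
    idea = generate_idea(context)
    for _ in range(max_rounds):
        score = evaluate_idea(idea)
        if score >= threshold:
            break
        idea = refine_idea(idea, score)
    return idea
```

The point of the loop structure is that generation is not one-shot: each round conditions refinement on an explicit evaluation signal, which is what a training paradigm like this would supervise.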
Computer Science > Computation and Language
arXiv:2602.18920 (cs) [Submitted on 21 Feb 2026]
Title: DeepInnovator: Triggering the Innovative Capabilities of LLMs
Authors: Tianyu Fan, Fengji Zhang, Yuxiang Zheng, Bei Chen, Xinyao Niu, Chengen Huang, Junyang Lin, Chao Huang
Abstract: The application of Large Language Models (LLMs) in accelerating scientific discovery has garnered increasing attention, with a key focus on constructing research agents endowed with innovative capability, i.e., the ability to autonomously generate novel and significant research ideas. Existing approaches rely predominantly on sophisticated prompt engineering and lack a systematic training paradigm. To address this, we propose DeepInnovator, a training framework designed to trigger the innovative capability of LLMs. Our approach comprises two core components. (1) "Standing on the shoulders of giants": an automated data extraction pipeline that extracts and organizes structured research knowledge from a vast corpus of unlabeled scientific literature. (2) "Conjectures and refutations": a "Next Idea Prediction" training paradigm, which models the generation of research ideas as an iterative process of continuously predicting, evaluating, and refining plausible and novel next ideas. Both automatic and expert evaluations de...
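The first component, the automated extraction pipeline, turns unlabeled papers into structured research knowledge. A minimal sketch of that idea is below; the record schema (`problem`, `method`, `finding`) and the keyword heuristic are assumptions for illustration, since in the paper this step would presumably be performed by an LLM rather than string matching.

```python
# Hypothetical sketch of structured-knowledge extraction from raw paper text.
# The schema and the keyword heuristic are illustrative stand-ins; the actual
# pipeline in DeepInnovator is not specified here.

from dataclasses import dataclass

@dataclass
class ResearchRecord:
    problem: str   # what question the paper addresses
    method: str    # what approach it proposes
    finding: str   # what result it reports

def extract_record(abstract: str) -> ResearchRecord:
    """Split an abstract into sentences and pick fields by trivial keywords."""
    sentences = [s.strip() for s in abstract.split(".") if s.strip()]
    problem = next((s for s in sentences if "problem" in s.lower()), sentences[0])
    method = next((s for s in sentences if "propose" in s.lower()), "")
    finding = next((s for s in sentences if "show" in s.lower()), "")
    return ResearchRecord(problem, method, finding)
```

Structuring the corpus this way is what lets a downstream trainer treat prior literature as context for predicting the "next idea" rather than as undifferentiated text.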