[2602.23632] MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs

[2602.23632] MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs

arXiv - AI 3 min read

About this article

Abstract page for arXiv paper 2602.23632: MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs

Computer Science > Artificial Intelligence arXiv:2602.23632 (cs) [Submitted on 27 Feb 2026] Title:MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs Authors:Lun Zhan, Feng Xiong, Huanyong Liu, Feng Zhang, Yuhui Yin View a PDF of the paper titled MMKG-RDS: Reasoning Data Synthesis via Deep Mining of Multimodal Knowledge Graphs, by Lun Zhan and 4 other authors View PDF HTML (experimental) Abstract:Synthesizing high-quality training data is crucial for enhancing domain models' reasoning abilities. Existing methods face limitations in long-tail knowledge coverage, effectiveness verification, and interpretability. Knowledge-graph-based approaches still fall short in functionality, granularity, customizability, and evaluation. To address these issues, we propose MMKG-RDS, a flexible framework for reasoning data synthesis that leverages multimodal knowledge graphs. It supports fine-grained knowledge extraction, customizable path sampling, and multidimensional data quality scoring. We validate MMKG-RDS with the MMKG-RDS-Bench dataset, covering five domains, 17 task types, and 14,950 samples. Experimental results show fine-tuning Qwen3 models (0.6B/8B/32B) on a small number of synthesized samples improves reasoning accuracy by 9.2%. The framework also generates distinct data, challenging existing models on tasks involving tables and formulas, useful for complex benchmark construction. The dataset and code are available at this https URL Subjects: Arti...

Originally published on March 02, 2026. Curated by AI News.

Related Articles

Machine Learning

I tried building a memory-first AI… and ended up discovering smaller models can beat larger ones

Dataset Model Acc F1 Δ vs Log Δ vs Static Avg Params Peak Params Steps Infer ms Size Banking77-20 Logistic TF-IDF 92.37% 0.9230 +0.00pp +...

Reddit - Artificial Intelligence · 1 min ·
Llms

[D] Howcome Muon is only being used for Transformers?

Muon has quickly been adopted in LLM training, yet we don't see it being talked about in other contexts. Searches for Muon on ConvNets tu...

Reddit - Machine Learning · 1 min ·
Machine Learning

[P] Run Karpathy's Autoresearch for $0.44 instead of $24 — Open-source parallel evolution pipeline on SageMaker Spot

TL;DR: I built an open-source pipeline that runs Karpathy's autoresearch on SageMaker Spot instances — 25 autonomous ML experiments for $...

Reddit - Machine Learning · 1 min ·
Improving AI models’ ability to explain their predictions
Machine Learning

Improving AI models’ ability to explain their predictions

AI News - General · 9 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime