[2602.14285] FMMD: A multimodal open peer review dataset based on F1000Research
Summary
The paper introduces FMMD, a multimodal open peer review dataset from F1000Research, addressing limitations in current datasets by integrating visual, structural, and reviewer data for enhanced analysis of the peer review process.
Why It Matters
FMMD fills a gap in peer review research: existing datasets are largely text-centric, whereas FMMD pairs reviewer comments with the visual and structural elements of each manuscript. This supports finer-grained analysis of the peer review lifecycle, which matters both for improving automated scholarly review systems and for studying how manuscripts evolve across versions.
Key Takeaways
- FMMD integrates visual and structural data with reviewer comments.
- It addresses the limitations of existing text-centric peer review datasets.
- The dataset supports tasks like multimodal issue detection and comment generation.
- FMMD enhances the understanding of the peer review lifecycle across disciplines.
- It provides a valuable resource for developing automated peer review systems.
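To make the idea of version-specific alignment concrete, here is a minimal Python sketch of how reviewer comments might be grouped by the manuscript version they were written against. The class names, field names, and the `align_comments` helper are all hypothetical illustrations; they are not the FMMD schema, which this page does not describe.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ManuscriptVersion:
    # Hypothetical record: one version of a manuscript plus its visual assets.
    version: int
    text: str
    figure_paths: List[str] = field(default_factory=list)


@dataclass
class ReviewComment:
    # Hypothetical record: a comment tied to the version it was written against.
    reviewer_id: str
    version: int
    text: str


def align_comments(
    versions: List[ManuscriptVersion],
    comments: List[ReviewComment],
) -> Dict[int, List[ReviewComment]]:
    """Group comments by the manuscript version they target."""
    aligned: Dict[int, List[ReviewComment]] = {
        v.version: [] for v in sorted(versions, key=lambda v: v.version)
    }
    for comment in comments:
        if comment.version not in aligned:
            # A comment pointing at an unknown version signals a data problem.
            raise ValueError(f"comment targets unknown version {comment.version}")
        aligned[comment.version].append(comment)
    return aligned
```

Keying comments by version number keeps each review attached to the exact text and figures the reviewer saw, which is the property the abstract says existing datasets lack.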
Computer Science > Digital Libraries
arXiv:2602.14285 (cs)
[Submitted on 15 Feb 2026]
Title: FMMD: A multimodal open peer review dataset based on F1000Research
Authors: Zhenzhen Zhuang, Yuqing Fu, Jing Zhu, Zhangping Zhou, Jialiang Lin
Abstract: Automated scholarly paper review (ASPR) has entered the coexistence phase with traditional peer review, where artificial intelligence (AI) systems are increasingly incorporated into real-world manuscript evaluation. In parallel, research on automated and AI-assisted peer review has proliferated. Despite this momentum, empirical progress remains constrained by several critical limitations in existing datasets. While reviewers routinely evaluate figures, tables, and complex layouts to assess scientific claims, most existing datasets remain overwhelmingly text-centric. This bias is reinforced by a narrow focus on data from computer science venues. Furthermore, these datasets lack precise alignment between reviewer comments and specific manuscript versions, obscuring the iterative relationship between peer review and manuscript evolution. In response, we introduce FMMD, a multimodal and multidisciplinary open peer review dataset curated from F1000Research. The dataset bridges the current gap by integrating manuscript-level visual and structural data with version-specific rev...