[2603.00051] LitBench: A Graph-Centric Large Language Model Benchmarking Tool For Literature Tasks
Computer Science > Digital Libraries
arXiv:2603.00051 (cs)
[Submitted on 10 Feb 2026]

Title: LitBench: A Graph-Centric Large Language Model Benchmarking Tool For Literature Tasks
Authors: Andreas Varvarigos, Ali Maatouk, Jiasheng Zhang, Ngoc Bui, Jialin Chen, Leandros Tassiulas, Rex Ying

Abstract: While large language models (LLMs) have become the de facto framework for literature-related tasks, they still struggle to function as domain-specific literature agents due to their inability to connect pieces of knowledge and reason across domain-specific contexts, terminologies, and nomenclatures. This challenge underscores the need for a tool that facilitates such domain-specific adaptation and enables rigorous benchmarking across literature tasks. To that end, we introduce LitBench, a benchmarking tool designed to enable the development and evaluation of domain-specific LLMs tailored to literature-related tasks. At its core, LitBench uses a data curation process that generates domain-specific literature sub-graphs and constructs training and evaluation datasets based on the textual attributes of the resulting nodes and edges. The tool is designed for flexibility, supporting the curation of literature graphs across any domain chosen by the user, whether high-level fields or specialized interdisciplinary areas. In ...
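The curation step described in the abstract (filter a literature graph down to a domain-specific sub-graph, then derive training/evaluation examples from the textual attributes of its nodes and edges) can be sketched in a few lines. This is a minimal, hypothetical illustration using plain dictionaries; the paper does not expose these function names or data layouts, and the real LitBench pipeline is far more elaborate.

```python
# Hypothetical sketch of a LitBench-style curation flow.
# Papers are nodes with textual attributes; citations are directed edges.
# All names here (curate_subgraph, build_examples) are illustrative
# assumptions, not the actual LitBench API.

papers = {
    "p1": {"title": "Graph Benchmarks for LLMs"},
    "p2": {"title": "LLM Agents in Chemistry"},
    "p3": {"title": "Sorting Networks"},  # outside the chosen domain
}
citations = [("p1", "p2"), ("p1", "p3")]


def curate_subgraph(papers, citations, keyword):
    """Keep papers whose title mentions the domain keyword, plus
    citation edges between the surviving papers."""
    nodes = {pid for pid, attrs in papers.items()
             if keyword.lower() in attrs["title"].lower()}
    edges = [(u, v) for u, v in citations if u in nodes and v in nodes]
    return nodes, edges


def build_examples(papers, edges):
    """Turn edge structure plus node text into (prompt, answer) pairs,
    e.g. a citation-link question grounded in paper titles."""
    return [
        {
            "prompt": (f"Does {papers[u]['title']!r} "
                       f"cite {papers[v]['title']!r}?"),
            "answer": "yes",
        }
        for u, v in edges
    ]


nodes, edges = curate_subgraph(papers, citations, "LLM")
examples = build_examples(papers, edges)
print(sorted(nodes))  # ['p1', 'p2'] -- p3 is pruned as off-domain
print(len(examples))  # 1 -- only the in-domain citation edge survives
```

The same skeleton generalizes: swapping the keyword filter for any domain predicate yields a different sub-graph, and richer node/edge attributes (abstracts, venues, author lists) support more task types than link prediction.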