[2602.01701] Beyond Single-Modal Analytics: A Framework for Integrating Heterogeneous LLM-Based Query Systems for Multi-Modal Data
Computer Science > Databases
arXiv:2602.01701 (cs)
[Submitted on 2 Feb 2026 (v1), last revised 2 Mar 2026 (this version, v2)]
Title: Beyond Single-Modal Analytics: A Framework for Integrating Heterogeneous LLM-Based Query Systems for Multi-Modal Data
Authors: Ruyu Li, Tinghui Zhang, Haodi Ma, Daisy Zhe Wang, Yifan Wang
Abstract: With the growing use of multi-modal data, semantic querying is increasingly in demand in data management systems as an important way to access and analyze such data. Because multi-modal data (text, image, video, etc.) is unstructured, most of its information lies in its semantics, which traditional database queries such as SQL cannot access. Given the power of Large Language Models (LLMs) in understanding semantics and processing natural language, several LLM-based semantic query systems have been proposed in recent years to support semantic querying over unstructured data. However, this rapid growth has produced a fragmented ecosystem. Applications face significant integration challenges due to (1) disparate APIs of different semantic query systems and (2) a fundamental trade-off between specialization and generality. Many semantic query systems are highly specialized, offering state-of-the-art performance within a single m...