[2603.24326] Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual Processing
Abstract page for arXiv paper 2603.24326: Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual Processing
Text understanding and language tasks
Abstract page for arXiv paper 2601.13508: Autonomous Computational Catalysis Research via Agentic Systems
Abstract page for arXiv paper 2510.20847: Integrated representational signatures strengthen specificity in brains and models
This article presents PrinMix, a new SVD-based framework for enhancing delta compression in large language models (LLMs), addressing stor...
This paper introduces a novel approach using the Cramér-von Mises statistic to create incentive mechanisms that promote truthful data sha...
The paper presents SAFER, a two-stage risk control framework for large language models (LLMs) that enhances output trustworthiness in ris...
This article explores how decoder-only language models engage in internal planning, focusing on their ability to organize computations fo...
This paper presents a novel approach to prevent negative transfer in transfer learning by integrating residual features from pretrained m...
This article presents Robust Multi-Objective Decoding (RMOD), an innovative algorithm designed to enhance the performance of Large Langua...
The paper presents LO-BCQ, a novel block clustered quantization method for 4-bit LLM inference, achieving less than 1% accuracy loss whil...
This paper presents Autoregressive Noisy Filtration Modeling (ANFM), a new framework for fast graph generation that balances quality and ...
The paper presents VIPA, a novel framework for Referring Image Segmentation that enhances attention mechanisms by leveraging informative ...
This paper explores hallucinations in small-sized LLMs through a geometric lens, demonstrating that genuine responses c...
The paper evaluates the impact of reasoning-oriented large language models on machine translation, revealing that explicit reasoning ofte...
Orcheo is an open-source platform designed to streamline conversational search by offering a modular architecture, production-ready infra...
This article presents a novel Distributed Quantum Gaussian Process (DQGP) method for multi-agent systems, enhancing modeling capabilities...
The paper introduces Drift-Diffusion Matching, a framework for training recurrent neural networks (RNNs) to model complex stochastic dyna...
This paper introduces a multi-dimensional persistent sheaf Laplacian (MPSL) framework for image analysis, enhancing dimensionality reduct...
The paper presents XTF, an explainable token-level noise filtering framework designed to enhance the fine-tuning of Large Language Models...
This study evaluates the effectiveness of pre-trained embeddings in machine-guided protein design, focusing on predicting AAV vector viab...
This article presents the BETA-labeling framework for constructing a Bangla IR dataset, addressing challenges in low-resource languages a...
GOT-JEPA introduces a novel framework for generic object tracking that enhances model adaptation and occlusion handling, improving robust...
The paper presents CoCoDiff, a novel framework for fine-grained style transfer in images, emphasizing semantic correspondence and achievi...