[2603.24326] Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual Processing
Abstract page for arXiv paper 2603.24326: Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual Processing
Text understanding and language tasks
Abstract page for arXiv paper 2601.13508: Autonomous Computational Catalysis Research via Agentic Systems
Abstract page for arXiv paper 2510.20847: Integrated representational signatures strengthen specificity in brains and models
This paper presents a novel framework for predicting low-altitude network coverage using disentangled representation learning, addressing...
ExtractBench introduces a benchmark and evaluation framework for extracting structured data from unstructured documents like PDFs, addres...
The paper presents dnaHNet, a novel tokenizer-free autoregressive model designed for genomic sequence learning, achieving significant eff...
This paper presents methods for distilling privileged information in language models, focusing on improving performance in multi-turn env...
This article presents a novel graph transformer model, incorporating cardinality-preserving attention channels, to enhance molecular prop...
This paper examines the relationship between behavioral and hidden-state semantic geometry in large language models (LLMs) through psycho...
This article presents ChemRAG-Bench, a benchmark for evaluating retrieval-augmented generation (RAG) in chemistry, demonstrating signific...
The paper introduces memory recurrent units (MRUs), a new family of RNNs that combine persistent memory with parallelizable computations,...
ModSSC is an open-source Python framework designed for semi-supervised classification, enhancing reproducibility and experimentation acro...
RapidPen is a novel automated penetration testing framework that utilizes large language models to autonomously exploit vulnerabilities, ...
This article presents One-Shot Dynamic Thresholding (OSDT) for diffusion language models, enhancing decoding efficiency and accuracy by c...
The paper explores how algorithmic primitives and compositional geometry can enhance reasoning capabilities in large language models (LLM...
The paper presents RACE Attention, a novel linear-time attention mechanism designed for long-sequence training, significantly improving e...
This paper investigates the optimal placement of PDE diffusion layers in transformer architectures, revealing that their insertion order ...
OpenTSLM introduces a new family of Time Series Language Models designed to enhance reasoning over multivariate medical data, outperformi...
The paper introduces ScholarGym, an evaluation environment designed to benchmark large language models in the information-gathering phase...
The paper presents Aeon, a Neuro-Symbolic Cognitive Operating System designed to enhance memory management in Long-Horizon LLM agents, ad...
The paper presents a novel approach to language modeling by introducing token order prediction (TOP) as an improvement over traditional n...
The paper introduces Zono-Conformal Prediction, a method for uncertainty quantification in regression and classification tasks that impro...
The paper presents HCLA, a human-centered multi-agent system designed for detecting anomalies in digital asset transactions, enhancing in...