Natural Language Processing

Text understanding and language tasks

Top This Week

Startup Battlefield 200 applications open until May 27 | TechCrunch
NLP

Nominate your startup, or one you know, and apply for a chance at VC access, TechCrunch coverage, and $100K for Startup Battlefield 200.

TechCrunch - AI · 4 min ·
[2603.24326] Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual Processing
LLMs

arXiv - AI · 4 min ·
[2601.13508] Autonomous Computational Catalysis Research via Agentic Systems
NLP

arXiv - AI · 3 min ·

All Content

[2602.14134] DenseMLLM: Standard Multimodal LLMs are Intrinsic Dense Predictors
LLMs

The paper introduces DenseMLLM, a multimodal large language model designed to perform dense predictions without the need for complex, tas...

arXiv - AI · 3 min ·
[2602.13928] voice2mode: Phonation Mode Classification in Singing using Self-Supervised Speech Models
LLMs

The paper presents voice2mode, a method for classifying four singing phonation modes using self-supervised speech models, demonstrating s...

arXiv - Machine Learning · 3 min ·
[2602.14089] TabTracer: Monte Carlo Tree Search for Complex Table Reasoning with Large Language Models
LLMs

TabTracer introduces a novel Monte Carlo Tree Search framework for enhancing table reasoning in large language models, improving accuracy...

arXiv - AI · 4 min ·
[2602.14073] Annotation-Efficient Vision-Language Model Adaptation to the Polish Language Using the LLaVA Framework
LLMs

This article presents a methodology for adapting vision-language models to the Polish language using the LLaVA framework, demonstrating s...

arXiv - AI · 4 min ·
[2602.14043] Beyond Static Snapshots: Dynamic Modeling and Forecasting of Group-Level Value Evolution with Large Language Models
LLMs

This article presents a novel framework for dynamic modeling and forecasting of group-level value evolution using large language models (...

arXiv - AI · 4 min ·
[2602.14009] Named Entity Recognition for Payment Data Using NLP
Machine Learning

This paper explores Named Entity Recognition (NER) techniques for payment data, presenting advanced models like PaymentBERT that enhance ...

arXiv - AI · 3 min ·
[2602.14002] The Sufficiency-Conciseness Trade-off in LLM Self-Explanation from an Information Bottleneck Perspective
LLMs

This paper explores the trade-off between sufficiency and conciseness in self-explanations provided by large language models (LLMs), emph...

arXiv - AI · 3 min ·
[2602.13543] LiveNewsBench: Evaluating LLM Web Search Capabilities with Freshly Curated News
LLMs

The paper introduces LiveNewsBench, a benchmark for evaluating the web search capabilities of Large Language Models (LLMs) using freshly ...

arXiv - Machine Learning · 4 min ·
[2602.13914] Common Knowledge Always, Forever
Machine Learning

The paper discusses a polytopological PDL framework for expressing common knowledge and its implications in epistemic logic, highlighting...

arXiv - AI · 3 min ·
[2602.13515] SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning
Machine Learning

The paper presents SpargeAttention2, a novel trainable sparse attention method that enhances the efficiency of diffusion models by combin...

arXiv - Machine Learning · 4 min ·
[2602.13513] Learning Gradient Flow: Using Equation Discovery to Accelerate Engineering Optimization
Machine Learning

This paper explores data-driven equation discovery to enhance optimization processes in engineering, introducing the Learned Gradient Flo...

arXiv - Machine Learning · 4 min ·
[2602.13891] GSRM: Generative Speech Reward Model for Speech RLHF
LLMs

The paper introduces the Generative Speech Reward Model (GSRM), a novel approach to evaluating speech naturalness in AI-generated audio, ...

arXiv - AI · 4 min ·
[2602.13510] Stochastic variance reduced extragradient methods for solving hierarchical variational inequalities
NLP

This paper presents stochastic variance reduced extragradient methods for solving hierarchical variational inequalities, proving converge...

arXiv - Machine Learning · 3 min ·
[2602.13476] AsyncVLA: An Asynchronous VLA for Fast and Robust Navigation on the Edge
LLMs

AsyncVLA introduces an asynchronous control framework for robotic navigation, enhancing real-time performance by decoupling semantic reas...

arXiv - Machine Learning · 3 min ·
[2602.13812] DTBench: A Synthetic Benchmark for Document-to-Table Extraction
LLMs

DTBench introduces a synthetic benchmark for evaluating document-to-table extraction capabilities, addressing limitations in existing ben...

arXiv - AI · 4 min ·
[2602.13758] OmniScience: A Large-scale Multi-modal Dataset for Scientific Image Understanding
LLMs

The paper introduces OmniScience, a large-scale multi-modal dataset designed to enhance scientific image understanding in AI models, addr...

arXiv - AI · 4 min ·
[2602.13704] Pailitao-VL: Unified Embedding and Reranker for Real-Time Multi-Modal Industrial Search
NLP

The paper presents Pailitao-VL, a multi-modal retrieval system designed for real-time industrial search, addressing key challenges in ret...

arXiv - AI · 4 min ·
[2602.13671] MAS-on-the-Fly: Dynamic Adaptation of LLM-based Multi-Agent Systems at Test Time
LLMs

The paper presents MASFly, a novel framework for dynamic adaptation of LLM-based multi-agent systems at test time, enhancing task perform...

arXiv - AI · 3 min ·
[2602.13650] KorMedMCQA-V: A Multimodal Benchmark for Evaluating Vision-Language Models on the Korean Medical Licensing Examination
LLMs

The article presents KorMedMCQA-V, a benchmark dataset for evaluating vision-language models on the Korean Medical Licensing Examination,...

arXiv - AI · 4 min ·
[2602.13647] PT-RAG: Structure-Fidelity Retrieval-Augmented Generation for Academic Papers
NLP

PT-RAG introduces a novel framework for retrieval-augmented generation that maintains the hierarchical structure of academic papers, impr...

arXiv - AI · 4 min ·