Natural Language Processing

Text understanding and language tasks

Top This Week

[2603.24326] Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual Processing
LLMs

arXiv - AI · 4 min ·
[2601.13508] Autonomous Computational Catalysis Research via Agentic Systems
NLP

arXiv - AI · 3 min ·
[2510.20847] Integrated representational signatures strengthen specificity in brains and models
Machine Learning

arXiv - AI · 4 min ·

All Content

[2602.14615] VariViT: A Vision Transformer for Variable Image Sizes
Machine Learning

The paper introduces VariViT, a Vision Transformer designed to effectively handle variable image sizes, improving feature representation ...

arXiv - AI · 4 min ·
[2602.14612] LongAudio-RAG: Event-Grounded Question Answering over Multi-Hour Long Audio
LLMs

The paper presents LongAudio-RAG, a framework for event-grounded question answering over lengthy audio recordings, enhancing accuracy thr...

arXiv - Machine Learning · 4 min ·
[2602.14517] Beyond Translation: Evaluating Mathematical Reasoning Capabilities of LLMs in Sinhala and Tamil
LLMs

This article evaluates the mathematical reasoning capabilities of large language models (LLMs) in Sinhala and Tamil, revealing significan...

arXiv - Machine Learning · 4 min ·
[2602.14498] Uncertainty-Aware Vision-Language Segmentation for Medical Imaging
Machine Learning

This paper presents a novel uncertainty-aware multimodal segmentation framework that integrates radiological images and clinical text to ...

arXiv - Machine Learning · 4 min ·
[2602.14406] TruthStance: An Annotated Dataset of Conversations on Truth Social
Data Science

TruthStance introduces a comprehensive dataset of conversations from Truth Social, focusing on argument mining and stance detection, with...

arXiv - AI · 3 min ·
[2602.14374] Differentially Private Retrieval-Augmented Generation
LLMs

The paper presents DP-KSA, a novel algorithm that integrates differential privacy into retrieval-augmented generation (RAG) systems, addr...

arXiv - AI · 4 min ·
[2602.14367] InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem
LLMs

The paper introduces InnoEval, a framework for evaluating research ideas using knowledge-grounded, multi-perspective reasoning, addressin...

arXiv - AI · 4 min ·
[2602.14345] AXE: An Agentic eXploit Engine for Confirming Zero-Day Vulnerability Reports
NLP

The paper presents AXE, an innovative framework for validating zero-day vulnerabilities using minimal metadata, achieving a significant i...

arXiv - AI · 4 min ·
[2602.14358] High Precision Audience Expansion via Extreme Classification in a Two-Sided Marketplace
Machine Learning

This paper discusses a novel approach to audience expansion in a two-sided marketplace, focusing on high precision retrieval methods for ...

arXiv - AI · 3 min ·
[2602.14265] STATe-of-Thoughts: Structured Action Templates for Tree-of-Thoughts
Machine Learning

The paper presents STATe-of-Thoughts, a new method for improving output diversity and interpretability in inference-time compute methods,...

arXiv - Machine Learning · 4 min ·
[2602.14237] AbracADDbra: Touch-Guided Object Addition by Decoupling Placement and Editing Subtasks
Machine Learning

The paper presents AbracADDbra, a framework that enhances object addition in computer vision by decoupling placement and editing tasks th...

arXiv - AI · 3 min ·
[2602.14216] Reasoning Language Models for complex assessments tasks: Evaluating parental cooperation from child protection case reports
LLMs

This article explores the effectiveness of reasoning language models (RLMs) in assessing parental cooperation during child protection int...

arXiv - AI · 4 min ·
[2602.14172] Investigation for Relative Voice Impression Estimation
AI Startups

This article explores Relative Voice Impression Estimation (RIE), focusing on how different speech modeling approaches affect listener pe...

arXiv - Machine Learning · 3 min ·
[2602.14077] GTS: Inference-Time Scaling of Latent Reasoning with a Learnable Gaussian Thought Sampler
Machine Learning

The paper introduces the Gaussian Thought Sampler (GTS), a novel approach to inference-time scaling in latent reasoning models, enhancing...

arXiv - Machine Learning · 3 min ·
[2602.14189] Knowing When Not to Answer: Abstention-Aware Scientific Reasoning
LLMs

The paper discusses an abstention-aware framework for scientific reasoning, emphasizing the importance of knowing when to abstain from an...

arXiv - AI · 4 min ·
[2602.14039] Geometry-Preserving Aggregation for Mixture-of-Experts Embedding Models
Machine Learning

The paper presents Spherical Barycentric Aggregation (SBA), a new method for aggregating outputs in Mixture-of-Experts (MoE) embedding mo...

arXiv - Machine Learning · 3 min ·
[2602.14188] GPT-5 vs Other LLMs in Long Short-Context Performance
LLMs

This paper evaluates the performance of GPT-5 and other LLMs on long short-context tasks, revealing significant gaps between theoretical ...

arXiv - AI · 4 min ·
[2602.14030] MC$^2$Mark: Distortion-Free Multi-Bit Watermarking for Long Messages
LLMs

MC$^2$Mark introduces a novel watermarking framework that ensures reliable embedding of long messages in generated text while maintaining...

arXiv - Machine Learning · 3 min ·
[2602.14158] A Multi-Agent Framework for Medical AI: Leveraging Fine-Tuned GPT, LLaMA, and DeepSeek R1 for Evidence-Based and Bias-Aware Clinical Query Processing
LLMs

This article presents a multi-agent framework for medical AI that enhances clinical query processing by leveraging fine-tuned language mo...

arXiv - AI · 4 min ·
[2602.14134] DenseMLLM: Standard Multimodal LLMs are Intrinsic Dense Predictors
LLMs

The paper introduces DenseMLLM, a multimodal large language model designed to perform dense predictions without the need for complex, tas...

arXiv - AI · 3 min ·