Natural Language Processing

Text understanding and language tasks

Top This Week

[2603.17205] OPERA: Online Data Pruning for Efficient Retrieval Model Adaptation
Machine Learning

arXiv - Machine Learning · 4 min ·
[2512.02418] Leveraging Large Language Models to Bridge Cross-Domain Transparency in Stablecoins
LLMs

arXiv - Machine Learning · 4 min ·
[2510.14377] PluriHopRAG: Exhaustive, Recall-Sensitive QA Through Corpus-Specific Document Structure Learning
NLP

arXiv - Machine Learning · 4 min ·

All Content

[2510.05725] Improving Discrete Diffusion Unmasking Policies Beyond Explicit Reference Policies
LLMs

This article presents a novel approach to improving masked diffusion models (MDMs) for language modeling by introducing a learned schedul...

arXiv - Machine Learning · 4 min ·
[2509.21936] Statistical Advantage of Softmax Attention: Insights from Single-Location Regression
LLMs

This article explores the statistical advantages of softmax attention mechanisms in large language models, particularly in single-locatio...

arXiv - Machine Learning · 4 min ·
[2602.23286] SPARTA: Scalable and Principled Benchmark of Tree-Structured Multi-hop QA over Text and Tables
Machine Learning

The paper presents SPARTA, a novel framework for generating scalable benchmarks for tree-structured multi-hop question answering (QA) ove...

arXiv - AI · 4 min ·
[2509.21013] Predicting LLM Reasoning Performance with Small Proxy Model
LLMs

This article presents rBridge, a small proxy model that predicts reasoning performance in large language models (LLMs), demonstrating sig...

arXiv - Machine Learning · 4 min ·
[2602.23228] MovieTeller: Tool-augmented Movie Synopsis with ID Consistent Progressive Abstraction
LLMs

The paper presents MovieTeller, a novel framework for generating movie synopses using tool-augmented progressive abstraction to enhance c...

arXiv - AI · 4 min ·
[2602.23225] Why Diffusion Language Models Struggle with Truly Parallel (Non-Autoregressive) Decoding?
LLMs

This paper investigates why Diffusion Language Models (DLMs) often default to autoregressive decoding instead of utilizing their potentia...

arXiv - AI · 4 min ·
[2506.14261] RL-Obfuscation: Can Language Models Learn to Evade Latent-Space Monitors?
LLMs

This article explores RL-Obfuscation, a method for training language models to evade latent-space monitors that detect undesirable behavi...

arXiv - Machine Learning · 4 min ·
[2602.23071] Quantity Convergence, Quality Divergence: Disentangling Fluency and Accuracy in L2 Mandarin Prosody
NLP

This study examines the relationship between fluency and accuracy in L2 Mandarin prosody, revealing that while learners may achieve quant...

arXiv - AI · 3 min ·
[2602.23070] Make It Hard to Hear, Easy to Learn: Long-Form Bengali ASR and Speaker Diarization via Extreme Augmentation and Perfect Alignment
AI Safety

This paper presents a novel approach to long-form Bengali Automatic Speech Recognition (ASR) and speaker diarization, introducing a compr...

arXiv - AI · 4 min ·
[2602.23057] Affine-Scaled Attention: Towards Flexible and Stable Transformer Attention
Machine Learning

The paper introduces Affine-Scaled Attention, a novel approach to Transformer attention that enhances flexibility and stability by modify...

arXiv - AI · 4 min ·
[2502.06051] Towards a Sharp Analysis of Offline Policy Learning for $f$-Divergence-Regularized Contextual Bandits
NLP

This paper presents a detailed analysis of offline policy learning in contextual bandits, focusing on $f$-divergence regularization and i...

arXiv - Machine Learning · 4 min ·
[2602.22967] Discovery of Interpretable Physical Laws in Materials via Language-Model-Guided Symbolic Regression
LLMs

This paper presents a novel framework that utilizes language models to guide symbolic regression in discovering interpretable physical la...

arXiv - AI · 3 min ·
[2602.22935] A Holistic Framework for Robust Bangla ASR and Speaker Diarization with Optimized VAD and CTC Alignment
Machine Learning

This paper presents a robust framework for Bangla Automatic Speech Recognition (ASR) and Speaker Diarization, addressing challenges in pr...

arXiv - AI · 3 min ·
[2602.22873] Learning Tangent Bundles and Characteristic Classes with Autoencoder Atlases
NLP

This paper introduces a framework connecting multi-chart autoencoders with vector bundles and characteristic classes, enhancing manifold ...

arXiv - AI · 3 min ·
[2602.22871] Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching
LLMs

The paper presents a novel framework called Stitching Noisy Diffusion Thoughts, which enhances reasoning in large language models by comb...

arXiv - AI · 4 min ·
[2602.23312] Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction
LLMs

This paper evaluates the effectiveness of small language models (SLMs) in leader-follower interactions, comparing zero-shot and one-shot ...

arXiv - Machine Learning · 4 min ·
[2602.23295] ManifoldGD: Training-Free Hierarchical Manifold Guidance for Diffusion-Based Dataset Distillation
Machine Learning

The paper presents ManifoldGD, a training-free framework for dataset distillation using hierarchical manifold guidance, improving efficie...

arXiv - Machine Learning · 4 min ·
[2602.22828] TCM-DiffRAG: Personalized Syndrome Differentiation Reasoning Method for Traditional Chinese Medicine based on Knowledge Graph and Chain of Thought
LLMs

The article presents TCM-DiffRAG, a novel reasoning framework for Traditional Chinese Medicine (TCM) that enhances diagnosis through know...

arXiv - AI · 4 min ·
[2602.23234] Scaling Search Relevance: Augmenting App Store Ranking with LLM-Generated Judgments
LLMs

This article discusses a novel approach to enhancing app store ranking by integrating LLM-generated textual relevance labels with behavio...

arXiv - Machine Learning · 4 min ·
[2602.23197] Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models
LLMs

This paper explores the impact of fine-tuning on in-context learning in linear attention models, revealing conditions that can enhance or...

arXiv - Machine Learning · 3 min ·