Natural Language Processing

Text understanding and language tasks

Top This Week

LLMs

[R] 94.42% on BANKING77 Official Test Split with Lightweight Embedding + Example Reranking (strict full-train protocol)

BANKING77 (77 fine-grained banking intents) is a well-established but increasingly saturated intent classification benchmark. did this wh...

Reddit - Machine Learning · 1 min ·
LLMs

94.42% on BANKING77 Official Test Split — New Strong 2nd Place with Lightweight Embedding + Rerank (no 7B LLM)

94.42% accuracy on BANKING77 official test split. BANKING77 is deceptively hard: 77 fine-grained banking intents, noisy real-world quer...

Reddit - Artificial Intelligence · 1 min ·
NLP

Built a Hybrid NAS tool for RNN architectures (HyNAS-R) – Looking for feedback for my final year evaluation [R]

Hi everyone, I'm currently in the evaluation phase of my Final Year Project and am looking for feedback on the system I've built. It's ca...

Reddit - Machine Learning · 1 min ·

All Content

[2508.08275] MLLM-CTBench: A Benchmark for Continual Instruction Tuning with Reasoning Process Diagnosis
LLMs

The paper presents MLLM-CTBench, a benchmark for continual instruction tuning of multimodal large language models, addressing the need fo...

arXiv - AI · 4 min ·
[2503.22968] Redefining Evaluation Standards: A Unified Framework for Evaluating the Korean Capabilities of Language Models
LLMs

This article introduces the Haerae Evaluation Toolkit (HRET), a unified framework for evaluating the capabilities of Korean language mode...

arXiv - AI · 4 min ·
[2504.20101] PlanetServe: A Decentralized, Scalable, and Privacy-Preserving Overlay for Democratizing Large Language Model Serving
LLMs

The paper presents PlanetServe, a decentralized overlay for scalable and privacy-preserving serving of large language models (LLMs), addr...

arXiv - AI · 4 min ·
[2502.07971] Hierarchical Retrieval at Scale: Bridging Transparency and Efficiency
NLP

The paper presents Retreever, a tree-based hierarchical retrieval method that enhances efficiency and transparency in information retriev...

arXiv - Machine Learning · 4 min ·
[2406.04112] Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Machine Learning

This paper explores compressible dynamics in deep overparameterized low-rank learning, presenting methods to enhance training efficiency ...

arXiv - Machine Learning · 4 min ·
[2404.08567] CATP: Cross-Attention Token Pruning for Accuracy Preserved Multimodal Model Inference
Machine Learning

The paper introduces Cross-Attention Token Pruning (CATP), a method designed to enhance the accuracy of multimodal models by effectively ...

arXiv - AI · 3 min ·
[2602.11908] When Should LLMs Be Less Specific? Selective Abstraction for Reliable Long-Form Text Generation
LLMs

This paper introduces Selective Abstraction (SA), a framework for improving the reliability of long-form text generated by LLMs by select...

arXiv - Machine Learning · 4 min ·
[2601.10485] Panning for Gold: Expanding Domain-Specific Knowledge Graphs with General Knowledge
NLP

The paper proposes a novel approach for enhancing domain-specific knowledge graphs (DKGs) by integrating general knowledge graphs (GKGs) ...

arXiv - AI · 4 min ·
[2601.00004] Finetuning Large Language Models for Automated Depression Screening in Nigerian Pidgin English: GENSCORE Pilot Study
LLMs

This study explores the use of fine-tuned large language models for automated depression screening in Nigerian Pidgin English, addressing...

arXiv - Machine Learning · 4 min ·
[2510.19698] RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models
LLMs

The paper presents RLIE, a framework that integrates large language models (LLMs) with probabilistic rule learning to enhance rule genera...

arXiv - AI · 4 min ·
[2510.07978] VoiceAgentBench: Are Voice Assistants ready for agentic tasks?
LLMs

The paper introduces VoiceAgentBench, a benchmark for evaluating voice assistants' capabilities in agentic tasks, highlighting their perf...

arXiv - Machine Learning · 4 min ·
[2505.14381] SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation
LLMs

The paper presents SCAN, a novel approach for Semantic Document Layout Analysis that enhances Retrieval-Augmented Generation (RAG) system...

arXiv - AI · 4 min ·
[2410.16882] SaVe-TAG: LLM-based Interpolation for Long-Tailed Text-Attributed Graphs
LLMs

The paper presents SaVe-TAG, a novel framework that utilizes Large Language Models for semantic-aware interpolation in long-tailed text-a...

arXiv - Machine Learning · 4 min ·
[2602.13194] Semantic Chunking and the Entropy of Natural Language
LLMs

This article presents a statistical model for semantic chunking in natural language, revealing insights into the entropy of English and i...

arXiv - AI · 4 min ·
[2602.13191] CoPE-VideoLM: Codec Primitives For Efficient Video Language Models
LLMs

The paper presents CoPE-VideoLM, a novel approach that utilizes codec primitives to enhance the efficiency of video language models, sign...

arXiv - AI · 4 min ·
[2602.13165] Asynchronous Verified Semantic Caching for Tiered LLM Architectures
LLMs

The paper introduces Krites, an asynchronous caching policy for large language models (LLMs) that enhances semantic caching efficiency wh...

arXiv - AI · 4 min ·
[2602.13047] Can we trust AI to detect healthy multilingual English speakers among the cognitively impaired cohort in the UK? An investigation using real-world conversational speech
Machine Learning

This study investigates the reliability of AI in detecting cognitive impairment among multilingual English speakers in the UK, revealing ...

arXiv - AI · 4 min ·
[2602.12996] Know More, Know Clearer: A Meta-Cognitive Framework for Knowledge Augmentation in Large Language Models
LLMs

This article presents a novel meta-cognitive framework aimed at enhancing knowledge augmentation in Large Language Models (LLMs), address...

arXiv - AI · 3 min ·
[2602.12924] Never say never: Exploring the effects of available knowledge on agent persuasiveness in controlled physiotherapy motivation dialogues
Robotics

This article examines how the availability of knowledge influences the persuasiveness of generative social agents (GSAs) in physiotherapy...

arXiv - AI · 4 min ·
[2602.12892] RADAR: Revealing Asymmetric Development of Abilities in MLLM Pre-training
LLMs

The paper presents RADAR, a novel evaluation framework for Multi-modal Large Language Models (MLLMs) that addresses performance bottlenec...

arXiv - AI · 4 min ·