Natural Language Processing

Text understanding and language tasks

Top This Week

LLMs

[R] 94.42% on BANKING77 Official Test Split with Lightweight Embedding + Example Reranking (strict full-train protocol)

BANKING77 (77 fine-grained banking intents) is a well-established but increasingly saturated intent classification benchmark. did this wh...

Reddit - Machine Learning · 1 min ·
LLMs

94.42% on BANKING77 Official Test Split — New Strong 2nd Place with Lightweight Embedding + Rerank (no 7B LLM)

94.42% accuracy on BANKING77 official test split. BANKING77 is deceptively hard: 77 fine-grained banking intents, noisy real-world quer...

Reddit - Artificial Intelligence · 1 min ·
NLP

Built a Hybrid NAS tool for RNN architectures (HyNAS-R) – Looking for feedback for my final year evaluation [R]

Hi everyone, I'm currently in the evaluation phase of my Final Year Project and am looking for feedback on the system I've built. It's ca...

Reddit - Machine Learning · 1 min ·

All Content

[2508.08275] MLLM-CTBench: A Benchmark for Continual Instruction Tuning with Reasoning Process Diagnosis
LLMs

The paper presents MLLM-CTBench, a benchmark for continual instruction tuning of multimodal large language models, addressing the need fo...

arXiv - AI · 4 min ·
[2503.22968] Redefining Evaluation Standards: A Unified Framework for Evaluating the Korean Capabilities of Language Models
LLMs

This article introduces the Haerae Evaluation Toolkit (HRET), a unified framework for evaluating the capabilities of Korean language mode...

arXiv - AI · 4 min ·
[2504.20101] PlanetServe: A Decentralized, Scalable, and Privacy-Preserving Overlay for Democratizing Large Language Model Serving
LLMs

The paper presents PlanetServe, a decentralized overlay for scalable and privacy-preserving serving of large language models (LLMs), addr...

arXiv - AI · 4 min ·
[2502.07971] Hierarchical Retrieval at Scale: Bridging Transparency and Efficiency
NLP

The paper presents Retreever, a tree-based hierarchical retrieval method that enhances efficiency and transparency in information retriev...

arXiv - Machine Learning · 4 min ·
[2406.04112] Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
Machine Learning

This paper explores compressible dynamics in deep overparameterized low-rank learning, presenting methods to enhance training efficiency ...

arXiv - Machine Learning · 4 min ·
[2404.08567] CATP: Cross-Attention Token Pruning for Accuracy Preserved Multimodal Model Inference
Machine Learning

The paper introduces Cross-Attention Token Pruning (CATP), a method designed to enhance the accuracy of multimodal models by effectively ...

arXiv - AI · 3 min ·
[2602.11908] When Should LLMs Be Less Specific? Selective Abstraction for Reliable Long-Form Text Generation
LLMs

This paper introduces Selective Abstraction (SA), a framework for improving the reliability of long-form text generated by LLMs by select...

arXiv - Machine Learning · 4 min ·
[2601.10485] Panning for Gold: Expanding Domain-Specific Knowledge Graphs with General Knowledge
NLP

The paper proposes a novel approach for enhancing domain-specific knowledge graphs (DKGs) by integrating general knowledge graphs (GKGs) ...

arXiv - AI · 4 min ·
[2601.00004] Finetuning Large Language Models for Automated Depression Screening in Nigerian Pidgin English: GENSCORE Pilot Study
LLMs

This study explores the use of fine-tuned large language models for automated depression screening in Nigerian Pidgin English, addressing...

arXiv - Machine Learning · 4 min ·
[2510.19698] RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models
LLMs

The paper presents RLIE, a framework that integrates large language models (LLMs) with probabilistic rule learning to enhance rule genera...

arXiv - AI · 4 min ·
[2510.07978] VoiceAgentBench: Are Voice Assistants ready for agentic tasks?
LLMs

The paper introduces VoiceAgentBench, a benchmark for evaluating voice assistants' capabilities in agentic tasks, highlighting their perf...

arXiv - Machine Learning · 4 min ·
[2505.14381] SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation
LLMs

The paper presents SCAN, a novel approach for Semantic Document Layout Analysis that enhances Retrieval-Augmented Generation (RAG) system...

arXiv - AI · 4 min ·
[2410.16882] SaVe-TAG: LLM-based Interpolation for Long-Tailed Text-Attributed Graphs
LLMs

The paper presents SaVe-TAG, a novel framework that utilizes Large Language Models for semantic-aware interpolation in long-tailed text-a...

arXiv - Machine Learning · 4 min ·
[2602.13194] Semantic Chunking and the Entropy of Natural Language
LLMs

This article presents a statistical model for semantic chunking in natural language, revealing insights into the entropy of English and i...

arXiv - AI · 4 min ·
[2602.13191] CoPE-VideoLM: Codec Primitives For Efficient Video Language Models
LLMs

The paper presents CoPE-VideoLM, a novel approach that utilizes codec primitives to enhance the efficiency of video language models, sign...

arXiv - AI · 4 min ·
[2602.13165] Asynchronous Verified Semantic Caching for Tiered LLM Architectures
LLMs

The paper introduces Krites, an asynchronous caching policy for large language models (LLMs) that enhances semantic caching efficiency wh...

arXiv - AI · 4 min ·
[2602.13047] Can we trust AI to detect healthy multilingual English speakers among the cognitively impaired cohort in the UK? An investigation using real-world conversational speech
Machine Learning

This study investigates the reliability of AI in detecting cognitive impairment among multilingual English speakers in the UK, revealing ...

arXiv - AI · 4 min ·
[2602.12996] Know More, Know Clearer: A Meta-Cognitive Framework for Knowledge Augmentation in Large Language Models
LLMs

This article presents a novel meta-cognitive framework aimed at enhancing knowledge augmentation in Large Language Models (LLMs), address...

arXiv - AI · 3 min ·
[2602.12924] Never say never: Exploring the effects of available knowledge on agent persuasiveness in controlled physiotherapy motivation dialogues
Robotics

This article examines how the availability of knowledge influences the persuasiveness of generative social agents (GSAs) in physiotherapy...

arXiv - AI · 4 min ·
[2602.12892] RADAR: Revealing Asymmetric Development of Abilities in MLLM Pre-training
LLMs

The paper presents RADAR, a novel evaluation framework for Multi-modal Large Language Models (MLLMs) that addresses performance bottlenec...

arXiv - AI · 4 min ·