Natural Language Processing

Text understanding and language tasks

Top This Week

Machine Learning

[P] Unix philosophy for ML pipelines: modular, swappable stages with typed contracts

We built an open-source prototype that applies Unix philosophy to retrieval pipelines. Each stage (PII redaction, chunking, dedup, embedd...

Reddit - Machine Learning · 1 min
NLP

[P] Using YouTube as a data source (lessons from building a coffee domain dataset)

I started working on a small coffee coaching app recently - something that could answer questions around brew methods, grind size, extrac...

Reddit - Machine Learning · 1 min
LLMs

[2601.13227] Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

Abstract page for arXiv paper 2601.13227: Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

arXiv - AI · 3 min

All Content

LLMs

[2603.04805] Attention's Gravitational Field: A Power-Law Interpretation of Positional Correlation

Abstract page for arXiv paper 2603.04805: Attention's Gravitational Field: A Power-Law Interpretation of Positional Correlation

arXiv - AI · 3 min
LLMs

[2603.04799] Beyond Linear LLM Invocation: An Efficient and Effective Semantic Filter Paradigm

Abstract page for arXiv paper 2603.04799: Beyond Linear LLM Invocation: An Efficient and Effective Semantic Filter Paradigm

arXiv - AI · 4 min
LLMs

[2603.04772] TSEmbed: Unlocking Task Scaling in Universal Multimodal Embeddings

Abstract page for arXiv paper 2603.04772: TSEmbed: Unlocking Task Scaling in Universal Multimodal Embeddings

arXiv - AI · 3 min
LLMs

[2603.04743] DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval

Abstract page for arXiv paper 2603.04743: DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval

arXiv - AI · 4 min
Machine Learning

[2603.04718] AI-Assisted Moot Courts: Simulating Justice-Specific Questioning in Oral Arguments

Abstract page for arXiv paper 2603.04718: AI-Assisted Moot Courts: Simulating Justice-Specific Questioning in Oral Arguments

arXiv - AI · 4 min
LLMs

[2603.04663] Neuro-Symbolic Financial Reasoning via Deterministic Fact Ledgers and Adversarial Low-Latency Hallucination Detector

Abstract page for arXiv paper 2603.04663: Neuro-Symbolic Financial Reasoning via Deterministic Fact Ledgers and Adversarial Low-Latency H...

arXiv - Machine Learning · 4 min
Machine Learning

[2603.04659] GIANT - Global Path Integration and Attentive Graph Networks for Multi-Agent Trajectory Planning

Abstract page for arXiv paper 2603.04659: GIANT - Global Path Integration and Attentive Graph Networks for Multi-Agent Trajectory Planning

arXiv - AI · 4 min
LLMs

[2603.04597] Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

Abstract page for arXiv paper 2603.04597: Bootstrapping Exploration with Group-Level Natural Language Feedback in Reinforcement Learning

arXiv - AI · 4 min
NLP

[2603.04532] Still Fresh? Evaluating Temporal Drift in Retrieval Benchmarks

Abstract page for arXiv paper 2603.04532: Still Fresh? Evaluating Temporal Drift in Retrieval Benchmarks

arXiv - AI · 3 min
Machine Learning

[2603.04450] MPBMC: Multi-Property Bounded Model Checking with GNN-guided Clustering

Abstract page for arXiv paper 2603.04450: MPBMC: Multi-Property Bounded Model Checking with GNN-guided Clustering

arXiv - Machine Learning · 3 min
LLMs

[2603.04443] AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

Abstract page for arXiv paper 2603.04443: AMV-L: Lifecycle-Managed Agent Memory for Tail-Latency Control in Long-Running LLM Systems

arXiv - Machine Learning · 4 min
LLMs

[2603.04429] What Is Missing: Interpretable Ratings for Large Language Model Outputs

Abstract page for arXiv paper 2603.04429: What Is Missing: Interpretable Ratings for Large Language Model Outputs

arXiv - AI · 4 min
Machine Learning

[2603.04422] FedEMA-Distill: Exponential Moving Average Guided Knowledge Distillation for Robust Federated Learning

Abstract page for arXiv paper 2603.04422: FedEMA-Distill: Exponential Moving Average Guided Knowledge Distillation for Robust Federated L...

arXiv - Machine Learning · 4 min
LLMs

[2603.04421] Do Mixed-Vendor Multi-Agent LLMs Improve Clinical Diagnosis?

Abstract page for arXiv paper 2603.04421: Do Mixed-Vendor Multi-Agent LLMs Improve Clinical Diagnosis?

arXiv - AI · 3 min
LLMs

[2603.04410] SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

Abstract page for arXiv paper 2603.04410: SalamahBench: Toward Standardized Safety Evaluation for Arabic Language Models

arXiv - AI · 4 min
LLMs

[2603.04406] CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG Models

Abstract page for arXiv paper 2603.04406: CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG M...

arXiv - AI · 4 min
NLP

[2603.04403] FinRetrieval: A Benchmark for Financial Data Retrieval by AI Agents

Abstract page for arXiv paper 2603.04403: FinRetrieval: A Benchmark for Financial Data Retrieval by AI Agents

arXiv - AI · 3 min
NLP

[2603.05295] WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces

Abstract page for arXiv paper 2603.05295: WebChain: A Large-Scale Human-Annotated Dataset of Real-World Web Interaction Traces

arXiv - AI · 3 min
NLP

[2603.05225] AI+HW 2035: Shaping the Next Decade

Abstract page for arXiv paper 2603.05225: AI+HW 2035: Shaping the Next Decade

arXiv - AI · 4 min
LLMs

[2603.05129] MedCoRAG: Interpretable Hepatology Diagnosis via Hybrid Evidence Retrieval and Multispecialty Consensus

Abstract page for arXiv paper 2603.05129: MedCoRAG: Interpretable Hepatology Diagnosis via Hybrid Evidence Retrieval and Multispecialty C...

arXiv - AI · 4 min
