Natural Language Processing

Text understanding and language tasks


Top This Week

Machine Learning

[P] Unix philosophy for ML pipelines: modular, swappable stages with typed contracts

We built an open-source prototype that applies Unix philosophy to retrieval pipelines. Each stage (PII redaction, chunking, dedup, embedd...

Reddit - Machine Learning · 1 min · about 2 hours ago

NLP

[P] Using YouTube as a data source (lessons from building a coffee domain dataset)

I started working on a small coffee coaching app recently - something that could answer questions around brew methods, grind size, extrac...

Reddit - Machine Learning · 1 min · about 4 hours ago

LLMs

[2601.13227] Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

Abstract page for arXiv paper 2601.13227: Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

arXiv - AI · 3 min · about 14 hours ago

All Content

Machine Learning

[2603.05024] Measuring the Fragility of Trust: Devising Credibility Index via Explanation Stability (CIES) for Business Decision Support Systems

Abstract page for arXiv paper 2603.05024: Measuring the Fragility of Trust: Devising Credibility Index via Explanation Stability (CIES) f...

arXiv - Machine Learning · 4 min · 24 days ago

Machine Learning

[2603.04981] Rethinking Representativeness and Diversity in Dynamic Data Selection

Abstract page for arXiv paper 2603.04981: Rethinking Representativeness and Diversity in Dynamic Data Selection

arXiv - AI · 4 min · 24 days ago

LLMs

[2603.04951] Retrieval-Augmented Generation with Covariate Time Series

Abstract page for arXiv paper 2603.04951: Retrieval-Augmented Generation with Covariate Time Series

arXiv - AI · 4 min · 24 days ago

LLMs

[2603.04868] K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation

Abstract page for arXiv paper 2603.04868: K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory ...

arXiv - AI · 3 min · 24 days ago

NLP

[2603.04756] MOOSEnger -- a Domain-Specific AI Agent for the MOOSE Ecosystem

Abstract page for arXiv paper 2603.04756: MOOSEnger -- a Domain-Specific AI Agent for the MOOSE Ecosystem

arXiv - AI · 4 min · 24 days ago

LLMs

[2603.04741] CONE: Embeddings for Complex Numerical Data Preserving Unit and Variable Semantics

Abstract page for arXiv paper 2603.04741: CONE: Embeddings for Complex Numerical Data Preserving Unit and Variable Semantics

arXiv - Machine Learning · 3 min · 24 days ago

NLP

[2603.04448] SkillNet: Create, Evaluate, and Connect AI Skills

Abstract page for arXiv paper 2603.04448: SkillNet: Create, Evaluate, and Connect AI Skills

arXiv - Machine Learning · 4 min · 24 days ago

Machine Learning

ML Prague conference to feature Taiwan pavilion to deepen cooperation (Radio Taiwan International)

The ML Prague conference will feature a Taiwan pavilion focused on deepening cooperation.

AI News - General · 1 min · 25 days ago

LLMs

[D] M1 Pro is hitting a wall with LLMs. Upgrade to M5 Max now or wait for the M6 redesign?

I'm an AI Engineer currently daily-driving a 16" M1 Pro MBP. It’s been a workhorse, but I’m feeling the bottleneck when running larger lo...

Reddit - Machine Learning · 1 min · 25 days ago

Machine Learning

[D] AMA Secure version of OpenClaw

There’s a major risk that OpenClaw will exploit your data and funds, so I built a security-focused version in Rust. AMA. I was incredibly...

Reddit - Machine Learning · 1 min · 25 days ago

NLP

Google faces first lawsuit alleging its AI chatbot encouraged a Florida man to commit suicide

AI Tools & Products · 5 min · 25 days ago

NLP

[2602.10149] Exploring Semantic Labeling Strategies for Third-Party Cybersecurity Risk Assessment Questionnaires

Abstract page for arXiv paper 2602.10149: Exploring Semantic Labeling Strategies for Third-Party Cybersecurity Risk Assessment Questionna...

arXiv - AI · 4 min · 25 days ago

LLMs

[2601.19933] NRR-Phi: Text-to-State Mapping for Ambiguity Preservation in LLM Inference

Abstract page for arXiv paper 2601.19933: NRR-Phi: Text-to-State Mapping for Ambiguity Preservation in LLM Inference

arXiv - AI · 4 min · 25 days ago

Machine Learning

[2601.04646] Succeeding at Scale: Automated Dataset Construction and Query-Side Adaptation for Multi-Tenant Search

Abstract page for arXiv paper 2601.04646: Succeeding at Scale: Automated Dataset Construction and Query-Side Adaptation for Multi-Tenant ...

arXiv - AI · 4 min · 25 days ago

NLP

[2601.00361] Deterministic Coreset for Lp Subspace

Abstract page for arXiv paper 2601.00361: Deterministic Coreset for Lp Subspace

arXiv - Machine Learning · 4 min · 25 days ago

LLMs

[2510.02578] FLOWR.root: A flow matching based foundation model for joint multi-purpose structure-aware 3D ligand generation and affinity prediction

Abstract page for arXiv paper 2510.02578: FLOWR.root: A flow matching based foundation model for joint multi-purpose structure-aware 3D l...

arXiv - Machine Learning · 4 min · 25 days ago

Machine Learning

[2509.25095] Benchmarking ECG FMs: A Reality Check Across Clinical Tasks

Abstract page for arXiv paper 2509.25095: Benchmarking ECG FMs: A Reality Check Across Clinical Tasks

arXiv - Machine Learning · 4 min · 25 days ago

Machine Learning

[2508.09844] On the Generalization Limits of Quantum Generative Adversarial Networks with Pure State Generators

Abstract page for arXiv paper 2508.09844: On the Generalization Limits of Quantum Generative Adversarial Networks with Pure State Generators

arXiv - Machine Learning · 3 min · 25 days ago

LLMs

[2510.24702] Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Abstract page for arXiv paper 2510.24702: Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

arXiv - AI · 4 min · 25 days ago

LLMs

[2510.24178] MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations

Abstract page for arXiv paper 2510.24178: MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations

arXiv - AI · 4 min · 25 days ago

