Natural Language Processing

Text understanding and language tasks

Top This Week

Machine Learning

[P] Unix philosophy for ML pipelines: modular, swappable stages with typed contracts

We built an open-source prototype that applies Unix philosophy to retrieval pipelines. Each stage (PII redaction, chunking, dedup, embedd...

Reddit - Machine Learning · 1 min ·
NLP

[P] Using YouTube as a data source (lessons from building a coffee domain dataset)

I started working on a small coffee coaching app recently - something that could answer questions around brew methods, grind size, extrac...

Reddit - Machine Learning · 1 min ·
LLMs

[2601.13227] Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

Abstract page for arXiv paper 2601.13227: Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

arXiv - AI · 3 min ·

All Content

Machine Learning

[2603.05024] Measuring the Fragility of Trust: Devising Credibility Index via Explanation Stability (CIES) for Business Decision Support Systems

Abstract page for arXiv paper 2603.05024: Measuring the Fragility of Trust: Devising Credibility Index via Explanation Stability (CIES) f...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2603.04981] Rethinking Representativeness and Diversity in Dynamic Data Selection

Abstract page for arXiv paper 2603.04981: Rethinking Representativeness and Diversity in Dynamic Data Selection

arXiv - AI · 4 min ·
LLMs

[2603.04951] Retrieval-Augmented Generation with Covariate Time Series

Abstract page for arXiv paper 2603.04951: Retrieval-Augmented Generation with Covariate Time Series

arXiv - AI · 4 min ·
LLMs

[2603.04868] K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory Generation

Abstract page for arXiv paper 2603.04868: K-Gen: A Multimodal Language-Conditioned Approach for Interpretable Keypoint-Guided Trajectory ...

arXiv - AI · 3 min ·
NLP

[2603.04756] MOOSEnger -- a Domain-Specific AI Agent for the MOOSE Ecosystem

Abstract page for arXiv paper 2603.04756: MOOSEnger -- a Domain-Specific AI Agent for the MOOSE Ecosystem

arXiv - AI · 4 min ·
LLMs

[2603.04741] CONE: Embeddings for Complex Numerical Data Preserving Unit and Variable Semantics

Abstract page for arXiv paper 2603.04741: CONE: Embeddings for Complex Numerical Data Preserving Unit and Variable Semantics

arXiv - Machine Learning · 3 min ·
NLP

[2603.04448] SkillNet: Create, Evaluate, and Connect AI Skills

Abstract page for arXiv paper 2603.04448: SkillNet: Create, Evaluate, and Connect AI Skills

arXiv - Machine Learning · 4 min ·
Machine Learning

ML Prague conference to feature Taiwan pavilion to deepen cooperation (Radio Taiwan International)

The ML Prague conference will feature a Taiwan pavilion focused on deepening cooperation.

AI News - General · 1 min ·
LLMs

[D] M1 Pro is hitting a wall with LLMs. Upgrade to M5 Max now or wait for the M6 redesign?

I'm an AI Engineer currently daily-driving a 16" M1 Pro MBP. It’s been a workhorse, but I’m feeling the bottleneck when running larger lo...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] AMA Secure version of OpenClaw

There’s a major risk that OpenClaw will exploit your data and funds. So I built a security focused version in Rust. AMA. I was incredibly...

Reddit - Machine Learning · 1 min ·
NLP

Google faces first lawsuit alleging its AI chatbot encouraged a Florida man to commit suicide

AI Tools & Products · 5 min ·
NLP

[2602.10149] Exploring Semantic Labeling Strategies for Third-Party Cybersecurity Risk Assessment Questionnaires

Abstract page for arXiv paper 2602.10149: Exploring Semantic Labeling Strategies for Third-Party Cybersecurity Risk Assessment Questionna...

arXiv - AI · 4 min ·
LLMs

[2601.19933] NRR-Phi: Text-to-State Mapping for Ambiguity Preservation in LLM Inference

Abstract page for arXiv paper 2601.19933: NRR-Phi: Text-to-State Mapping for Ambiguity Preservation in LLM Inference

arXiv - AI · 4 min ·
Machine Learning

[2601.04646] Succeeding at Scale: Automated Dataset Construction and Query-Side Adaptation for Multi-Tenant Search

Abstract page for arXiv paper 2601.04646: Succeeding at Scale: Automated Dataset Construction and Query-Side Adaptation for Multi-Tenant ...

arXiv - AI · 4 min ·
NLP

[2601.00361] Deterministic Coreset for Lp Subspace

Abstract page for arXiv paper 2601.00361: Deterministic Coreset for Lp Subspace

arXiv - Machine Learning · 4 min ·
LLMs

[2510.02578] FLOWR.root: A flow matching based foundation model for joint multi-purpose structure-aware 3D ligand generation and affinity prediction

Abstract page for arXiv paper 2510.02578: FLOWR.root: A flow matching based foundation model for joint multi-purpose structure-aware 3D l...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2509.25095] Benchmarking ECG FMs: A Reality Check Across Clinical Tasks

Abstract page for arXiv paper 2509.25095: Benchmarking ECG FMs: A Reality Check Across Clinical Tasks

arXiv - Machine Learning · 4 min ·
Machine Learning

[2508.09844] On the Generalization Limits of Quantum Generative Adversarial Networks with Pure State Generators

Abstract page for arXiv paper 2508.09844: On the Generalization Limits of Quantum Generative Adversarial Networks with Pure State Generators

arXiv - Machine Learning · 3 min ·
LLMs

[2510.24702] Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Abstract page for arXiv paper 2510.24702: Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

arXiv - AI · 4 min ·
LLMs

[2510.24178] MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations

Abstract page for arXiv paper 2510.24178: MuSaG: A Multimodal German Sarcasm Dataset with Full-Modal Annotations

arXiv - AI · 4 min ·