Natural Language Processing

Text understanding and language tasks


Top This Week

Machine Learning

[P] Unix philosophy for ML pipelines: modular, swappable stages with typed contracts

We built an open-source prototype that applies Unix philosophy to retrieval pipelines. Each stage (PII redaction, chunking, dedup, embedd...

Reddit - Machine Learning · 1 min · about 4 hours ago

NLP

[P] Using YouTube as a data source (lessons from building a coffee domain dataset)

I started working on a small coffee coaching app recently - something that could answer questions around brew methods, grind size, extrac...

Reddit - Machine Learning · 1 min · about 6 hours ago

LLMs

[2601.13227] Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

Abstract page for arXiv paper 2601.13227: Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?

arXiv - AI · 3 min · about 15 hours ago

All Content

LLMs

[2510.10889] Topological Alignment of Shared Vision-Language Embedding Space

Abstract page for arXiv paper 2510.10889: Topological Alignment of Shared Vision-Language Embedding Space

arXiv - AI · 3 min · 26 days ago

LLMs

[2510.07181] TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics

Abstract page for arXiv paper 2510.07181: TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2509.25845] Training-Free Reward-Guided Image Editing via Trajectory Optimal Control

Abstract page for arXiv paper 2509.25845: Training-Free Reward-Guided Image Editing via Trajectory Optimal Control

arXiv - AI · 3 min · 26 days ago

LLMs

[2509.24222] Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

Abstract page for arXiv paper 2509.24222: Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2509.14858] MeanFlowSE: one-step generative speech enhancement via conditional mean flow

Abstract page for arXiv paper 2509.14858: MeanFlowSE: one-step generative speech enhancement via conditional mean flow

arXiv - AI · 3 min · 26 days ago

LLMs

[2508.07321] ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering

Abstract page for arXiv paper 2508.07321: ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answe...

arXiv - AI · 3 min · 26 days ago

LLMs

[2507.09875] Function Induction and Task Generalization: An Interpretability Study with Off-by-One Addition

Abstract page for arXiv paper 2507.09875: Function Induction and Task Generalization: An Interpretability Study with Off-by-One Addition

arXiv - AI · 4 min · 26 days ago

LLMs

[2507.07847] From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Augmented Generation systems

Abstract page for arXiv paper 2507.07847: From Ambiguity to Accuracy: The Transformative Effect of Coreference Resolution on Retrieval-Au...

arXiv - AI · 4 min · 26 days ago

LLMs

[2601.17473] LeanTutor: Towards a Verified AI Mathematical Proof Tutor

Abstract page for arXiv paper 2601.17473: LeanTutor: Towards a Verified AI Mathematical Proof Tutor

arXiv - Machine Learning · 3 min · 26 days ago

LLMs

[2504.07109] OSCAR: Online Soft Compression And Reranking

Abstract page for arXiv paper 2504.07109: OSCAR: Online Soft Compression And Reranking

arXiv - AI · 3 min · 26 days ago

LLMs

[2503.07885] Safety Guardrails for LLM-Enabled Robots

Abstract page for arXiv paper 2503.07885: Safety Guardrails for LLM-Enabled Robots

arXiv - AI · 4 min · 26 days ago

LLMs

[2511.22935] EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model

Abstract page for arXiv paper 2511.22935: EnECG: Efficient Ensemble Learning for Electrocardiogram Multi-task Foundation Model

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2510.16462] Buzz, Choose, Forget: A Meta-Bandit Framework for Bee-Like Decision Making

Abstract page for arXiv paper 2510.16462: Buzz, Choose, Forget: A Meta-Bandit Framework for Bee-Like Decision Making

arXiv - Machine Learning · 3 min · 26 days ago

LLMs

[2412.13091] LMUnit: Fine-grained Evaluation with Natural Language Unit Tests

Abstract page for arXiv paper 2412.13091: LMUnit: Fine-grained Evaluation with Natural Language Unit Tests

arXiv - AI · 3 min · 26 days ago

LLMs

[2406.06512] Merlin: A Computed Tomography Vision-Language Foundation Model and Dataset

Abstract page for arXiv paper 2406.06512: Merlin: A Computed Tomography Vision-Language Foundation Model and Dataset

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2510.07151] ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL Problems

Abstract page for arXiv paper 2510.07151: ELMUR: External Layer Memory with Update/Rewrite for Long-Horizon RL Problems

arXiv - AI · 4 min · 26 days ago

LLMs

[2405.15374] Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph

Abstract page for arXiv paper 2405.15374: Leveraging Large Language Models for Semantic Query Processing in a Scholarly Knowledge Graph

arXiv - AI · 4 min · 26 days ago

Machine Learning

[2509.22580] The Lie of the Average: How Class Incremental Learning Evaluation Deceives You?

Abstract page for arXiv paper 2509.22580: The Lie of the Average: How Class Incremental Learning Evaluation Deceives You?

arXiv - Machine Learning · 4 min · 26 days ago

Machine Learning

[2509.21150] CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization

Abstract page for arXiv paper 2509.21150: CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization

arXiv - Machine Learning · 4 min · 26 days ago

Machine Learning

[2506.23971] UMA: A Family of Universal Models for Atoms

Abstract page for arXiv paper 2506.23971: UMA: A Family of Universal Models for Atoms

arXiv - Machine Learning · 4 min · 26 days ago

