Natural Language Processing

Text understanding and language tasks

Top This Week

Machine Learning

[P] Unix philosophy for ML pipelines: modular, swappable stages with typed contracts

We built an open-source prototype that applies Unix philosophy to retrieval pipelines. Each stage (PII redaction, chunking, dedup, embedd...

Reddit - Machine Learning · 1 min ·
NLP

[P] Using YouTube as a data source (lessons from building a coffee domain dataset)

I started working on a small coffee coaching app recently - something that could answer questions around brew methods, grind size, extrac...

Reddit - Machine Learning · 1 min ·
[2601.13227] Insider Knowledge: How Much Can RAG Systems Gain from Evaluation Secrets?
LLMs

arXiv - AI · 3 min ·

All Content

[2602.09937] Why Do AI Agents Systematically Fail at Cloud Root Cause Analysis?
LLMs

arXiv - AI · 4 min ·
[2506.05634] AutoQD: Automatic Discovery of Diverse Behaviors with Quality-Diversity Optimization
NLP

arXiv - AI · 4 min ·
[2510.09782] The Geometry of Reasoning: Flowing Logics in Representation Space
LLMs

arXiv - Machine Learning · 4 min ·
[2505.15643] Optimal Best-Arm Identification under Fixed Confidence with Multiple Optima
NLP

arXiv - Machine Learning · 3 min ·
[2505.13033] TSPulse: Tiny Pre-Trained Models with Disentangled Representations for Rapid Time-Series Analysis
Machine Learning

arXiv - AI · 4 min ·
[2503.07638] Leveraging Taxonomy Similarity for Next Activity Prediction in Patient Treatment
NLP

arXiv - AI · 4 min ·
[2506.08321] LeanTutor: Towards a Verified AI Mathematical Proof Tutor
LLMs

arXiv - AI · 3 min ·
[2505.21668] R1-Code-Interpreter: LLMs Reason with Code via Supervised and Multi-stage Reinforcement Learning
LLMs

arXiv - AI · 4 min ·
[2504.20505] MuRAL: A Multi-Resident Ambient Sensor Dataset Annotated with Natural Language for Activities of Daily Living
LLMs

arXiv - AI · 4 min ·
[2310.04925] Crystal-GFN: sampling crystals with desirable properties and constraints
NLP

arXiv - Machine Learning · 4 min ·
[2603.04353] A Constrained RL Approach for Cost-Efficient Delivery of Latency-Sensitive Applications
NLP

arXiv - Machine Learning · 3 min ·
[2603.04348] RANGER: Sparsely-Gated Mixture-of-Experts with Adaptive Retrieval Re-ranking for Pathology Report Generation
Machine Learning

arXiv - AI · 4 min ·
[2603.04317] World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings
LLMs

arXiv - AI · 3 min ·
[2603.04321] SPRINT: Semi-supervised Prototypical Representation for Few-Shot Class-Incremental Tabular Learning
NLP

arXiv - AI · 3 min ·
[2603.04204] Beyond Mixtures and Products for Ensemble Aggregation: A Likelihood Perspective on Generalized Means
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.04293] LabelBuddy: An Open Source Music and Audio Language Annotation Tagging Tool Using AI Assistance
LLMs

arXiv - AI · 3 min ·
[2603.04005] Training-Free Rate-Distortion-Perception Traversal With Diffusion
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.04158] GarmentPile++: Affordance-Driven Cluttered Garments Retrieval with Vision-Language Reasoning
NLP

arXiv - AI · 4 min ·
[2603.03843] Invariance-Based Dynamic Regret Minimization
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.04037] DQE-CIR: Distinctive Query Embeddings through Learnable Attribute Weights and Target Relative Negative Sampling in Composed Image Retrieval
NLP

arXiv - AI · 4 min ·