Natural Language Processing

Text understanding and language tasks

Top This Week

[2601.11016] Contextual Distributionally Robust Optimization with Causal and Continuous Structure: An Interpretable and Tractable Approach
NLP

Abstract page for arXiv paper 2601.11016: Contextual Distributionally Robust Optimization with Causal and Continuous Structure: An Interp...

arXiv - Machine Learning · 4 min

[2511.22294] Structure is Supervision: Multiview Masked Autoencoders for Radiology
Machine Learning

Abstract page for arXiv paper 2511.22294: Structure is Supervision: Multiview Masked Autoencoders for Radiology

arXiv - Machine Learning · 4 min

[2511.18123] Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-Language Models
LLMs

Abstract page for arXiv paper 2511.18123: Bias Is a Subspace, Not a Coordinate: A Geometric Rethinking of Post-hoc Debiasing in Vision-La...

arXiv - Machine Learning · 4 min

All Content

[2602.20344] Hierarchical Molecular Representation Learning via Fragment-Based Self-Supervised Embedding Prediction
NLP

This article presents GraSPNet, a novel hierarchical self-supervised learning framework for molecular representation that enhances graph ...

arXiv - Machine Learning · 3 min

[2602.20300] What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance
LLMs

This article examines how specific linguistic features of queries impact the performance of Large Language Models (LLMs), particularly in...

arXiv - AI · 3 min

[2602.20224] Exploring Anti-Aging Literature via ConvexTopics and Large Language Models
LLMs

This article presents a novel clustering algorithm for analyzing anti-aging literature, improving topic modeling through convex optimizat...

arXiv - Machine Learning · 3 min

[2602.20219] An Approach to Combining Video and Speech with Large Language Models in Human-Robot Interaction
LLMs

This article presents a novel multimodal framework for human-robot interaction that integrates video and speech processing with large lan...

arXiv - AI · 3 min

[2602.20213] CodeHacker: Automated Test Case Generation for Detecting Vulnerabilities in Competitive Programming Solutions
LLMs

CodeHacker is an automated framework designed to generate test cases that identify vulnerabilities in competitive programming solutions, ...

arXiv - AI · 3 min

[2310.15741] Interpretable Medical Image Classification using Prototype Learning and Privileged Information
Machine Learning

This article presents a novel approach to medical image classification using prototype learning and privileged information, enhancing int...

arXiv - AI · 3 min

[2602.21143] A Benchmark for Deep Information Synthesis
LLMs

The paper introduces DEEPSYNTH, a benchmark for evaluating large language models on complex tasks requiring deep information synthesis an...

arXiv - Machine Learning · 4 min

[2602.21044] LogicGraph : Benchmarking Multi-Path Logical Reasoning via Neuro-Symbolic Generation and Verification
LLMs

LogicGraph introduces a benchmark for evaluating multi-path logical reasoning in large language models, highlighting their limitations in...

arXiv - AI · 4 min

[2602.20934] Architecting AgentOS: From Token-Level Context to Emergent System-Level Intelligence
LLMs

The paper introduces AgentOS, a conceptual framework that transitions Large Language Models from static inference engines to dynamic cogn...

arXiv - AI · 3 min

[2602.20918] Predicting Sentence Acceptability Judgments in Multimodal Contexts
LLMs

This paper explores how visual context influences sentence acceptability judgments in humans and large language models (LLMs), revealing ...

arXiv - AI · 4 min

[2602.20926] HELP: HyperNode Expansion and Logical Path-Guided Evidence Localization for Accurate and Efficient GraphRAG
LLMs

This article presents the HELP framework, which enhances Retrieval-Augmented Generation (RAG) by addressing knowledge boundaries and hall...

arXiv - AI · 4 min

[2602.20878] Diagnosing Causal Reasoning in Vision-Language Models via Structured Relevance Graphs
LLMs

This article introduces Vision-Language Causal Graphs (VLCGs) to enhance causal reasoning in Vision-Language Models (LVLMs), addressing t...

arXiv - AI · 3 min

[2602.20696] PromptCD: Test-Time Behavior Enhancement via Polarity-Prompt Contrastive Decoding
LLMs

The paper presents PromptCD, a method for enhancing AI behavior at test time using polarity-prompt contrastive decoding, improving alignm...

arXiv - AI · 4 min

[2602.20624] Physics-based phenomenological characterization of cross-modal bias in multimodal models
LLMs

This paper explores the cross-modal bias in multimodal large language models (MLLMs) through a physics-based phenomenological approach, a...

arXiv - AI · 4 min

[2602.20571] CausalReasoningBenchmark: A Real-World Benchmark for Disentangled Evaluation of Causal Identification and Estimation
Machine Learning

The CausalReasoningBenchmark introduces a new framework for evaluating automated causal inference, distinguishing between identification ...

arXiv - AI · 4 min

[2602.20558] From Logs to Language: Learning Optimal Verbalization for LLM-Based Recommendation in Production
LLMs

This paper explores a data-centric framework for optimizing verbalization in LLM-based recommendation systems, enhancing recommendation a...

arXiv - AI · 3 min

[2602.20426] Learning to Rewrite Tool Descriptions for Reliable LLM-Agent Tool Use
LLMs

This paper presents Trace-Free+, a curriculum learning framework designed to enhance the quality of tool interfaces for LLM-based agents,...

arXiv - AI · 3 min

[2602.20324] An artificial intelligence framework for end-to-end rare disease phenotyping from clinical notes using large language models
LLMs

This article presents RARE-PHENIX, an AI framework designed for end-to-end phenotyping of rare diseases from clinical notes, utilizing la...

arXiv - Machine Learning · 4 min

Anthropic won’t budge as Pentagon escalates AI dispute | TechCrunch
NLP

The Pentagon demands that Anthropic loosen AI restrictions or face penalties, raising concerns over government control, vendor reliance, an...

TechCrunch - AI · 5 min

[R] Understanding targeted LLM fine-tuning
LLMs

This article discusses a preprint on targeted instruction selection for fine-tuning large language models (LLMs), emphasizing systematic ...

Reddit - Machine Learning · 1 min