Content Feed

The latest content from across the network

Machine Learning

[2603.13294] Real-World AI Evaluation: How FRAME Generates Systematic Evidence to Resolve the Decision-Maker's Dilemma

arXiv - AI · 4 min · 3 days ago

LLMs

[2603.12564] AgentDrift: Unsafe Recommendation Drift Under Tool Corruption Hidden by Ranking Metrics in LLM Agents

arXiv - AI · 4 min · 3 days ago

LLMs

[2602.08482] CLEAR: A Knowledge-Centric Vessel Trajectory Analysis Platform

arXiv - AI · 3 min · 3 days ago

Machine Learning

[2603.12057] Coarse-Guided Visual Generation via Weighted h-Transform Sampling

arXiv - AI · 4 min · 3 days ago

LLMs

[2603.09964] Understanding the Use of a Large Language Model-Powered Guide to Make Virtual Reality Accessible for Blind and Low Vision People

arXiv - AI · 4 min · 3 days ago

NLP

[2603.09455] Declarative Scenario-based Testing with RoadLogic

arXiv - AI · 3 min · 3 days ago

Machine Learning

[2603.07554] Nwāchā Munā: A Devanagari Speech Corpus and Proximal Transfer Benchmark for Nepal Bhasha ASR

arXiv - AI · 4 min · 3 days ago

Machine Learning

[2603.07455] Image Generation Models: A Technical History

arXiv - AI · 3 min · 3 days ago

LLMs

[2602.00665] Can Small Language Models Handle Context-Summarized Multi-Turn Customer-Service QA? A Synthetic Data-Driven Comparative Evaluation

arXiv - AI · 4 min · 3 days ago

LLMs

[2601.22452] Does My Chatbot Have an Agenda? Understanding Human and AI Agency in Human-Human-like Chatbot Interaction

arXiv - AI · 3 min · 3 days ago

LLMs

[2601.20404] On the Impact of AGENTS.md Files on the Efficiency of AI Coding Agents

arXiv - AI · 4 min · 3 days ago

LLMs

[2601.18987] LLMs versus the Halting Problem: Revisiting Program Termination Prediction

arXiv - AI · 4 min · 3 days ago

[2601.11065] Fairness in Healthcare Processes: A Quantitative Analysis of Decision Making in Triage

arXiv - AI · 4 min · 3 days ago

LLMs

[2601.04497] Vision-Language Agents for Interactive Forest Change Analysis

arXiv - AI · 4 min · 3 days ago

LLMs

[2601.01627] JMedEthicBench: A Multi-Turn Conversational Benchmark for Evaluating Medical Safety in Japanese Large Language Models

arXiv - AI · 4 min · 3 days ago

LLMs

[2601.00809] A Modular Reference Architecture for MCP-Servers Enabling Agentic BIM Interaction

arXiv - AI · 4 min · 3 days ago

Machine Learning

[2512.22065] StreamAvatar: Streaming Diffusion Models for Real-Time Interactive Human Avatars

arXiv - AI · 4 min · 3 days ago

Data Science

[2512.17396] RadImageNet-VQA: A Large-Scale CT and MRI Dataset for Radiologic Visual Question Answering

arXiv - AI · 3 min · 3 days ago

Machine Learning

[2511.23342] Overcoming the Curvature Bottleneck in MeanFlow

arXiv - AI · 4 min · 3 days ago

LLMs

[2512.12812] Does Tone Change the Answer? Evaluating Prompt Politeness Effects on Modern LLMs: GPT, Gemini, and LLaMA

arXiv - AI · 4 min · 3 days ago