Machine Learning

ML algorithms, training, and inference

Top This Week

LLMs

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min ·
Top 10 AI certifications and courses for 2026
AI Startups

This article reviews the top 10 AI certifications and courses for 2026, highlighting their significance in a rapidly evolving field and t...

AI Events · 15 min ·
Hub Group Using AI, Machine Learning for Real-Time Visibility of Shipments
Machine Learning

Hub Group says it’s using artificial intelligence and machine learning to leverage data from its GPS-equipped container fleet to give cus...

AI Events · 4 min ·

All Content

[2603.25253] MolQuest: A Benchmark for Agentic Evaluation of Abductive Reasoning in Chemical Structure Elucidation
LLMs

Abstract page for arXiv paper 2603.25253: MolQuest: A Benchmark for Agentic Evaluation of Abductive Reasoning in Chemical Structure Eluci...

arXiv - AI · 4 min ·
[2603.25397] A Causal Framework for Evaluating ICU Discharge Strategies
Machine Learning

Abstract page for arXiv paper 2603.25397: A Causal Framework for Evaluating ICU Discharge Strategies

arXiv - AI · 3 min ·
[2603.25374] Supercharging Federated Intelligence Retrieval
LLMs

Abstract page for arXiv paper 2603.25374: Supercharging Federated Intelligence Retrieval

arXiv - Machine Learning · 3 min ·
[2603.25247] FEAST: Fully Connected Expressive Attention for Spatial Transcriptomics
Machine Learning

Abstract page for arXiv paper 2603.25247: FEAST: Fully Connected Expressive Attention for Spatial Transcriptomics

arXiv - AI · 4 min ·
[2603.25243] FluxEDA: A Unified Execution Infrastructure for Stateful Agentic EDA
LLMs

Abstract page for arXiv paper 2603.25243: FluxEDA: A Unified Execution Infrastructure for Stateful Agentic EDA

arXiv - AI · 3 min ·
[2603.25311] Practical Efficient Global Optimization is No-regret
Machine Learning

Abstract page for arXiv paper 2603.25311: Practical Efficient Global Optimization is No-regret

arXiv - Machine Learning · 3 min ·
[2603.25226] WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing
LLMs

Abstract page for arXiv paper 2603.25226: WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing

arXiv - AI · 4 min ·
[2603.25216] A Wireless World Model for AI-Native 6G Networks
Machine Learning

Abstract page for arXiv paper 2603.25216: A Wireless World Model for AI-Native 6G Networks

arXiv - AI · 3 min ·
[2603.25257] Mitigating Evasion Attacks in Fog Computing Resource Provisioning Through Proactive Hardening
Machine Learning

Abstract page for arXiv paper 2603.25257: Mitigating Evasion Attacks in Fog Computing Resource Provisioning Through Proactive Hardening

arXiv - Machine Learning · 3 min ·
[2603.25209] Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction
Machine Learning

Abstract page for arXiv paper 2603.25209: Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction

arXiv - AI · 4 min ·
[2603.25196] A Decade-Scale Benchmark Evaluating LLMs' Clinical Practice Guidelines Detection and Adherence in Multi-turn Conversations
LLMs

Abstract page for arXiv paper 2603.25196: A Decade-Scale Benchmark Evaluating LLMs' Clinical Practice Guidelines Detection and Adherence ...

arXiv - AI · 4 min ·
[2603.25251] Does Explanation Correctness Matter? Linking Computational XAI Evaluation to Human Understanding
Machine Learning

Abstract page for arXiv paper 2603.25251: Does Explanation Correctness Matter? Linking Computational XAI Evaluation to Human Understanding

arXiv - AI · 4 min ·
[2603.25187] Probing the Lack of Stable Internal Beliefs in LLMs
LLMs

Abstract page for arXiv paper 2603.25187: Probing the Lack of Stable Internal Beliefs in LLMs

arXiv - AI · 3 min ·
[2603.25229] An Image Dataset of Common Skin Diseases of Bangladesh and Benchmarking Performance with Machine Learning Models
Machine Learning

Abstract page for arXiv paper 2603.25229: An Image Dataset of Common Skin Diseases of Bangladesh and Benchmarking Performance with Machin...

arXiv - Machine Learning · 4 min ·
[2603.25250] Activation Matters: Test-time Activated Negative Labels for OOD Detection with Vision-Language Models
LLMs

Abstract page for arXiv paper 2603.25250: Activation Matters: Test-time Activated Negative Labels for OOD Detection with Vision-Language ...

arXiv - AI · 4 min ·
[2603.25170] Knowledge-Guided Adversarial Training for Infrared Object Detection via Thermal Radiation Modeling
Machine Learning

Abstract page for arXiv paper 2603.25170: Knowledge-Guided Adversarial Training for Infrared Object Detection via Thermal Radiation Modeling

arXiv - AI · 4 min ·
[2603.25164] PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems
LLMs

Abstract page for arXiv paper 2603.25164: PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented ...

arXiv - AI · 4 min ·
[2603.25145] Learning to Rank Caption Chains for Video-Text Alignment
LLMs

Abstract page for arXiv paper 2603.25145: Learning to Rank Caption Chains for Video-Text Alignment

arXiv - Machine Learning · 3 min ·
[2603.25155] Photon: Speedup Volume Understanding with Efficient Multimodal Large Language Models
LLMs

Abstract page for arXiv paper 2603.25155: Photon: Speedup Volume Understanding with Efficient Multimodal Large Language Models

arXiv - AI · 3 min ·
[2603.25150] Goodness-of-pronunciation without phoneme time alignment
Machine Learning

Abstract page for arXiv paper 2603.25150: Goodness-of-pronunciation without phoneme time alignment

arXiv - AI · 3 min ·