Machine Learning

ML algorithms, training, and inference


Top This Week

LLMs

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min · about 4 hours ago

AI Startups

Top 10 AI certifications and courses for 2026

This article reviews the top 10 AI certifications and courses for 2026, highlighting their significance in a rapidly evolving field and t...

AI Events · 15 min · about 4 hours ago

Machine Learning

Hub Group Using AI, Machine Learning for Real-Time Visibility of Shipments

Hub Group says it’s using artificial intelligence and machine learning to leverage data from its GPS-equipped container fleet to give cus...

AI Events · 4 min · about 4 hours ago

All Content

LLMs

[2603.25253] MolQuest: A Benchmark for Agentic Evaluation of Abductive Reasoning in Chemical Structure Elucidation

Abstract page for arXiv paper 2603.25253: MolQuest: A Benchmark for Agentic Evaluation of Abductive Reasoning in Chemical Structure Eluci...

arXiv - AI · 4 min · 2 days ago

Machine Learning

[2603.25397] A Causal Framework for Evaluating ICU Discharge Strategies

Abstract page for arXiv paper 2603.25397: A Causal Framework for Evaluating ICU Discharge Strategies

arXiv - AI · 3 min · 2 days ago

LLMs

[2603.25374] Supercharging Federated Intelligence Retrieval

Abstract page for arXiv paper 2603.25374: Supercharging Federated Intelligence Retrieval

arXiv - Machine Learning · 3 min · 2 days ago

Machine Learning

[2603.25247] FEAST: Fully Connected Expressive Attention for Spatial Transcriptomics

Abstract page for arXiv paper 2603.25247: FEAST: Fully Connected Expressive Attention for Spatial Transcriptomics

arXiv - AI · 4 min · 2 days ago

LLMs

[2603.25243] FluxEDA: A Unified Execution Infrastructure for Stateful Agentic EDA

Abstract page for arXiv paper 2603.25243: FluxEDA: A Unified Execution Infrastructure for Stateful Agentic EDA

arXiv - AI · 3 min · 2 days ago

Machine Learning

[2603.25311] Practical Efficient Global Optimization is No-regret

Abstract page for arXiv paper 2603.25311: Practical Efficient Global Optimization is No-regret

arXiv - Machine Learning · 3 min · 2 days ago

LLMs

[2603.25226] WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing

Abstract page for arXiv paper 2603.25226: WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing

arXiv - AI · 4 min · 2 days ago

Machine Learning

[2603.25216] A Wireless World Model for AI-Native 6G Networks

Abstract page for arXiv paper 2603.25216: A Wireless World Model for AI-Native 6G Networks

arXiv - AI · 3 min · 2 days ago

Machine Learning

[2603.25257] Mitigating Evasion Attacks in Fog Computing Resource Provisioning Through Proactive Hardening

Abstract page for arXiv paper 2603.25257: Mitigating Evasion Attacks in Fog Computing Resource Provisioning Through Proactive Hardening

arXiv - Machine Learning · 3 min · 2 days ago

Machine Learning

[2603.25209] Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction

Abstract page for arXiv paper 2603.25209: Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction

arXiv - AI · 4 min · 2 days ago

LLMs

[2603.25196] A Decade-Scale Benchmark Evaluating LLMs' Clinical Practice Guidelines Detection and Adherence in Multi-turn Conversations

Abstract page for arXiv paper 2603.25196: A Decade-Scale Benchmark Evaluating LLMs' Clinical Practice Guidelines Detection and Adherence ...

arXiv - AI · 4 min · 2 days ago

Machine Learning

[2603.25251] Does Explanation Correctness Matter? Linking Computational XAI Evaluation to Human Understanding

Abstract page for arXiv paper 2603.25251: Does Explanation Correctness Matter? Linking Computational XAI Evaluation to Human Understanding

arXiv - AI · 4 min · 2 days ago

LLMs

[2603.25187] Probing the Lack of Stable Internal Beliefs in LLMs

Abstract page for arXiv paper 2603.25187: Probing the Lack of Stable Internal Beliefs in LLMs

arXiv - AI · 3 min · 2 days ago

Machine Learning

[2603.25229] An Image Dataset of Common Skin Diseases of Bangladesh and Benchmarking Performance with Machine Learning Models

Abstract page for arXiv paper 2603.25229: An Image Dataset of Common Skin Diseases of Bangladesh and Benchmarking Performance with Machin...

arXiv - Machine Learning · 4 min · 2 days ago

LLMs

[2603.25250] Activation Matters: Test-time Activated Negative Labels for OOD Detection with Vision-Language Models

Abstract page for arXiv paper 2603.25250: Activation Matters: Test-time Activated Negative Labels for OOD Detection with Vision-Language ...

arXiv - AI · 4 min · 2 days ago

Machine Learning

[2603.25170] Knowledge-Guided Adversarial Training for Infrared Object Detection via Thermal Radiation Modeling

Abstract page for arXiv paper 2603.25170: Knowledge-Guided Adversarial Training for Infrared Object Detection via Thermal Radiation Modeling

arXiv - AI · 4 min · 2 days ago

LLMs

[2603.25164] PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems

Abstract page for arXiv paper 2603.25164: PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented ...

arXiv - AI · 4 min · 2 days ago

LLMs

[2603.25145] Learning to Rank Caption Chains for Video-Text Alignment

Abstract page for arXiv paper 2603.25145: Learning to Rank Caption Chains for Video-Text Alignment

arXiv - Machine Learning · 3 min · 2 days ago

LLMs

[2603.25155] Photon: Speedup Volume Understanding with Efficient Multimodal Large Language Models

Abstract page for arXiv paper 2603.25155: Photon: Speedup Volume Understanding with Efficient Multimodal Large Language Models

arXiv - AI · 3 min · 2 days ago

Machine Learning

[2603.25150] Goodness-of-pronunciation without phoneme time alignment

Abstract page for arXiv paper 2603.25150: Goodness-of-pronunciation without phoneme time alignment

arXiv - AI · 3 min · 2 days ago

