Top Machine Learning This Week

The most engaging machine learning content from this week, curated by AI News.

  1. [2605.00123] Minimal, Local, Causal Explanations for Jailbreak Success in Large Language Models

    arXiv - AI · 7 days ago
  2. [2605.00224] TUR-DPO: Topology- and Uncertainty-Aware Direct Preference Optimization

    arXiv - AI · 7 days ago
  3. [2605.00245] ARMOR 2025: A Military-Aligned Benchmark for Evaluating Large Language Model Safety Beyond Civilian Contexts

    arXiv - AI · 7 days ago
  4. [2605.00300] Token Arena: A Continuous Benchmark Unifying Energy and Cognition in AI Inference

    arXiv - AI · 7 days ago
  5. [2605.00334] AgentFloor: How Far Up the tool use Ladder Can Small Open-Weight Models Go?

    arXiv - AI · 7 days ago
  6. [2605.00412] Physically Native World Models: A Hamiltonian Perspective on Generative World Modeling

    arXiv - AI · 7 days ago
  7. [2605.00425] AEM: Adaptive Entropy Modulation for Multi-Turn Agentic Reinforcement Learning

    arXiv - AI · 7 days ago
  8. [2605.00737] To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling

    arXiv - AI · 7 days ago
  9. [2605.00742] Position: agentic AI orchestration should be Bayes-consistent

    arXiv - AI · 7 days ago
  10. [2604.28031] Models Recall What They Violate: Constraint Adherence in Multi-Turn LLM Ideation

    arXiv - AI · 7 days ago
  11. [2605.00005] Cloud Is Closer Than It Appears: Revisiting the Tradeoffs of Distributed Real-Time Inference

    arXiv - AI · 7 days ago
  12. [2605.00007] Mean-Field Path-Integral Diffusion: From Samples to Interacting Agents

    arXiv - AI · 7 days ago
  13. [2605.00011] FedACT: Concurrent Federated Intelligence across Heterogeneous Data Sources

    arXiv - AI · 7 days ago
  14. [2605.00012] Exploring LLM biases to manipulate AI search overview

    arXiv - AI · 7 days ago
  15. [2605.00015] TimeRFT: Stimulating Generalizable Time Series Forecasting for TSFMs via Reinforcement Finetuning

    arXiv - AI · 7 days ago
  16. [2605.00022] Putting HUMANS first: Efficient LAM Evaluation with Human Preference Alignment

    arXiv - AI · 7 days ago
  17. [2605.00020] AirFM-DDA: Air-Interface Foundation Model in the Delay-Doppler-Angle Domain for AI-Native 6G

    arXiv - AI · 7 days ago
  18. [2605.00056] Smart Ensemble Learning Framework for Predicting Groundwater Heavy Metal Pollution

    arXiv - AI · 7 days ago
  19. [2605.00063] A Survey of Reasoning-Intensive Retrieval: Progress and Challenges

    arXiv - AI · 7 days ago
  20. [2605.00060] TADI: Tool-Augmented Drilling Intelligence via Agentic LLM Orchestration over Heterogeneous Wellsite Data

    arXiv - AI · 7 days ago
