AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

LLMs

If AI is really making us more productive... why does it feel like we are working more, not less...?

The promise of AI was the ultimate system optimisation: Efficiency. On paper, the tools are delivering something similar to what they pro...

Reddit - Artificial Intelligence · 1 min · about 3 hours ago

AI Infrastructure

[P] Built an open source tool to find the location of any street picture

Hey guys, Thank you so much for your love and support regarding Netryx Astra V2 last time. Many people are not that technically savvy to ...

Reddit - Machine Learning · 1 min · about 7 hours ago

LLMs

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min · about 11 hours ago

All Content

LLMs

[2603.21465] DRTriton: Large-Scale Synthetic Data Reinforcement Learning for Triton Kernel Generation

Abstract page for arXiv paper 2603.21465: DRTriton: Large-Scale Synthetic Data Reinforcement Learning for Triton Kernel Generation

arXiv - Machine Learning · 4 min · 6 days ago

LLMs

[2603.21389] Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models

Abstract page for arXiv paper 2603.21389: Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models

arXiv - Machine Learning · 3 min · 6 days ago

Machine Learning

[2603.21330] FinRL-X: An AI-Native Modular Infrastructure for Quantitative Trading

Abstract page for arXiv paper 2603.21330: FinRL-X: An AI-Native Modular Infrastructure for Quantitative Trading

arXiv - Machine Learning · 3 min · 6 days ago

LLMs

[2603.21033] TabPFN Extensions for Interpretable Geotechnical Modelling

Abstract page for arXiv paper 2603.21033: TabPFN Extensions for Interpretable Geotechnical Modelling

arXiv - Machine Learning · 4 min · 6 days ago

LLMs

[2603.20975] DiscoUQ: Structured Disagreement Analysis for Uncertainty Quantification in LLM Agent Ensembles

Abstract page for arXiv paper 2603.20975: DiscoUQ: Structured Disagreement Analysis for Uncertainty Quantification in LLM Agent Ensembles

arXiv - Machine Learning · 3 min · 6 days ago

Machine Learning

[2603.20927] Active Inference for Physical AI Agents -- An Engineering Perspective

Abstract page for arXiv paper 2603.20927: Active Inference for Physical AI Agents -- An Engineering Perspective

arXiv - Machine Learning · 4 min · 6 days ago

Machine Learning

[2603.20929] Stability of Sequential and Parallel Coordinate Ascent Variational Inference

Abstract page for arXiv paper 2603.20929: Stability of Sequential and Parallel Coordinate Ascent Variational Inference

arXiv - Machine Learning · 3 min · 6 days ago

Machine Learning

[2603.20711] RoboECC: Multi-Factor-Aware Edge-Cloud Collaborative Deployment for VLA Models

Abstract page for arXiv paper 2603.20711: RoboECC: Multi-Factor-Aware Edge-Cloud Collaborative Deployment for VLA Models

arXiv - Machine Learning · 3 min · 6 days ago

Machine Learning

[2603.20520] CogFormer: Learn All Your Models Once

Abstract page for arXiv paper 2603.20520: CogFormer: Learn All Your Models Once

arXiv - Machine Learning · 3 min · 6 days ago

Machine Learning

[2603.20421] Hawkeye: Reproducing GPU-Level Non-Determinism

Abstract page for arXiv paper 2603.20421: Hawkeye: Reproducing GPU-Level Non-Determinism

arXiv - Machine Learning · 3 min · 6 days ago

LLMs

[2603.20314] VGS-Decoding: Visual Grounding Score Guided Decoding for Hallucination Mitigation in Medical VLMs

Abstract page for arXiv paper 2603.20314: VGS-Decoding: Visual Grounding Score Guided Decoding for Hallucination Mitigation in Medical VLMs

arXiv - Machine Learning · 3 min · 6 days ago

Machine Learning

[2603.20283] FastPFRec: A Fast Personalized Federated Recommendation with Secure Sharing

Abstract page for arXiv paper 2603.20283: FastPFRec: A Fast Personalized Federated Recommendation with Secure Sharing

arXiv - Machine Learning · 3 min · 6 days ago

LLMs

[2603.20218] An experimental study of KV cache reuse strategies in chunk-level caching systems

Abstract page for arXiv paper 2603.20218: An experimental study of KV cache reuse strategies in chunk-level caching systems

arXiv - Machine Learning · 3 min · 6 days ago

LLMs

[2603.20215] Multi-Agent Debate with Memory Masking

Abstract page for arXiv paper 2603.20215: Multi-Agent Debate with Memory Masking

arXiv - Machine Learning · 4 min · 6 days ago

AI Infrastructure

[2510.03367] Viability-Preserving Passive Torque Control

Abstract page for arXiv paper 2510.03367: Viability-Preserving Passive Torque Control

arXiv - Machine Learning · 3 min · 6 days ago

LLMs

[2603.22206] Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

Abstract page for arXiv paper 2603.22206: Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs

arXiv - Machine Learning · 4 min · 6 days ago

LLMs

[2603.22184] Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?

Abstract page for arXiv paper 2603.22184: Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?

arXiv - Machine Learning · 4 min · 6 days ago

LLMs

[2603.22161] Causal Evidence that Language Models use Confidence to Drive Behavior

Abstract page for arXiv paper 2603.22161: Causal Evidence that Language Models use Confidence to Drive Behavior

arXiv - Machine Learning · 4 min · 6 days ago

Machine Learning

[2603.22030] On the Interplay of Priors and Overparametrization in Bayesian Neural Network Posteriors

Abstract page for arXiv paper 2603.22030: On the Interplay of Priors and Overparametrization in Bayesian Neural Network Posteriors

arXiv - Machine Learning · 3 min · 6 days ago

Machine Learning

[2603.21908] SparseDVFS: Sparse-Aware DVFS for Energy-Efficient Edge Inference

Abstract page for arXiv paper 2603.21908: SparseDVFS: Sparse-Aware DVFS for Energy-Efficient Edge Inference

arXiv - Machine Learning · 4 min · 6 days ago
