AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

LLMs

If AI is really making us more productive... why does it feel like we are working more, not less...?

The promise of AI was the ultimate system optimisation: Efficiency. On paper, the tools are delivering something similar to what they pro...

Reddit - Artificial Intelligence · 1 min ·
AI Infrastructure

[P] Built an open source tool to find the location of any street picture

Hey guys, Thank you so much for your love and support regarding Netryx Astra V2 last time. Many people are not that technically savvy to ...

Reddit - Machine Learning · 1 min ·
LLMs

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min ·

All Content

[2603.21465] DRTriton: Large-Scale Synthetic Data Reinforcement Learning for Triton Kernel Generation
LLMs

arXiv - Machine Learning · 4 min ·
[2603.21389] Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models
LLMs

arXiv - Machine Learning · 3 min ·
[2603.21330] FinRL-X: An AI-Native Modular Infrastructure for Quantitative Trading
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.21033] TabPFN Extensions for Interpretable Geotechnical Modelling
LLMs

arXiv - Machine Learning · 4 min ·
[2603.20975] DiscoUQ: Structured Disagreement Analysis for Uncertainty Quantification in LLM Agent Ensembles
LLMs

arXiv - Machine Learning · 3 min ·
[2603.20927] Active Inference for Physical AI Agents -- An Engineering Perspective
Machine Learning

arXiv - Machine Learning · 4 min ·
[2603.20929] Stability of Sequential and Parallel Coordinate Ascent Variational Inference
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.20711] RoboECC: Multi-Factor-Aware Edge-Cloud Collaborative Deployment for VLA Models
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.20520] CogFormer: Learn All Your Models Once
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.20421] Hawkeye: Reproducing GPU-Level Non-Determinism
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.20314] VGS-Decoding: Visual Grounding Score Guided Decoding for Hallucination Mitigation in Medical VLMs
LLMs

arXiv - Machine Learning · 3 min ·
[2603.20283] FastPFRec: A Fast Personalized Federated Recommendation with Secure Sharing
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.20218] An experimental study of KV cache reuse strategies in chunk-level caching systems
LLMs

arXiv - Machine Learning · 3 min ·
[2603.20215] Multi-Agent Debate with Memory Masking
LLMs

arXiv - Machine Learning · 4 min ·
[2510.03367] Viability-Preserving Passive Torque Control
AI Infrastructure

arXiv - Machine Learning · 3 min ·
[2603.22206] Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs
LLMs

arXiv - Machine Learning · 4 min ·
[2603.22184] Revisiting Quantum Code Generation: Where Should Domain Knowledge Live?
LLMs

arXiv - Machine Learning · 4 min ·
[2603.22161] Causal Evidence that Language Models use Confidence to Drive Behavior
LLMs

arXiv - Machine Learning · 4 min ·
[2603.22030] On the Interplay of Priors and Overparametrization in Bayesian Neural Network Posteriors
Machine Learning

arXiv - Machine Learning · 3 min ·
[2603.21908] SparseDVFS: Sparse-Aware DVFS for Energy-Efficient Edge Inference
Machine Learning

arXiv - Machine Learning · 4 min ·