AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

LLMs

If AI is really making us more productive... why does it feel like we are working more, not less...?

The promise of AI was the ultimate system optimisation: Efficiency. On paper, the tools are delivering something similar to what they pro...

Reddit - Artificial Intelligence · 1 min
AI Infrastructure

[P] Built an open source tool to find the location of any street picture

Hey guys, Thank you so much for your love and support regarding Netryx Astra V2 last time. Many people are not that technically savvy to ...

Reddit - Machine Learning · 1 min
LLMs

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min

All Content

[2603.21705] Data-Free Layer-Adaptive Merging via Fisher Information for Long-to-Short Reasoning LLMs
LLMs

Abstract page for arXiv paper 2603.21705: Data-Free Layer-Adaptive Merging via Fisher Information for Long-to-Short Reasoning LLMs

arXiv - Machine Learning · 4 min
[2603.21656] TrustFed: Enabling Trustworthy Medical AI under Data Privacy Constraints
Machine Learning

Abstract page for arXiv paper 2603.21656: TrustFed: Enabling Trustworthy Medical AI under Data Privacy Constraints

arXiv - Machine Learning · 4 min
[2603.21596] In-network Attack Detection with Federated Deep Learning in IoT Networks: Real Implementation and Analysis
Machine Learning

Abstract page for arXiv paper 2603.21596: In-network Attack Detection with Federated Deep Learning in IoT Networks: Real Implementation a...

arXiv - Machine Learning · 3 min
[2603.21567] Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy
LLMs

Abstract page for arXiv paper 2603.21567: Kolmogorov Complexity Bounds for LLM Steganography and a Perplexity-Based Detection Proxy

arXiv - Machine Learning · 3 min
[2603.21365] TIDE: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference
LLMs

Abstract page for arXiv paper 2603.21365: TIDE: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference

arXiv - Machine Learning · 4 min
[2603.21354] The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project
LLMs

Abstract page for arXiv paper 2603.21354: The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the v...

arXiv - Machine Learning · 4 min
[2603.21331] AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search
Machine Learning

Abstract page for arXiv paper 2603.21331: AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search

arXiv - Machine Learning · 4 min
[2603.21319] Active Inference Agency Formalization, Metrics, and Convergence Assessments
Machine Learning

Abstract page for arXiv paper 2603.21319: Active Inference Agency Formalization, Metrics, and Convergence Assessments

arXiv - Machine Learning · 4 min
[2603.21308] Direct Interval Propagation Methods using Neural-Network Surrogates for Uncertainty Quantification in Physical Systems Surrogate Model
Machine Learning

Abstract page for arXiv paper 2603.21308: Direct Interval Propagation Methods using Neural-Network Surrogates for Uncertainty Quantificat...

arXiv - Machine Learning · 4 min
[2603.21244] Amortized Variational Inference for Logistic Regression with Missing Covariates
Machine Learning

Abstract page for arXiv paper 2603.21244: Amortized Variational Inference for Logistic Regression with Missing Covariates

arXiv - Machine Learning · 4 min
[2603.21105] ResPrune: Text-Conditioned Subspace Reconstruction for Visual Token Pruning in Large Vision-Language Models
LLMs

Abstract page for arXiv paper 2603.21105: ResPrune: Text-Conditioned Subspace Reconstruction for Visual Token Pruning in Large Vision-Lan...

arXiv - Machine Learning · 4 min
[2603.20908] Bayesian Scattering: A Principled Baseline for Uncertainty on Image Data
Machine Learning

Abstract page for arXiv paper 2603.20908: Bayesian Scattering: A Principled Baseline for Uncertainty on Image Data

arXiv - Machine Learning · 3 min
[2603.20842] A Knowledge-Informed Pretrained Model for Causal Discovery
Machine Learning

Abstract page for arXiv paper 2603.20842: A Knowledge-Informed Pretrained Model for Causal Discovery

arXiv - Machine Learning · 3 min
[2603.20829] Beyond the Academic Monoculture: A Unified Framework and Industrial Perspective for Attributed Graph Clustering
Machine Learning

Abstract page for arXiv paper 2603.20829: Beyond the Academic Monoculture: A Unified Framework and Industrial Perspective for Attributed ...

arXiv - Machine Learning · 4 min
[2603.20746] Adversarial Attacks on Locally Private Graph Neural Networks
Machine Learning

Abstract page for arXiv paper 2603.20746: Adversarial Attacks on Locally Private Graph Neural Networks

arXiv - Machine Learning · 3 min
[2603.20616] Beyond Token Eviction: Mixed-Dimension Budget Allocation for Efficient KV Cache Compression
Machine Learning

Abstract page for arXiv paper 2603.20616: Beyond Token Eviction: Mixed-Dimension Budget Allocation for Efficient KV Cache Compression

arXiv - Machine Learning · 3 min
[2603.20492] AE-LLM: Adaptive Efficiency Optimization for Large Language Models
LLMs

Abstract page for arXiv paper 2603.20492: AE-LLM: Adaptive Efficiency Optimization for Large Language Models

arXiv - Machine Learning · 4 min
[2603.17775] CoVerRL: Breaking the Consensus Trap in Label-Free Reasoning via Generator-Verifier Co-Evolution
LLMs

Abstract page for arXiv paper 2603.17775: CoVerRL: Breaking the Consensus Trap in Label-Free Reasoning via Generator-Verifier Co-Evolution

arXiv - Machine Learning · 4 min
[2603.16960] Adversarial attacks against Modern Vision-Language Models
LLMs

Abstract page for arXiv paper 2603.16960: Adversarial attacks against Modern Vision-Language Models

arXiv - AI · 3 min
[2603.14635] Compute Allocation for Reasoning-Intensive Retrieval Agents
LLMs

Abstract page for arXiv paper 2603.14635: Compute Allocation for Reasoning-Intensive Retrieval Agents

arXiv - AI · 3 min