AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min
[2603.12372] Efficient Reasoning with Balanced Thinking
Machine Learning · arXiv - Machine Learning · 4 min

[2510.13714] DeDelayed: Deleting Remote Inference Delay via On-Device Correction
Machine Learning · arXiv - Machine Learning · 4 min

All Content

[2602.00640] Combinatorial Bandit Bayesian Optimization for Tensor Outputs
Machine Learning · arXiv - Machine Learning · 4 min

[2601.20088] Quantization-Aware Distillation for NVFP4 Inference Accuracy Recovery
LLMs · arXiv - Machine Learning · 4 min

[2601.19961] MeanCache: From Instantaneous to Average Velocity for Accelerating Flow Matching Inference
Machine Learning · arXiv - Machine Learning · 4 min

[2601.04786] AgentOCR: Reimagining Agent History via Optical Self-Compression
LLMs · arXiv - Machine Learning · 4 min

[2511.08616] Reasoning on Time-Series for Financial Technical Analysis
LLMs · arXiv - Machine Learning · 4 min

[2511.01191] Self-Harmony: Learning to Harmonize Self-Supervision and Self-Play in Test-Time Reinforcement Learning
Machine Learning · arXiv - Machine Learning · 4 min

[2512.03324] Cache What Lasts: Token Retention for Memory-Bounded KV Cache in LLMs
LLMs · arXiv - Machine Learning · 4 min

[2511.19473] WavefrontDiffusion: Dynamic Decoding Schedule for Improved Reasoning
LLMs · arXiv - Machine Learning · 4 min

[2510.18871] How Do LLMs Use Their Depth?
LLMs · arXiv - AI · 4 min

[2510.16028] TAO: Tolerance-Aware Optimistic Verification for Floating-Point Neural Networks
Machine Learning · arXiv - Machine Learning · 4 min

[2510.21910] Adversarial Déjà Vu: Jailbreak Dictionary Learning for Stronger Generalization to Unseen Attacks
LLMs · arXiv - Machine Learning · 4 min

[2510.20264] Optimistic Task Inference for Behavior Foundation Models
LLMs · arXiv - Machine Learning · 4 min

[2510.15301] Latent Diffusion Model without Variational Autoencoder
Machine Learning · arXiv - AI · 4 min

[2510.18245] Scaling Laws Meet Model Architecture: Toward Inference-Efficient LLMs
LLMs · arXiv - Machine Learning · 4 min

[2510.09462] Adaptive Attacks on Trusted Monitors Subvert AI Control Protocols
LLMs · arXiv - Machine Learning · 4 min

[2510.07940] TTOM: Test-Time Optimization and Memorization for Compositional Video Generation
LLMs · arXiv - Machine Learning · 4 min

[2510.07959] DISCO: Diversifying Sample Condensation for Efficient Model Evaluation
Machine Learning · arXiv - Machine Learning · 4 min

[2510.07746] t-SNE Exaggerates Clusters, Provably
NLP · arXiv - Machine Learning · 3 min

[2510.05109] Tiny but Mighty: A Software-Hardware Co-Design Approach for Efficient Multimodal Inference on Battery-Powered Small Devices
LLMs · arXiv - AI · 4 min

[2510.03638] Expressive Power of Implicit Models: Rich Equilibria and Test-Time Scaling
Machine Learning · arXiv - AI · 4 min
