AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 1 hour ago

AI Infrastructure

[P] GPU friendly lossless 12-bit BF16 format with 0.03% escape rate and 1 integer ADD decode works for AMD & NVIDIA

Hi everyone :) I just released a new research prototype. It’s a lossless BF16 compression format that stores weights in 12 bits by replac...

Reddit - Machine Learning · 1 min · about 3 hours ago

AI Infrastructure

OpenAI’s Fidji Simo Is Taking Medical Leave Amid an Executive Shake-Up | WIRED

The company is undergoing major leadership restructuring as its CEO of AGI deployment goes on leave for “several weeks.”

Wired - AI · 5 min · about 6 hours ago

All Content

Machine Learning

[2602.23903] SegMate: Asymmetric Attention-Based Lightweight Architecture for Efficient Multi-Organ Segmentation

Abstract page for arXiv paper 2602.23903: SegMate: Asymmetric Attention-Based Lightweight Architecture for Efficient Multi-Organ Segmenta...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2309.10370] Geometric structure of shallow neural networks and constructive ${\mathcal L}^2$ cost minimization

Abstract page for arXiv paper 2309.10370: Geometric structure of shallow neural networks and constructive ${\mathcal L}^2$ cost minimization

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.23561] VaSST: Variational Inference for Symbolic Regression using Soft Symbolic Trees

Abstract page for arXiv paper 2602.23561: VaSST: Variational Inference for Symbolic Regression using Soft Symbolic Trees

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2509.20067] MACD: Multi-Agent Clinical Diagnosis with Self-Learned Knowledge for LLM

Abstract page for arXiv paper 2509.20067: MACD: Multi-Agent Clinical Diagnosis with Self-Learned Knowledge for LLM

arXiv - AI · 4 min · about 1 month ago

LLMs

[2507.19364] Integrating LLM in Agent-Based Social Simulation: Opportunities and Challenges

Abstract page for arXiv paper 2507.19364: Integrating LLM in Agent-Based Social Simulation: Opportunities and Challenges

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.24286] CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

Abstract page for arXiv paper 2602.24286: CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.24251] Histopathology Image Normalization via Latent Manifold Compaction

Abstract page for arXiv paper 2602.24251: Histopathology Image Normalization via Latent Manifold Compaction

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.24231] Adaptive Combinatorial Experimental Design: Pareto Optimality for Decision-Making and Inference

Abstract page for arXiv paper 2602.24231: Adaptive Combinatorial Experimental Design: Pareto Optimality for Decision-Making and Inference

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.24044] Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving

Abstract page for arXiv paper 2602.24044: Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.24083] Neural Diffusion Intensity Models for Point Process Data

Abstract page for arXiv paper 2602.24083: Neural Diffusion Intensity Models for Point Process Data

arXiv - Machine Learning · 3 min · about 1 month ago

AI Infrastructure

[2602.23935] Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing

Abstract page for arXiv paper 2602.23935: Green or Fast? Learning to Balance Cold Starts and Idle Carbon in Serverless Computing

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.23949] HotelQuEST: Balancing Quality and Efficiency in Agentic Search

Abstract page for arXiv paper 2602.23949: HotelQuEST: Balancing Quality and Efficiency in Agentic Search

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.24066] pathsig: A GPU-Accelerated Library for Truncated and Projected Path Signatures

Abstract page for arXiv paper 2602.24066: pathsig: A GPU-Accelerated Library for Truncated and Projected Path Signatures

arXiv - Machine Learning · 3 min · about 1 month ago

AI Infrastructure

[2602.23994] MINT: Multimodal Imaging-to-Speech Knowledge Transfer for Early Alzheimer's Screening

Abstract page for arXiv paper 2602.23994: MINT: Multimodal Imaging-to-Speech Knowledge Transfer for Early Alzheimer's Screening

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.23968] Learning Generation Orders for Masked Discrete Diffusion Models via Variational Inference

Abstract page for arXiv paper 2602.23968: Learning Generation Orders for Masked Discrete Diffusion Models via Variational Inference

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.23881] LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding

Abstract page for arXiv paper 2602.23881: LK Losses: Direct Acceptance Rate Optimization for Speculative Decoding

arXiv - Machine Learning · 4 min · about 1 month ago

AI Infrastructure

[2602.23800] Operationalizing Longitudinal Causal Discovery Under Real-World Workflow Constraints

Abstract page for arXiv paper 2602.23800: Operationalizing Longitudinal Causal Discovery Under Real-World Workflow Constraints

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.23798] MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models

Abstract page for arXiv paper 2602.23798: MPU: Towards Secure and Privacy-Preserving Knowledge Unlearning for Large Language Models

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.23722] SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud

Abstract page for arXiv paper 2602.23722: SLA-Aware Distributed LLM Inference Across Device-RAN-Cloud

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.23795] GRAIL: Post-hoc Compensation by Linear Reconstruction for Compressed Networks

Abstract page for arXiv paper 2602.23795: GRAIL: Post-hoc Compensation by Linear Reconstruction for Compressed Networks

arXiv - Machine Learning · 4 min · about 1 month ago
