AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · 15 minutes ago

Machine Learning

ML/AI Engineer laid off from big tech, need your help!

I recently left a very toxic company that was taking a serious toll on my mental and physical health. I gave everything I had and it cost...

Reddit - ML Jobs · 1 min · about 3 hours ago

Machine Learning

Trials and tribulations fine-tuning & deploying Gemma-4 [P]

Hey all, Our ML team spent some time this week getting training and deployments working for Gemma-4, and wanted to document all the thing...

Reddit - Machine Learning · 1 min · about 8 hours ago

All Content

Machine Learning

[2502.15110] Variational phylogenetic inference with products over bipartitions

This paper presents a novel variational Bayesian method for inferring ultrametric phylogenetic trees, improving accuracy and efficiency i...

arXiv - Machine Learning · 3 min · 2 months ago

Machine Learning

[2409.16407] Towards Representation Learning for Weighting Problems in Design-Based Causal Inference

This article explores the use of representation learning to improve weighting methods in design-based causal inference, addressing challe...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2312.17111] Online Tensor Inference

The paper presents a novel framework for online tensor inference, addressing the challenges of real-time data processing in applications ...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2602.09127] Epistemic Throughput: Fundamental Limits of Attention-Constrained Inference

This paper explores the concept of 'epistemic throughput' in attention-constrained inference, analyzing how generative AI systems can man...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2602.07744] Riemannian MeanFlow

The paper introduces Riemannian MeanFlow (RMF), a novel framework for generative modeling on Riemannian manifolds, significantly reducing...

arXiv - Machine Learning · 3 min · 2 months ago

LLMs

[2602.07263] tLoRA: Efficient Multi-LoRA Training with Elastic Shared Super-Models

The paper introduces tLoRA, a framework designed for efficient multi-LoRA training of large language models, improving training throughpu...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2512.09654] Membership and Dataset Inference Attacks on Large Audio Generative Models

This paper explores membership and dataset inference attacks on large audio generative models, assessing their implications for copyright...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2510.11834] Don't Walk the Line: Boundary Guidance for Filtered Generation

The paper presents Boundary Guidance, a reinforcement learning method designed to improve the safety and utility of generative models by ...

arXiv - Machine Learning · 3 min · 2 months ago

Machine Learning

[2509.10406] Multipole Semantic Attention: A Fast Approximation of Softmax Attention for Pretraining

The paper introduces Multipole Semantic Attention (MuSe), a method that accelerates pretraining of transformers on long sequences by 36% ...

arXiv - Machine Learning · 3 min · 2 months ago

LLMs

[2508.07675] Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation

This article presents a framework for semantic caching in large language models (LLMs) to reduce inference costs by leveraging semantic s...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2508.01669] Bridging Generalization Gap of Heterogeneous Federated Clients Using Generative Models

This paper presents a novel model-heterogeneous federated learning framework that enhances generalization performance for clients with di...

arXiv - Machine Learning · 4 min · 2 months ago

Generative AI

[2508.01504] Instruction-based Time Series Editing

The paper introduces Instruction-based Time Series Editing, a novel approach that allows users to modify time series data using natural l...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2507.09043] GAGA: Gaussianity-Aware Gaussian Approximation for Efficient 3D Molecular Generation

The paper presents GAGA, a method enhancing the efficiency of 3D molecular generation by leveraging Gaussian approximations, improving bo...

arXiv - Machine Learning · 4 min · 2 months ago

LLMs

[2505.22650] On Learning Verifiers and Implications to Chain-of-Thought Reasoning

This paper explores learning verifiers for Chain-of-Thought reasoning in natural language, addressing the challenges of incorrect inferen...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2505.12988] Optimal Formats for Weight Quantisation

This paper presents a systematic framework for designing weight quantisation formats in deep learning, demonstrating that variable-length...

arXiv - Machine Learning · 4 min · 2 months ago

LLMs

[2503.03704] Memory Injection Attacks on LLM Agents via Query-Only Interaction

The paper discusses Memory Injection Attacks (MINJA) on LLM agents, demonstrating how attackers can manipulate agent memory through query...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2602.13168] Realistic Face Reconstruction from Facial Embeddings via Diffusion Models

This paper presents a novel framework for reconstructing realistic high-resolution face images from facial embeddings using diffusion mod...

arXiv - Machine Learning · 3 min · 2 months ago

Machine Learning

[2602.13024] FedHENet: A Frugal Federated Learning Framework for Heterogeneous Environments

FedHENet introduces a frugal federated learning framework that enhances energy efficiency and stability in heterogeneous environments whi...

arXiv - Machine Learning · 3 min · 2 months ago

AI Infrastructure

[2602.12904] Nonparametric Contextual Online Bilateral Trade

This paper explores nonparametric contextual online bilateral trade, presenting an algorithm that optimizes trade pricing based on contex...

arXiv - Machine Learning · 3 min · 2 months ago

Machine Learning

[2602.12923] Annealing in variational inference mitigates mode collapse: A theoretical study on Gaussian mixtures

This article presents a theoretical analysis of how annealing strategies can mitigate mode collapse in variational inference, particularl...

arXiv - Machine Learning · 3 min · 2 months ago
