AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Machine Learning

ML/AI Engineer laid off from big tech, need your help!

I recently left a very toxic company that was taking a serious toll on my mental and physical health. I gave everything I had and it cost...

Reddit - ML Jobs · 1 min ·
Machine Learning

Trials and tribulations fine-tuning & deploying Gemma-4 [P]

Hey all, Our ML team spent some time this week getting training and deployments working for Gemma-4, and wanted to document all the thing...

Reddit - Machine Learning · 1 min ·

All Content

[2502.15110] Variational phylogenetic inference with products over bipartitions
Machine Learning

This paper presents a novel variational Bayesian method for inferring ultrametric phylogenetic trees, improving accuracy and efficiency i...

arXiv - Machine Learning · 3 min ·
[2409.16407] Towards Representation Learning for Weighting Problems in Design-Based Causal Inference
Machine Learning

This article explores the use of representation learning to improve weighting methods in design-based causal inference, addressing challe...

arXiv - Machine Learning · 4 min ·
[2312.17111] Online Tensor Inference
Machine Learning

The paper presents a novel framework for online tensor inference, addressing the challenges of real-time data processing in applications ...

arXiv - Machine Learning · 4 min ·
[2602.09127] Epistemic Throughput: Fundamental Limits of Attention-Constrained Inference
Machine Learning

This paper explores the concept of 'epistemic throughput' in attention-constrained inference, analyzing how generative AI systems can man...

arXiv - Machine Learning · 4 min ·
[2602.07744] Riemannian MeanFlow
Machine Learning

The paper introduces Riemannian MeanFlow (RMF), a novel framework for generative modeling on Riemannian manifolds, significantly reducing...

arXiv - Machine Learning · 3 min ·
[2602.07263] tLoRA: Efficient Multi-LoRA Training with Elastic Shared Super-Models
LLMs

The paper introduces tLoRA, a framework designed for efficient multi-LoRA training of large language models, improving training throughpu...

arXiv - Machine Learning · 4 min ·
[2512.09654] Membership and Dataset Inference Attacks on Large Audio Generative Models
Machine Learning

This paper explores membership and dataset inference attacks on large audio generative models, assessing their implications for copyright...

arXiv - Machine Learning · 4 min ·
[2510.11834] Don't Walk the Line: Boundary Guidance for Filtered Generation
Machine Learning

The paper presents Boundary Guidance, a reinforcement learning method designed to improve the safety and utility of generative models by ...

arXiv - Machine Learning · 3 min ·
[2509.10406] Multipole Semantic Attention: A Fast Approximation of Softmax Attention for Pretraining
Machine Learning

The paper introduces Multipole Semantic Attention (MuSe), a method that accelerates pretraining of transformers on long sequences by 36% ...

arXiv - Machine Learning · 3 min ·
[2508.07675] Semantic Caching for Low-Cost LLM Serving: From Offline Learning to Online Adaptation
LLMs

This article presents a framework for semantic caching in large language models (LLMs) to reduce inference costs by leveraging semantic s...

arXiv - Machine Learning · 4 min ·
[2508.01669] Bridging Generalization Gap of Heterogeneous Federated Clients Using Generative Models
Machine Learning

This paper presents a novel model-heterogeneous federated learning framework that enhances generalization performance for clients with di...

arXiv - Machine Learning · 4 min ·
[2508.01504] Instruction-based Time Series Editing
Generative AI

The paper introduces Instruction-based Time Series Editing, a novel approach that allows users to modify time series data using natural l...

arXiv - Machine Learning · 4 min ·
[2507.09043] GAGA: Gaussianity-Aware Gaussian Approximation for Efficient 3D Molecular Generation
Machine Learning

The paper presents GAGA, a method enhancing the efficiency of 3D molecular generation by leveraging Gaussian approximations, improving bo...

arXiv - Machine Learning · 4 min ·
[2505.22650] On Learning Verifiers and Implications to Chain-of-Thought Reasoning
LLMs

This paper explores learning verifiers for Chain-of-Thought reasoning in natural language, addressing the challenges of incorrect inferen...

arXiv - Machine Learning · 4 min ·
[2505.12988] Optimal Formats for Weight Quantisation
Machine Learning

This paper presents a systematic framework for designing weight quantisation formats in deep learning, demonstrating that variable-length...

arXiv - Machine Learning · 4 min ·
[2503.03704] Memory Injection Attacks on LLM Agents via Query-Only Interaction
LLMs

The paper discusses Memory Injection Attacks (MINJA) on LLM agents, demonstrating how attackers can manipulate agent memory through query...

arXiv - Machine Learning · 4 min ·
[2602.13168] Realistic Face Reconstruction from Facial Embeddings via Diffusion Models
Machine Learning

This paper presents a novel framework for reconstructing realistic high-resolution face images from facial embeddings using diffusion mod...

arXiv - Machine Learning · 3 min ·
[2602.13024] FedHENet: A Frugal Federated Learning Framework for Heterogeneous Environments
Machine Learning

FedHENet introduces a frugal federated learning framework that enhances energy efficiency and stability in heterogeneous environments whi...

arXiv - Machine Learning · 3 min ·
[2602.12904] Nonparametric Contextual Online Bilateral Trade
AI Infrastructure

This paper explores nonparametric contextual online bilateral trade, presenting an algorithm that optimizes trade pricing based on contex...

arXiv - Machine Learning · 3 min ·
[2602.12923] Annealing in variational inference mitigates mode collapse: A theoretical study on Gaussian mixtures
Machine Learning

This article presents a theoretical analysis of how annealing strategies can mitigate mode collapse in variational inference, particularl...

arXiv - Machine Learning · 3 min ·