AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

Machine Learning

[R] Fine-tuning services report

If you have some data and want to train or run a small custom model but don't have powerful enough hardware for training, fine-tuning ser...

Reddit - Machine Learning · 1 min
Machine Learning

The AI Chip War is Just Getting Started

Everyone talks about AI models, but the real bottleneck might be hardware. According to a recent study by Roots Analysis: AI chip market ...

Reddit - Artificial Intelligence · 1 min
AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min

All Content

AI Infrastructure

[2603.02788] Agentified Assessment of Logical Reasoning Agents

Abstract page for arXiv paper 2603.02788: Agentified Assessment of Logical Reasoning Agents

arXiv - AI · 3 min
LLMs

[2603.02599] SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

Abstract page for arXiv paper 2603.02599: SUN: Shared Use of Next-token Prediction for Efficient Multi-LLM Disaggregated Serving

arXiv - Machine Learning · 4 min
LLMs

[2603.02237] Concept Heterogeneity-aware Representation Steering

Abstract page for arXiv paper 2603.02237: Concept Heterogeneity-aware Representation Steering

arXiv - AI · 4 min
LLMs

[2603.02236] CUDABench: Benchmarking LLMs for Text-to-CUDA Generation

Abstract page for arXiv paper 2603.02236: CUDABench: Benchmarking LLMs for Text-to-CUDA Generation

arXiv - AI · 3 min
Machine Learning

[2603.02235] Talking with Verifiers: Automatic Specification Generation for Neural Network Verification

Abstract page for arXiv paper 2603.02235: Talking with Verifiers: Automatic Specification Generation for Neural Network Verification

arXiv - AI · 4 min
Machine Learning

[2603.02479] PRISM: Pushing the Frontier of Deep Think via Process Reward Model-Guided Inference

Abstract page for arXiv paper 2603.02479: PRISM: Pushing the Frontier of Deep Think via Process Reward Model-Guided Inference

arXiv - AI · 4 min
Machine Learning

[2603.02230] Generalized Discrete Diffusion with Self-Correction

Abstract page for arXiv paper 2603.02230: Generalized Discrete Diffusion with Self-Correction

arXiv - AI · 3 min
LLMs

[2603.02240] SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Memory Poisoning

Abstract page for arXiv paper 2603.02240: SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Mem...

arXiv - AI · 3 min
Machine Learning

[2603.02214] Federated Inference: Toward Privacy-Preserving Collaborative and Incentivized Model Serving

Abstract page for arXiv paper 2603.02214: Federated Inference: Toward Privacy-Preserving Collaborative and Incentivized Model Serving

arXiv - Machine Learning · 3 min
Machine Learning

[2603.02217] Is Retraining-Free Enough? The Necessity of Router Calibration for Efficient MoE Compression

Abstract page for arXiv paper 2603.02217: Is Retraining-Free Enough? The Necessity of Router Calibration for Efficient MoE Compression

arXiv - AI · 3 min
LLMs

[D] Quantified analysis of 2,218 Gary Marcus claims - two independent LLM pipelines, scored against evidence

Built a dataset scoring every testable claim from Marcus's 474 Substack posts. Two pipelines (Claude Opus 4.6 and ChatGPT Codex) analyzed...

Reddit - Machine Learning · 1 min
Machine Learning

[P] We made GoodSeed, a pleasant ML experiment tracker

GoodSeed v0.3.0 🎉 My friend and I are pleased to announce GoodSeed, an ML experiment tracker which we are now using as a replacement for ...

Reddit - Machine Learning · 1 min
LLMs

[D] Predicting total cost of agentic LLM workflows - is there a research gap around output token count and chain depth estimation?

Working on a practical problem that I think has an interesting ML angle. In agentic LLM workflows (tool use, multi-step reasoning, ReAct-...

Reddit - Machine Learning · 1 min
LLMs

LLMs can unmask pseudonymous users at scale with surprising accuracy - Ars Technica

Pseudonymity has never been perfect for preserving privacy. Soon it may be pointless.

Ars Technica - AI · 7 min
Machine Learning

[P] On-device Qwen3-TTS (1.7B/0.6B) inference on iOS and macOS via MLX-Swift — voice cloning, voice design, and streaming TTS with no cloud

Hey r/MachineLearning. I'm a solo dev working on on-device TTS using MLX-Swift with Qwen3-TTS. 1.7B model on macOS, 0.6B on iOS, quantize...

Reddit - Machine Learning · 1 min
Machine Learning

[R] To the Women of Machine Learning - I'm Hiring!

It's not a secret that ML Engineers are predominantly men. Still, as I work to build a foundational ML team, I am being intentional about...

Reddit - Machine Learning · 1 min
Machine Learning

[2511.01266] MotionStream: Real-Time Video Generation with Interactive Motion Controls

Abstract page for arXiv paper 2511.01266: MotionStream: Real-Time Video Generation with Interactive Motion Controls

arXiv - Machine Learning · 4 min
LLMs

[2510.13849] Language steering in latent space to mitigate unintended code-switching

Abstract page for arXiv paper 2510.13849: Language steering in latent space to mitigate unintended code-switching

arXiv - Machine Learning · 3 min
Machine Learning

[2509.22459] Universal Inverse Distillation for Matching Models with Real-Data Supervision (No GANs)

Abstract page for arXiv paper 2509.22459: Universal Inverse Distillation for Matching Models with Real-Data Supervision (No GANs)

arXiv - Machine Learning · 4 min
NLP

[2509.21764] CubistMerge: Spatial-Preserving Token Merging For Diverse ViT Backbones

Abstract page for arXiv paper 2509.21764: CubistMerge: Spatial-Preserving Token Merging For Diverse ViT Backbones

arXiv - Machine Learning · 4 min