AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

Machine Learning

Why production systems keep making “correct” decisions that are no longer right [D]

I’ve been looking at a recurring failure pattern across AI systems in production. Not model failure, data quality, or infrastructure. S...

Reddit - Machine Learning · 1 min ·
Machine Learning

Free 1 year Nvidia api key

NVIDIA limited-time perk: Claim a free 1-year API Key! Hermes Agent now supports integration with the NVIDIA NIM platform, with real-worl...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Compile English function descriptions into 22 MB neural programs that run locally [P]

We built a system, ProgramAsWeights (PAW), where a neural compiler takes a plain-English function description and produces a "neural prog...

Reddit - Machine Learning · 1 min ·

All Content

[2602.12317] Free Lunch in Medical Image Foundation Model Pre-training via Randomized Synthesis and Disentanglement
LLMs

The paper presents RaSD, a framework for pre-training medical image foundation models using synthetic data, demonstrating superior perfor...

arXiv - Machine Learning · 4 min ·
[2602.12305] OptiML: An End-to-End Framework for Program Synthesis and CUDA Kernel Optimization
LLMs

OptiML is a novel framework that enhances CUDA kernel optimization through program synthesis, leveraging large language models for improv...

arXiv - Machine Learning · 4 min ·
[2602.12288] Energy-Aware Reinforcement Learning for Robotic Manipulation of Articulated Components in Infrastructure Operation and Maintenance
Robotics

This paper presents an energy-aware reinforcement learning framework for robotic manipulation of articulated components in infrastructure...

arXiv - AI · 4 min ·
[2602.12284] A Lightweight LLM Framework for Disaster Humanitarian Information Classification
LLMs

This paper presents a lightweight framework for classifying humanitarian information from social media, enhancing disaster response effic...

arXiv - Machine Learning · 3 min ·
[2511.13494] Language-Guided Invariance Probing of Vision-Language Models
LLMs

This article introduces Language-Guided Invariance Probing (LGIP), a benchmark for evaluating the robustness of vision-language models (V...

arXiv - AI · 3 min ·
[2602.12963] Information-theoretic analysis of world models in optimal reward maximizers
Machine Learning

This paper presents an information-theoretic analysis of world models in optimal reward maximizers, quantifying the information conveyed ...

arXiv - AI · 3 min ·
[2602.12748] X-SYS: A Reference Architecture for Interactive Explanation Systems
Machine Learning

The article presents X-SYS, a reference architecture designed for interactive explanation systems in AI, addressing the challenges of dep...

arXiv - AI · 4 min ·
[2602.12852] WebClipper: Efficient Evolution of Web Agents with Graph-based Trajectory Pruning
AI Agents

WebClipper introduces a novel framework for optimizing web agent trajectories through graph-based pruning, enhancing search efficiency an...

arXiv - AI · 3 min ·
[2602.12670] SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks
LLMs

The paper introduces SkillsBench, a benchmark assessing the effectiveness of agent skills across 86 tasks in 11 domains, revealing signif...

arXiv - AI · 4 min ·
[2602.12586] Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models
LLMs

This paper introduces McDiffuSE, a Monte Carlo Tree Search framework aimed at optimizing slot filling orders in Masked Diffusion Models, ...

arXiv - AI · 3 min ·
Job threats, rogue bots: five hot issues in AI
AI Safety

The article discusses five critical issues surrounding AI at the AI Impact Summit, including job displacement, rogue AI, energy demands, ...

AI Tools & Products · 5 min ·
Amazon Spends $200 Billion on AI Amid Cloud Competition
AI Infrastructure

Amazon is launching a $200 billion capital spending program focused on AI to strengthen its cloud business, AWS, amid rising competition ...

AI Tools & Products · 4 min ·
LLMs

[D] Interview experience for LLM inference systems position

A user shares their preparation strategy for an interview at an AI lab focused on LLM inference systems, detailing coding and design roun...

Reddit - Machine Learning · 1 min ·
Blackstone backs Neysa in up to $1.2B financing as India pushes to build domestic AI infrastructure | TechCrunch
AI Infrastructure

Neysa, an Indian AI infrastructure startup, secures up to $1.2 billion in financing from Blackstone and co-investors to expand its GPU ca...

TechCrunch - AI · 6 min ·
As AI data centers hit power limits, Peak XV backs Indian startup C2i to fix the bottleneck | TechCrunch
AI Infrastructure

C2i, an Indian startup, has raised $15 million to develop a grid-to-GPU power solution aimed at reducing energy losses in AI data centers...

TechCrunch - AI · 6 min ·
Machine Learning

[P]ut a Neural Network in VCV Rack 2 and told it to make sounds that influence my emotion tracking module…

It decided to blow out my right headphone to make me show fear. Some background: I’m working on integrating computer vision and facial tra...

Reddit - Machine Learning · 1 min ·
How we sped up transformer inference 100x for 🤗 API customers
Open Source AI

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

Hugging Face Blog · 5 min ·
Hugging Face on PyTorch / XLA TPUs
Open Source AI

Hugging Face Blog · 10 min ·
Scaling-up BERT Inference on CPU (Part 1)
Open Source AI

Hugging Face Blog · 21 min ·
Few-shot learning in practice: GPT-Neo and the 🤗 Accelerated Inference API
LLMs

Hugging Face Blog · 7 min ·