AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

Machine Learning

[D] MXFP8 GEMM: Up to 99% of cuBLAS performance using CUDA + PTX

New blog post by Daniel Vega-Myhre (Meta/PyTorch) illustrating GEMM design for FP8, including deep-dives into all the constraints and des...

Reddit - Machine Learning · 1 min · about 1 hour ago

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 2 hours ago

LLMs

[2603.15159] To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation

Abstract page for arXiv paper 2603.15159: To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation

arXiv - AI · 4 min · about 4 hours ago

All Content

Robotics

[2603.20285] AgentComm-Bench: Stress-Testing Cooperative Embodied AI Under Latency, Packet Loss, and Bandwidth Collapse

Abstract page for arXiv paper 2603.20285: AgentComm-Bench: Stress-Testing Cooperative Embodied AI Under Latency, Packet Loss, and Bandwid...

arXiv - AI · 4 min · 6 days ago

LLMs

[2603.20213] AgenticGEO: A Self-Evolving Agentic System for Generative Engine Optimization

Abstract page for arXiv paper 2603.20213: AgenticGEO: A Self-Evolving Agentic System for Generative Engine Optimization

arXiv - Machine Learning · 4 min · 6 days ago

Machine Learning

LightRest Ltd's 'LAGK' Initiative - Leverage-Aware Governance Kernel

Most discussions around AI safety focus on what models know or whether outputs are correct. But since 2019, I’ve been working on somethin...

Reddit - Artificial Intelligence · 1 min · 6 days ago

AI Infrastructure

Nvidia CEO Jensen Huang says ‘I think we’ve achieved AGI’ | The Verge

He then seemed to slightly walk back the claim.

The Verge - AI · 4 min · 7 days ago

AI Infrastructure

Sam Altman-backed fusion startup Helion in talks to sell power to OpenAI | TechCrunch

OpenAI CEO Sam Altman is stepping down as board chair of Helion. His departure comes as reports that the two companies are negotiating a ...

TechCrunch - AI · 5 min · 7 days ago

Machine Learning

Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way | TechCrunch

Gimlet Labs just raised an $80 million Series A for tech that lets AI run across NVIDIA, AMD, Intel, ARM, Cerebras and d-Matrix chips, si...

TechCrunch - AI · 5 min · 7 days ago

AI Infrastructure

Jensen Huang compares not using AI to using "paper and pencil" to design chips, as he explains Nvidia's massive token budget

Reddit - Artificial Intelligence · 1 min · 7 days ago

AI Infrastructure

Sam Altman-backed fusion startup Helion in talks with OpenAI | TechCrunch

Helion is reportedly negotiating a deal that would see it sell 12.5% of its power output to OpenAI.

TechCrunch - AI · 5 min · 7 days ago

LLMs

[P] no-magic: 47 AI/ML algorithms implemented from scratch in single-file, zero-dependency Python

I've been building no-magic — a collection of 47 single-file Python implementations of the algorithms behind modern AI. No PyTorch, no Te...

Reddit - Machine Learning · 1 min · 7 days ago

Machine Learning

[D] The "serverless GPU" market is getting crowded — a breakdown of how different platforms actually differ

ok so I’ve been going down a rabbit hole on this for the past few weeks for a piece I’m writing and honestly the amount of marketing BS i...

Reddit - Machine Learning · 1 min · 7 days ago

Machine Learning

3 Questions: How AI could optimize the power grid

MIT researchers explore how AI can optimize the power grid, enhancing efficiency, resilience against extreme weather, and supporting rene...

AI News - General · 9 min · 7 days ago

LLMs

[2511.17885] FastMMoE: Accelerating Multimodal Large Language Models through Dynamic Expert Activation and Routing-Aware Token Pruning

Abstract page for arXiv paper 2511.17885: FastMMoE: Accelerating Multimodal Large Language Models through Dynamic Expert Activation and R...

arXiv - Machine Learning · 4 min · 7 days ago

AI Infrastructure

[2409.06271] A new paradigm for global sensitivity analysis

Abstract page for arXiv paper 2409.06271: A new paradigm for global sensitivity analysis

arXiv - Machine Learning · 4 min · 7 days ago

Machine Learning

[2405.01425] In-and-Out: Algorithmic Diffusion for Sampling Convex Bodies

Abstract page for arXiv paper 2405.01425: In-and-Out: Algorithmic Diffusion for Sampling Convex Bodies

arXiv - Machine Learning · 3 min · 7 days ago

Machine Learning

[2603.18464] AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-Language-Action Models

Abstract page for arXiv paper 2603.18464: AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-...

arXiv - Machine Learning · 3 min · 7 days ago

LLMs

[2602.10014] A Task-Centric Theory for Iterative Self-Improvement with Easy-to-Hard Curricula

Abstract page for arXiv paper 2602.10014: A Task-Centric Theory for Iterative Self-Improvement with Easy-to-Hard Curricula

arXiv - Machine Learning · 3 min · 7 days ago

Machine Learning

[2409.19435] Simulation-based Inference with the Python Package sbijax

Abstract page for arXiv paper 2409.19435: Simulation-based Inference with the Python Package sbijax

arXiv - Machine Learning · 3 min · 7 days ago

Machine Learning

[2603.19994] Evaluating Test-Time Adaptation For Facial Expression Recognition Under Natural Cross-Dataset Distribution Shifts

Abstract page for arXiv paper 2603.19994: Evaluating Test-Time Adaptation For Facial Expression Recognition Under Natural Cross-Dataset D...

arXiv - Machine Learning · 3 min · 7 days ago

AI Infrastructure

[2603.19949] TAPAS: Efficient Two-Server Asymmetric Private Aggregation Beyond Prio(+)

Abstract page for arXiv paper 2603.19949: TAPAS: Efficient Two-Server Asymmetric Private Aggregation Beyond Prio(+)

arXiv - Machine Learning · 4 min · 7 days ago

LLMs

[2603.19375] Automated Membership Inference Attacks: Discovering MIA Signal Computations using LLM Agents

Abstract page for arXiv paper 2603.19375: Automated Membership Inference Attacks: Discovering MIA Signal Computations using LLM Agents

arXiv - Machine Learning · 3 min · 7 days ago

