[D] MXFP8 GEMM: Up to 99% of cuBLAS performance using CUDA + PTX
New blog post by Daniel Vega-Myhre (Meta/PyTorch) illustrating GEMM design for FP8, including deep-dives into all the constraints and des...
GPUs, training clusters, MLOps, and deployment
UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...
Abstract page for arXiv paper 2603.15159: To See is Not to Master: Teaching LLMs to Use Private Libraries for Code Generation
Abstract page for arXiv paper 2603.20285: AgentComm-Bench: Stress-Testing Cooperative Embodied AI Under Latency, Packet Loss, and Bandwid...
Abstract page for arXiv paper 2603.20213: AgenticGEO: A Self-Evolving Agentic System for Generative Engine Optimization
Most discussions around AI safety focus on what models know or whether outputs are correct. But since 2019, I’ve been working on somethin...
He then seemed to slightly walk back the claim.
OpenAI CEO Sam Altman is stepping down as board chair of Helion. His departure comes as reports that the two companies are negotiating a ...
Gimlet Labs just raised an $80 million Series A for tech that lets AI run across NVIDIA, AMD, Intel, ARM, Cerebras and d-Matrix chips, si...
Helion is reportedly negotiating a deal that would see it sell 12.5% of its power output to OpenAI.
I've been building no-magic — a collection of 47 single-file Python implementations of the algorithms behind modern AI. No PyTorch, no Te...
ok so I’ve been going down a rabbit hole on this for the past few weeks for a piece I’m writing and honestly the amount of marketing BS i...
MIT researchers explore how AI can optimize the power grid, enhancing efficiency, resilience against extreme weather, and supporting rene...
Abstract page for arXiv paper 2511.17885: FastMMoE: Accelerating Multimodal Large Language Models through Dynamic Expert Activation and R...
Abstract page for arXiv paper 2409.06271: A new paradigm for global sensitivity analysis
Abstract page for arXiv paper 2405.01425: In-and-Out: Algorithmic Diffusion for Sampling Convex Bodies
Abstract page for arXiv paper 2603.18464: AcceRL: A Distributed Asynchronous Reinforcement Learning and World Model Framework for Vision-...
Abstract page for arXiv paper 2602.10014: A Task-Centric Theory for Iterative Self-Improvement with Easy-to-Hard Curricula
Abstract page for arXiv paper 2409.19435: Simulation-based Inference with the Python Package sbijax
Abstract page for arXiv paper 2603.19994: Evaluating Test-Time Adaptation For Facial Expression Recognition Under Natural Cross-Dataset D...
Abstract page for arXiv paper 2603.19949: TAPAS: Efficient Two-Server Asymmetric Private Aggregation Beyond Prio(+)
Abstract page for arXiv paper 2603.19375: Automated Membership Inference Attacks: Discovering MIA Signal Computations using LLM Agents