AI Infrastructure

GPUs, training clusters, MLOps, and deployment


Top This Week

Machine Learning

WTF. It's real. AllBirds (the shoe company) is pivoting to inference.

I'm profoundly ambivalent about how to feel about this: is it great -- what a scrappy, bold pivot! Or wildly dumb -- it's so far from their c...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago

AI Infrastructure

Allbirds Is Pivoting to AI Compute. Sure, Why Not | WIRED

Once a $4 billion apparel juggernaut, Allbirds will rebrand as NewBird AI, a “GPU-as-a-Service” company. Hey, if you can't beat ’em, join...

Wired - AI · 5 min · about 2 hours ago

Machine Learning

Trained a Qwen2.5-0.5B-Instruct bf16 model on Reddit post summarization task with GRPO written from scratch in PyTorch - updates! [P]

So, yesterday's run was a success and I did get an avg rollout length of about 64 tokens, as shown in the attached image! This was with quality_re...

Reddit - Machine Learning · 1 min · about 10 hours ago

All Content

Machine Learning

[2509.26335] TrackCore-F: Deploying Transformer-Based Subatomic Particle Tracking on FPGAs

The paper discusses TrackCore-F, a methodology for deploying Transformer-based models for subatomic particle tracking on FPGAs, highlight...

arXiv - Machine Learning · 3 min · about 2 months ago

NLP

[2509.18129] Pareto-optimal Trade-offs Between Communication and Computation with Flexible Gradient Tracking

This paper presents FlexGT, a method for optimizing distributed stochastic problems by balancing communication and computation, achieving...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2511.07293] Formal Reasoning About Confidence and Automated Verification of Neural Networks

This paper presents a framework for formal reasoning about the confidence and robustness of neural networks, proposing a unified techniqu...

arXiv - AI · 3 min · about 2 months ago

NLP

[2510.22876] Batch Speculative Decoding Done Right

The paper presents a novel framework for batch speculative decoding, addressing critical failures in existing methods and achieving signi...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2506.08749] Superposed parameterised quantum circuits

The paper introduces superposed parameterised quantum circuits, enhancing quantum machine learning by embedding multiple parameter sets i...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2506.05402] Lorica: A Synergistic Fine-Tuning Framework for Advancing Personalized Adversarial Robustness

The paper presents Lorica, a novel framework aimed at enhancing personalized adversarial robustness in machine learning models, particula...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2505.21723] Are Statistical Methods Obsolete in the Era of Deep Learning? A Study of ODE Inverse Problems

This article examines the relevance of statistical methods in the age of deep learning, using ordinary differential equation (ODE) invers...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2510.02356] Measuring Physical-World Privacy Awareness of Large Language Models: An Evaluation Benchmark

This article presents EAPrivacy, a benchmark for evaluating the physical-world privacy awareness of large language models (LLMs), reveali...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2509.25275] VoiceBridge: General Speech Restoration with One-step Latent Bridge Models

VoiceBridge introduces a novel one-step latent bridge model for general speech restoration, enhancing audio quality from various distorti...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2509.23519] ReliabilityRAG: Effective and Provably Robust Defense for RAG-based Web-Search

The paper introduces ReliabilityRAG, a framework designed to enhance the robustness of Retrieval-Augmented Generation (RAG) systems again...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2403.15605] Efficiently Assemble Normalization Layers and Regularization for Federated Domain Generalization

The paper presents a novel method, gPerXAN, for Federated Domain Generalization (FedDG) that enhances model performance by effectively as...

arXiv - Machine Learning · 4 min · about 2 months ago

AI Infrastructure

[2507.19234] Virne: A Comprehensive Benchmark for RL-based Network Resource Allocation in NFV

The paper introduces Virne, a benchmarking framework designed for Reinforcement Learning-based resource allocation in Network Function Vi...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2306.14297] Inference for relative sparsity

The paper discusses a novel approach to inference for relative sparsity in healthcare decision-making, addressing the need for uncertaint...

arXiv - Machine Learning · 4 min · about 2 months ago

NLP

[2507.14186] A Disentangled Representation Learning Framework for Low-altitude Network Coverage Prediction

This paper presents a novel framework for predicting low-altitude network coverage using disentangled representation learning, addressing...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.12247] ExtractBench: A Benchmark and Evaluation Methodology for Complex Structured Extraction

ExtractBench introduces a benchmark and evaluation framework for extracting structured data from unstructured documents like PDFs, addres...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.06801] On the Non-Identifiability of Steering Vectors in Large Language Models

This paper explores the non-identifiability of steering vectors in large language models (LLMs), revealing that these vectors cannot be u...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.05319] Accelerated Sequential Flow Matching: A Bayesian Filtering Perspective

This paper introduces Accelerated Sequential Flow Matching, a Bayesian filtering framework that enhances real-time inference in stochasti...

arXiv - Machine Learning · 4 min · about 2 months ago

LLMs

[2602.04942] Privileged Information Distillation for Language Models

This paper presents methods for distilling privileged information in language models, focusing on improving performance in multi-turn env...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2506.02634] KVCache Cache in the Wild: Characterizing and Optimizing KVCache Cache at a Large Cloud Provider

This paper characterizes and optimizes KVCache, a caching mechanism for large language model (LLM) serving at a major cloud provider, hig...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.03546] How to Train Your Resistive Network: Generalized Equilibrium Propagation and Analytical Learning

This paper presents a novel algorithm for training resistive networks using Generalized Equilibrium Propagation, aiming to enhance energy...

arXiv - Machine Learning · 4 min · about 2 months ago

