AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

Machine Learning

Is "live AI video generation" a meaningful technical category or just a marketing term? [R]

Asking from a technical standpoint because I feel like the term is doing a lot of work in coverage of this space right now. Genuine real-...

Reddit - Machine Learning · 1 min
AI Infrastructure

FlashAttention (FA1–FA4) in PyTorch - educational implementations focused on algorithmic differences [P]

I recently updated my FlashAttention-PyTorch repo so it now includes educational implementations of FA1, FA2, FA3, and FA4 in plain PyTor...

Reddit - Machine Learning · 1 min
AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min

All Content

Machine Learning

[2410.11855] Online GPU Energy Optimization with Switching-Aware Bandits

This paper presents EnergyUCB, a novel online GPU energy optimization method using a multi-armed bandit approach to balance performance a...

arXiv - Machine Learning · 4 min
Machine Learning

[2602.15564] Beyond Static Pipelines: Learning Dynamic Workflows for Text-to-SQL

The paper presents a novel approach to Text-to-SQL systems by introducing dynamic workflows that adapt during inference, enhancing perfor...

arXiv - AI · 3 min
Machine Learning

[2405.20178] Non-intrusive data-driven model order reduction for circuits based on Hammerstein architectures

This paper presents a non-intrusive data-driven model order reduction method for circuits using Hammerstein architectures, demonstrating ...

arXiv - Machine Learning · 4 min
LLMs

[2602.15549] VLM-DEWM: Dynamic External World Model for Verifiable and Resilient Vision-Language Planning in Manufacturing

The paper introduces VLM-DEWM, a novel cognitive architecture designed to enhance vision-language planning in manufacturing by addressing...

arXiv - AI · 4 min
Machine Learning

[2602.08032] Horizon Imagination: Efficient On-Policy Rollout in Diffusion World Models

The paper presents Horizon Imagination (HI), an innovative on-policy imagination process for reinforcement learning using diffusion-based...

arXiv - Machine Learning · 3 min
NLP

[2602.15491] The Equalizer: Introducing Shape-Gain Decomposition in Neural Audio Codecs

The paper presents Shape-Gain Decomposition for Neural Audio Codecs, enhancing bitrate-distortion performance and reducing complexity by ...

arXiv - AI · 4 min
Machine Learning

[2602.00240] Green-NAS: A Global-Scale Multi-Objective Neural Architecture Search for Robust and Efficient Edge-Native Weather Forecasting

Green-NAS presents a multi-objective neural architecture search framework aimed at optimizing weather forecasting models for low-resource...

arXiv - Machine Learning · 4 min
AI Infrastructure

[2602.15377] Orchestration-Free Customer Service Automation: A Privacy-Preserving and Flowchart-Guided Framework

This paper presents an orchestration-free framework for customer service automation, utilizing Task-Oriented Flowcharts (TOFs) to enhance...

arXiv - AI · 3 min
Machine Learning

[2601.01016] Improving Variational Autoencoder using Random Fourier Transformation: An Aviation Safety Anomaly Detection Case-Study

This study explores enhancements to Variational Autoencoders (VAEs) using Random Fourier Transformation (RFT) for anomaly detection in av...

arXiv - Machine Learning · 4 min
Machine Learning

[2512.04189] BEP: A Binary Error Propagation Algorithm for Binary Neural Networks Training

The paper presents BEP, a novel Binary Error Propagation algorithm for training Binary Neural Networks (BNNs) that enables efficient back...

arXiv - AI · 4 min
Machine Learning

[2512.01389] Syndrome-Flow Consistency Model Achieves One-step Denoising Error Correction Codes

The paper presents the Error Correction Syndrome-Flow Consistency Model (ECCFM), which enhances one-step denoising error correction codes...

arXiv - AI · 4 min
LLMs

[2602.15353] NeuroSymActive: Differentiable Neural-Symbolic Reasoning with Active Exploration for Knowledge Graph Question Answering

The paper presents NeuroSymActive, a novel framework for Knowledge Graph Question Answering that integrates differentiable neural-symboli...

arXiv - AI · 3 min
LLMs

[2602.15318] Sparrow: Text-Anchored Window Attention with Visual-Semantic Glimpsing for Speculative Decoding in Video LLMs

The paper introduces Sparrow, a novel framework designed to enhance speculative decoding in Video Large Language Models (Vid-LLMs) by opt...

arXiv - AI · 4 min
Machine Learning

[2508.11460] Calibrated and uncertain? Evaluating uncertainty estimates in binary classification models

This article evaluates uncertainty estimates in binary classification models, comparing six probabilistic machine learning algorithms to ...

arXiv - Machine Learning · 4 min
Machine Learning

[2602.15286] AI-Paging: Lease-Based Execution Anchoring for Network-Exposed AI-as-a-Service

The paper presents AI-Paging, a framework for optimizing AI-as-a-Service by enabling network providers to manage model selection and exec...

arXiv - AI · 4 min
Machine Learning

[2602.15281] High-Fidelity Network Management for Federated AI-as-a-Service: Cross-Domain Orchestration

This paper presents a framework for high-fidelity network management in Federated AI-as-a-Service, focusing on cross-domain orchestration...

arXiv - AI · 4 min
LLMs

[2505.11824] Latent Veracity Inference for Identifying Errors in Stepwise Reasoning

This paper presents a novel method for identifying errors in stepwise reasoning using latent veracity inference, enhancing the reliabilit...

arXiv - AI · 4 min
Machine Learning

[2505.11695] Qronos: Correcting the Past by Shaping the Future... in Post-Training Quantization

The paper introduces Qronos, a novel post-training quantization algorithm that enhances neural network performance by correcting quantiza...

arXiv - AI · 4 min
AI Infrastructure

[2602.15249] Artificial Intelligence Specialization in the European Union: Underexplored Role of the Periphery at NUTS-3 Level

This study analyzes AI research production across European regions at the NUTS-3 level, highlighting the specialization of peripheral reg...

arXiv - AI · 4 min
Machine Learning

[2602.15241] GenAI for Systems: Recurring Challenges and Design Principles from Software to Silicon

This paper explores the integration of Generative AI in computing systems, identifying recurring challenges and design principles across ...

arXiv - AI · 4 min
