AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min
Machine Learning

Your prompts aren’t the problem — something else is

I keep seeing people focus heavily on prompt optimization. But in practice, a lot of failures I’ve observed don’t come from the prompt it...

Reddit - Artificial Intelligence · 1 min
AI Infrastructure

[P] GPU friendly lossless 12-bit BF16 format with 0.03% escape rate and 1 integer ADD decode works for AMD & NVIDIA

Hi everyone :) I just released a new research prototype. It’s a lossless BF16 compression format that stores weights in 12 bits by replac...

Reddit - Machine Learning · 1 min

All Content

LLMs

[2602.22474] When to Act, Ask, or Learn: Uncertainty-Aware Policy Steering

This article presents a framework for uncertainty-aware policy steering in robotics, enabling adaptive robot behavior by addressing task ...

arXiv - Machine Learning · 4 min
LLMs

[2602.22469] Beyond Dominant Patches: Spatial Credit Redistribution For Grounded Vision-Language Models

This paper introduces Spatial Credit Redistribution (SCR) to address hallucinations in vision-language models by redistributing activatio...

arXiv - AI · 4 min
LLMs

[2602.22437] veScale-FSDP: Flexible and High-Performance FSDP at Scale

The paper introduces veScale-FSDP, a new system for Fully Sharded Data Parallel (FSDP) that enhances flexibility and performance for larg...

arXiv - Machine Learning · 3 min
Machine Learning

[2602.22434] GetBatch: Distributed Multi-Object Retrieval for ML Data Loading

GetBatch introduces a new object store API that enhances batch retrieval in machine learning data loading, achieving significant performa...

arXiv - Machine Learning · 3 min
AI Infrastructure

[2602.22300] Testable Learning of General Halfspaces under Massart Noise

This paper presents a novel algorithm for testably learning general halfspaces with Massart noise under Gaussian marginals, achieving near-optimal error...

arXiv - Machine Learning · 3 min
LLMs

[2602.22402] Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents

The paper presents Contextual Memory Virtualisation (CMV), a novel system for managing state in large language models (LLMs) using a Dire...

arXiv - AI · 4 min
Machine Learning

[2602.22352] GRAU: Generic Reconfigurable Activation Unit Design for Neural Network Hardware Accelerators

The paper presents GRAU, a Generic Reconfigurable Activation Unit designed for neural network hardware accelerators, which significantly ...

arXiv - AI · 3 min
LLMs

[2602.22347] Enabling clinical use of foundation models in histopathology

This article discusses the application of foundation models in histopathology, highlighting a novel approach that improves robustness and...

arXiv - AI · 4 min
Machine Learning

[2602.22230] An Adaptive Multichain Blockchain: A Multiobjective Optimization Approach

This paper presents a novel adaptive multichain blockchain model that addresses scalability issues by employing a multiobjective optimiza...

arXiv - Machine Learning · 3 min
Machine Learning

[2602.22279] Learning to reconstruct from saturated data: audio declipping and high-dynamic range imaging

This paper presents a novel approach to reconstruct audio and images from clipped measurements using self-supervised learning, addressing...

arXiv - AI · 3 min
Machine Learning

[2602.23358] A Dataset is Worth 1 MB

The paper presents PLADA, a novel method for efficient dataset transmission in machine learning, significantly reducing payload size whil...

arXiv - Machine Learning · 4 min
Machine Learning

[2602.23349] FlashOptim: Optimizers for Memory Efficient Training

FlashOptim introduces innovative optimizers that significantly reduce memory usage in neural network training, enhancing efficiency witho...

arXiv - AI · 4 min
AI Infrastructure

[2602.23320] ParamMem: Augmenting Language Agents with Parametric Reflective Memory

The paper introduces ParamMem, a parametric memory module designed to enhance language agents by enabling diverse reflective outputs, imp...

arXiv - Machine Learning · 3 min
Machine Learning

[2602.22238] TT-SEAL: TTD-Aware Selective Encryption for Adversarially-Robust and Low-Latency Edge AI

The paper presents TT-SEAL, a selective encryption framework designed for Tensor-Train Decomposed (TTD) networks, enhancing security and ...

arXiv - AI · 3 min
NLP

[2602.22237] Optimized Disaster Recovery for Distributed Storage Systems: Lightweight Metadata Architectures to Overcome Cryptographic Hashing Bottleneck

This paper presents a novel approach to disaster recovery in distributed storage systems, addressing the limitations of cryptographic has...

arXiv - AI · 3 min
LLMs

[2602.23200] InnerQ: Hardware-aware Tuning-free Quantization of KV Cache for Large Language Models

InnerQ presents a novel hardware-aware quantization method for key-value caches in large language models, enhancing decoding efficiency w...

arXiv - Machine Learning · 4 min
LLMs

[2602.22231] FM-RME: Foundation Model Empowered Radio Map Estimation

The paper presents FM-RME, a foundation model for radio map estimation that integrates self-supervised learning and physical propagation ...

arXiv - Machine Learning · 3 min
LLMs

[2602.22225] SmartChunk Retrieval: Query-Aware Chunk Compression with Planning for Efficient Document RAG

The paper presents SmartChunk Retrieval, a query-aware framework that enhances retrieval-augmented generation (RAG) by adapting chunk siz...

arXiv - Machine Learning · 4 min
Machine Learning

[2602.22224] DS SERVE: A Framework for Efficient and Scalable Neural Retrieval

DS SERVE is a framework designed to enhance neural retrieval systems by efficiently processing large-scale text datasets, achieving low l...

arXiv - AI · 3 min
Machine Learning

[2602.23146] Partial recovery of meter-scale surface weather

The paper discusses a method for recovering meter-scale surface weather data by integrating sparse surface measurements with high-resolut...

arXiv - Machine Learning · 4 min