AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 3 hours ago

Machine Learning

Your prompts aren’t the problem — something else is

I keep seeing people focus heavily on prompt optimization. But in practice, a lot of failures I’ve observed don’t come from the prompt it...

Reddit - Artificial Intelligence · 1 min · about 7 hours ago

AI Infrastructure

[P] GPU friendly lossless 12-bit BF16 format with 0.03% escape rate and 1 integer ADD decode works for AMD & NVIDIA

Hi everyone :) I just released a new research prototype. It’s a lossless BF16 compression format that stores weights in 12 bits by replac...

Reddit - Machine Learning · 1 min · about 13 hours ago

All Content

LLMs

[2602.22474] When to Act, Ask, or Learn: Uncertainty-Aware Policy Steering

This article presents a framework for uncertainty-aware policy steering in robotics, enabling adaptive robot behavior by addressing task ...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.22469] Beyond Dominant Patches: Spatial Credit Redistribution For Grounded Vision-Language Models

This paper introduces Spatial Credit Redistribution (SCR) to address hallucinations in vision-language models by redistributing activatio...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22437] veScale-FSDP: Flexible and High-Performance FSDP at Scale

The paper introduces veScale-FSDP, a new system for Fully Sharded Data Parallel (FSDP) that enhances flexibility and performance for larg...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.22434] GetBatch: Distributed Multi-Object Retrieval for ML Data Loading

GetBatch introduces a new object store API that enhances batch retrieval in machine learning data loading, achieving significant performa...

arXiv - Machine Learning · 3 min · about 1 month ago

AI Infrastructure

[2602.22300] Testable Learning of General Halfspaces under Massart Noise

This paper presents a novel algorithm for testably learning general Massart halfspaces under Gaussian noise, achieving near-optimal error...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.22402] Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents

The paper presents Contextual Memory Virtualisation (CMV), a novel system for managing state in large language models (LLMs) using a Dire...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22352] GRAU: Generic Reconfigurable Activation Unit Design for Neural Network Hardware Accelerators

The paper presents GRAU, a Generic Reconfigurable Activation Unit designed for neural network hardware accelerators, which significantly ...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.22347] Enabling clinical use of foundation models in histopathology

This article discusses the application of foundation models in histopathology, highlighting a novel approach that improves robustness and...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22230] An Adaptive Multichain Blockchain: A Multiobjective Optimization Approach

This paper presents a novel adaptive multichain blockchain model that addresses scalability issues by employing a multiobjective optimiza...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.22279] Learning to reconstruct from saturated data: audio declipping and high-dynamic range imaging

This paper presents a novel approach to reconstruct audio and images from clipped measurements using self-supervised learning, addressing...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.23358] A Dataset is Worth 1 MB

The paper presents PLADA, a novel method for efficient dataset transmission in machine learning, significantly reducing payload size whil...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.23349] FlashOptim: Optimizers for Memory Efficient Training

FlashOptim introduces innovative optimizers that significantly reduce memory usage in neural network training, enhancing efficiency witho...

arXiv - AI · 4 min · about 1 month ago

AI Infrastructure

[2602.23320] ParamMem: Augmenting Language Agents with Parametric Reflective Memory

The paper introduces ParamMem, a parametric memory module designed to enhance language agents by enabling diverse reflective outputs, imp...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.22238] TT-SEAL: TTD-Aware Selective Encryption for Adversarially-Robust and Low-Latency Edge AI

The paper presents TT-SEAL, a selective encryption framework designed for Tensor-Train Decomposed (TTD) networks, enhancing security and ...

arXiv - AI · 3 min · about 1 month ago

NLP

[2602.22237] Optimized Disaster Recovery for Distributed Storage Systems: Lightweight Metadata Architectures to Overcome Cryptographic Hashing Bottleneck

This paper presents a novel approach to disaster recovery in distributed storage systems, addressing the limitations of cryptographic has...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.23200] InnerQ: Hardware-aware Tuning-free Quantization of KV Cache for Large Language Models

InnerQ presents a novel hardware-aware quantization method for key-value caches in large language models, enhancing decoding efficiency w...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.22231] FM-RME: Foundation Model Empowered Radio Map Estimation

The paper presents FM-RME, a foundation model for radio map estimation that integrates self-supervised learning and physical propagation ...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.22225] SmartChunk Retrieval: Query-Aware Chunk Compression with Planning for Efficient Document RAG

The paper presents SmartChunk Retrieval, a query-aware framework that enhances retrieval-augmented generation (RAG) by adapting chunk siz...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.22224] DS SERVE: A Framework for Efficient and Scalable Neural Retrieval

DS SERVE is a framework designed to enhance neural retrieval systems by efficiently processing large-scale text datasets, achieving low l...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.23146] Partial recovery of meter-scale surface weather

The paper discusses a method for recovering meter-scale surface weather data by integrating sparse surface measurements with high-resolut...

arXiv - Machine Learning · 4 min · about 1 month ago

