AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 7 hours ago

Machine Learning

Your prompts aren’t the problem — something else is

I keep seeing people focus heavily on prompt optimization. But in practice, a lot of failures I’ve observed don’t come from the prompt it...

Reddit - Artificial Intelligence · 1 min · about 11 hours ago

AI Infrastructure

[P] GPU friendly lossless 12-bit BF16 format with 0.03% escape rate and 1 integer ADD decode works for AMD & NVIDIA

Hi everyone :) I just released a new research prototype. It’s a lossless BF16 compression format that stores weights in 12 bits by replac...

Reddit - Machine Learning · 1 min · about 16 hours ago

All Content

Machine Learning

[2602.22610] DP-aware AdaLN-Zero: Taming Conditioning-Induced Heavy-Tailed Gradients in Differentially Private Diffusion

The paper introduces DP-aware AdaLN-Zero, a novel mechanism to mitigate heavy-tailed gradients in differentially private diffusion models...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.22592] pQuant: Towards Effective Low-Bit Language Models via Decoupled Linear Quantization-Aware Training

The paper presents pQuant, a novel approach for low-bit language models that utilizes decoupled linear quantization-aware training to enh...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.22718] RLHFless: Serverless Computing for Efficient RLHF

The paper introduces RLHFless, a serverless computing framework designed to enhance the efficiency of Reinforcement Learning from Human F...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22575] S2O: Early Stopping for Sparse Attention via Online Permutation

The paper presents S2O, a novel approach for early stopping in sparse attention mechanisms, enhancing efficiency in long-context inferenc...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22702] Knob: A Physics-Inspired Gating Interface for Interpretable and Controllable Neural Dynamics

The paper introduces 'Knob', a physics-inspired framework that enhances neural network calibration by allowing dynamic adjustments to mod...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22560] Operationalizing Fairness: Post-Hoc Threshold Optimization Under Hard Resource Limits

This paper presents a framework for optimizing decision thresholds in machine learning to balance fairness and resource constraints, ensu...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22538] RAIN-Merging: A Gradient-Free Method to Enhance Instruction Following in Large Reasoning Models with Preserved Thinking Format

The paper presents RAIN-Merging, a gradient-free method designed to enhance instruction adherence in large reasoning models while preserv...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.22583] Strategy Executability in Mathematical Reasoning: Leveraging Human-Model Differences for Effective Guidance

This paper explores the concept of strategy executability in mathematical reasoning, highlighting the differences between human and model...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22539] Agentic AI for Intent-driven Optimization in Cell-free O-RAN

This paper presents an agentic AI framework for optimizing intent-driven operations in cell-free O-RAN, enhancing collaboration among age...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22495] Reinforcement-aware Knowledge Distillation for LLM Reasoning

The paper presents Reinforcement-aware Knowledge Distillation (RLAD) for enhancing reasoning in large language models (LLMs) by addressin...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22428] Calibrated Test-Time Guidance for Bayesian Inference

This paper introduces a method for calibrated test-time guidance in Bayesian inference, addressing issues with existing approaches that m...

arXiv - AI · 3 min · about 1 month ago

Generative AI

[2602.22425] ArchAgent: Agentic AI-driven Computer Architecture Discovery

ArchAgent is an AI-driven system that automates computer architecture discovery, achieving significant performance improvements in cache ...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22408] Exploring Human Behavior During Abstract Rule Inference and Problem Solving with the Cognitive Abstraction and Reasoning Corpus

This article presents the Cognitive Abstraction and Reasoning Corpus (CogARC), a study exploring human abstract reasoning through problem...

arXiv - AI · 4 min · about 1 month ago

NLP

[2602.22302] Agent Behavioral Contracts: Formal Specification and Runtime Enforcement for Reliable Autonomous AI Agents

The paper presents Agent Behavioral Contracts (ABC), a framework for specifying and enforcing the behavior of autonomous AI agents, addre...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22367] Learning geometry-dependent lead-field operators for forward ECG modeling

This article presents a novel approach to forward electrocardiogram (ECG) modeling using geometry-dependent lead-field operators, enhanci...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22294] When Should a Model Change Its Mind? An Energy-Based Theory and Regularizer for Concept Drift in Electrocardiogram (ECG) Signals

This paper presents an energy-based framework for managing concept drift in ECG signals, proposing a new regularizer that enhances model ...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.22286] OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data

OmniZip introduces a unified and lightweight lossless compressor designed for multi-modal data, enhancing compression efficiency across v...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.22277] X-REFINE: XAI-based RElevance input-Filtering and archItecture fiNe-tuning for channel Estimation

The paper presents X-REFINE, an XAI-based framework for optimizing channel estimation in 6G wireless communications by combining input fi...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.22269] CQSA: Byzantine-robust Clustered Quantum Secure Aggregation in Federated Learning

The paper presents Clustered Quantum Secure Aggregation (CQSA), a novel framework for Byzantine-robust secure aggregation in federated le...

arXiv - Machine Learning · 4 min · about 1 month ago

LLMs

[2602.22268] AutoQRA: Joint Optimization of Mixed-Precision Quantization and Low-rank Adapters for Efficient LLM Fine-Tuning

The paper presents AutoQRA, a framework that optimizes mixed-precision quantization and low-rank adapters for efficient fine-tuning of la...

arXiv - Machine Learning · 4 min · about 1 month ago
