AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Machine Learning

Your prompts aren’t the problem — something else is

I keep seeing people focus heavily on prompt optimization. But in practice, a lot of failures I’ve observed don’t come from the prompt it...

Reddit - Artificial Intelligence · 1 min ·
AI Infrastructure

[P] GPU friendly lossless 12-bit BF16 format with 0.03% escape rate and 1 integer ADD decode works for AMD & NVIDIA

Hi everyone :) I just released a new research prototype. It’s a lossless BF16 compression format that stores weights in 12 bits by replac...

Reddit - Machine Learning · 1 min ·

All Content

[2602.22610] DP-aware AdaLN-Zero: Taming Conditioning-Induced Heavy-Tailed Gradients in Differentially Private Diffusion
Machine Learning

The paper introduces DP-aware AdaLN-Zero, a novel mechanism to mitigate heavy-tailed gradients in differentially private diffusion models...

arXiv - Machine Learning · 4 min ·
[2602.22592] pQuant: Towards Effective Low-Bit Language Models via Decoupled Linear Quantization-Aware Training
LLMs

The paper presents pQuant, a novel approach for low-bit language models that utilizes decoupled linear quantization-aware training to enh...

arXiv - Machine Learning · 3 min ·
[2602.22718] RLHFless: Serverless Computing for Efficient RLHF
LLMs

The paper introduces RLHFless, a serverless computing framework designed to enhance the efficiency of Reinforcement Learning from Human F...

arXiv - AI · 4 min ·
[2602.22575] S2O: Early Stopping for Sparse Attention via Online Permutation
Machine Learning

The paper presents S2O, a novel approach for early stopping in sparse attention mechanisms, enhancing efficiency in long-context inferenc...

arXiv - AI · 4 min ·
[2602.22702] Knob: A Physics-Inspired Gating Interface for Interpretable and Controllable Neural Dynamics
Machine Learning

The paper introduces 'Knob', a physics-inspired framework that enhances neural network calibration by allowing dynamic adjustments to mod...

arXiv - AI · 4 min ·
[2602.22560] Operationalizing Fairness: Post-Hoc Threshold Optimization Under Hard Resource Limits
Machine Learning

This paper presents a framework for optimizing decision thresholds in machine learning to balance fairness and resource constraints, ensu...

arXiv - AI · 4 min ·
[2602.22538] RAIN-Merging: A Gradient-Free Method to Enhance Instruction Following in Large Reasoning Models with Preserved Thinking Format
Machine Learning

The paper presents RAIN-Merging, a gradient-free method designed to enhance instruction adherence in large reasoning models while preserv...

arXiv - Machine Learning · 4 min ·
[2602.22583] Strategy Executability in Mathematical Reasoning: Leveraging Human-Model Differences for Effective Guidance
Machine Learning

This paper explores the concept of strategy executability in mathematical reasoning, highlighting the differences between human and model...

arXiv - AI · 4 min ·
[2602.22539] Agentic AI for Intent-driven Optimization in Cell-free O-RAN
LLMs

This paper presents an agentic AI framework for optimizing intent-driven operations in cell-free O-RAN, enhancing collaboration among age...

arXiv - AI · 4 min ·
[2602.22495] Reinforcement-aware Knowledge Distillation for LLM Reasoning
LLMs

The paper presents Reinforcement-aware Knowledge Distillation (RLAD) for enhancing reasoning in large language models (LLMs) by addressin...

arXiv - AI · 4 min ·
[2602.22428] Calibrated Test-Time Guidance for Bayesian Inference
Machine Learning

This paper introduces a method for calibrated test-time guidance in Bayesian inference, addressing issues with existing approaches that m...

arXiv - AI · 3 min ·
[2602.22425] ArchAgent: Agentic AI-driven Computer Architecture Discovery
Generative AI

ArchAgent is an AI-driven system that automates computer architecture discovery, achieving significant performance improvements in cache ...

arXiv - AI · 4 min ·
[2602.22408] Exploring Human Behavior During Abstract Rule Inference and Problem Solving with the Cognitive Abstraction and Reasoning Corpus
Machine Learning

This article presents the Cognitive Abstraction and Reasoning Corpus (CogARC), a study exploring human abstract reasoning through problem...

arXiv - AI · 4 min ·
[2602.22302] Agent Behavioral Contracts: Formal Specification and Runtime Enforcement for Reliable Autonomous AI Agents
NLP

The paper presents Agent Behavioral Contracts (ABC), a framework for specifying and enforcing the behavior of autonomous AI agents, addre...

arXiv - AI · 4 min ·
[2602.22367] Learning geometry-dependent lead-field operators for forward ECG modeling
Machine Learning

This article presents a novel approach to forward electrocardiogram (ECG) modeling using geometry-dependent lead-field operators, enhanci...

arXiv - AI · 4 min ·
[2602.22294] When Should a Model Change Its Mind? An Energy-Based Theory and Regularizer for Concept Drift in Electrocardiogram (ECG) Signals
Machine Learning

This paper presents an energy-based framework for managing concept drift in ECG signals, proposing a new regularizer that enhances model ...

arXiv - Machine Learning · 4 min ·
[2602.22286] OmniZip: Learning a Unified and Lightweight Lossless Compressor for Multi-Modal Data
LLMs

OmniZip introduces a unified and lightweight lossless compressor designed for multi-modal data, enhancing compression efficiency across v...

arXiv - Machine Learning · 4 min ·
[2602.22277] X-REFINE: XAI-based RElevance input-Filtering and archItecture fiNe-tuning for channel Estimation
Machine Learning

The paper presents X-REFINE, an XAI-based framework for optimizing channel estimation in 6G wireless communications by combining input fi...

arXiv - Machine Learning · 3 min ·
[2602.22269] CQSA: Byzantine-robust Clustered Quantum Secure Aggregation in Federated Learning
Machine Learning

The paper presents Clustered Quantum Secure Aggregation (CQSA), a novel framework for Byzantine-robust secure aggregation in federated le...

arXiv - Machine Learning · 4 min ·
[2602.22268] AutoQRA: Joint Optimization of Mixed-Precision Quantization and Low-rank Adapters for Efficient LLM Fine-Tuning
LLMs

The paper presents AutoQRA, a framework that optimizes mixed-precision quantization and low-rank adapters for efficient fine-tuning of la...

arXiv - Machine Learning · 4 min ·