AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

Machine Learning

Free 1 year Nvidia api key

NVIDIA limited-time perk: Claim a free 1-year API Key! Hermes Agent now supports integration with the NVIDIA NIM platform, with real-worl...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

Machine Learning

Compile English function descriptions into 22 MB neural programs that run locally [P]

We built a system, ProgramAsWeights (PAW), where a neural compiler takes a plain-English function description and produces a "neural prog...

Reddit - Machine Learning · 1 min · about 2 hours ago

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 4 hours ago

All Content

AI Infrastructure

[2602.12875] A Microservice-Based Platform for Sustainable and Intelligent SLO Fulfilment and Service Management

This article presents CASCA, an open-source microservice-based platform designed to enhance sustainable SLO fulfillment and service manag...

arXiv - AI · 4 min · 2 months ago

Machine Learning

[2602.12851] Chimera: Neuro-Symbolic Attention Primitives for Trustworthy Dataplane Intelligence

The paper presents Chimera, a framework that integrates neuro-symbolic attention mechanisms into programmable dataplanes, enhancing traff...

arXiv - AI · 3 min · 2 months ago

LLMs

[2602.12828] GRAIL: Geometry-Aware Retrieval-Augmented Inference with LLMs over Hyperbolic Representations of Patient Trajectories

The GRAIL framework enhances next-visit event prediction in healthcare by utilizing geometry-aware retrieval and hyperbolic representatio...

arXiv - Machine Learning · 3 min · 2 months ago

Machine Learning

[2602.12798] Can Neural Networks Provide Latent Embeddings for Telemetry-Aware Greedy Routing?

The paper explores a novel algorithm, Placer, which utilizes Message Passing Networks to create latent embeddings for telemetry-aware gre...

arXiv - Machine Learning · 3 min · 2 months ago

Computer Vision

[2602.12758] VineetVC: Adaptive Video Conferencing Under Severe Bandwidth Constraints Using Audio-Driven Talking-Head Reconstruction

The paper presents VineetVC, an adaptive video conferencing system designed to function effectively under severe bandwidth constraints by...

arXiv - AI · 4 min · 2 months ago

Machine Learning

[2602.12675] SLA2: Sparse-Linear Attention with Learnable Routing and QAT

The paper presents SLA2, an advanced Sparse-Linear Attention model that enhances video generation efficiency by introducing a learnable r...

arXiv - Machine Learning · 3 min · 2 months ago

AI Infrastructure

[2602.12649] Multi-Task Learning with Additive U-Net for Image Denoising and Classification

This article presents the Additive U-Net architecture for image denoising and classification, highlighting its advantages in multi-task l...

arXiv - Machine Learning · 3 min · 2 months ago

LLMs

[2602.12642] Beyond Normalization: Rethinking the Partition Function as a Difficulty Scheduler for RLVR

This article presents a novel approach to reinforcement learning by reinterpreting the partition function as a difficulty scheduler, enha...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.12641] Artic: AI-oriented Real-time Communication for MLLM Video Assistant

The paper presents Artic, an AI-oriented real-time communication framework designed for Multimodal Large Language Model (MLLM) video assi...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.12635] Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats

This article evaluates HiFloat formats for low-bit inference on Ascend NPUs, highlighting their efficiency and compatibility with state-o...

arXiv - Machine Learning · 3 min · 2 months ago

LLMs

[2602.12630] TensorCommitments: A Lightweight Verifiable Inference for Language Models

The paper introduces TensorCommitments, a novel proof-of-inference scheme designed to enhance the security of large language model (LLM) ...

arXiv - AI · 3 min · 2 months ago

LLMs

[2602.12609] QuEPT: Quantized Elastic Precision Transformers with One-Shot Calibration for Multi-Bit Switching

The paper presents QuEPT, a novel quantization method for Transformers that enables efficient multi-bit switching with one-shot calibrati...

arXiv - AI · 4 min · 2 months ago

Machine Learning

[2602.12592] Power Interpretable Causal ODE Networks: A Unified Model for Explainable Anomaly Detection and Root Cause Analysis in Power Systems

The paper presents Power Interpretable Causal ODE Networks (PICODE), a novel model for explainable anomaly detection and root cause analy...

arXiv - Machine Learning · 4 min · 2 months ago

LLMs

[2602.12556] SD-MoE: Spectral Decomposition for Effective Expert Specialization

The paper introduces SD-MoE, a method to enhance expert specialization in Mixture-of-Experts architectures by utilizing spectral decompos...

arXiv - Machine Learning · 3 min · 2 months ago

LLMs

[2602.12546] Decoder-only Conformer with Modality-aware Sparse Mixtures of Experts for ASR

The paper presents a decoder-only Conformer model for automatic speech recognition (ASR) that integrates speech and text processing witho...

arXiv - AI · 3 min · 2 months ago

Machine Learning

[2602.12542] Exploring Accurate and Transparent Domain Adaptation in Predictive Healthcare via Concept-Grounded Orthogonal Inference

The paper presents ExtraCare, a novel domain adaptation method for predictive healthcare that enhances accuracy and transparency by decom...

arXiv - Machine Learning · 3 min · 2 months ago

LLMs

[2602.12422] CacheMind: From Miss Rates to Why -- Natural-Language, Trace-Grounded Reasoning for Cache Replacement

CacheMind introduces a novel tool for cache replacement, leveraging natural language processing and trace-grounded reasoning to enhance C...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2602.12393] Reproducing DragDiffusion: Interactive Point-Based Editing with Diffusion Models

This article presents a reproducibility study of DragDiffusion, a method for interactive point-based image editing using diffusion models...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2602.12322] ForeAct: Steering Your VLA with Efficient Visual Foresight Planning

The paper presents ForeAct, a novel Visual Foresight Planning framework that enhances Vision-Language-Action (VLA) models by enabling the...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.12317] Free Lunch in Medical Image Foundation Model Pre-training via Randomized Synthesis and Disentanglement

The paper presents RaSD, a framework for pre-training medical image foundation models using synthetic data, demonstrating superior perfor...

arXiv - Machine Learning · 4 min · 2 months ago
