AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

Machine Learning

mining hardware doing AI training - is the output actually useful

there's this network that launched recently routing crypto mining hardware toward AI training workloads. miners seem happy with the econo...

Reddit - Artificial Intelligence · 1 min ·
[2604.01989] Attention at Rest Stays at Rest: Breaking Visual Inertia for Cognitive Hallucination Mitigation
LLMs

arXiv - AI · 4 min ·
[2512.18809] FedVideoMAE: Efficient Privacy-Preserving Federated Video Moderation
Machine Learning

arXiv - AI · 4 min ·

All Content

[2602.05495] Transport and Merge: Cross-Architecture Merging for Large Language Models
LLMs

This paper presents a novel framework for cross-architecture merging of large language models (LLMs), enabling knowledge transfer from hi...

arXiv - AI · 3 min ·
[2601.04205] STaRR: Spatial-Temporal Token-Dynamics-Aware Responsive Remasking for Diffusion Language Models
LLMs

The paper presents STaRR, a novel framework for responsive remasking in diffusion language models that adapts remasking decisions based o...

arXiv - AI · 3 min ·
[2512.02700] VLM-Pruner: Buffering for Spatial Sparsity in an Efficient VLM Centrifugal Token Pruning Paradigm
LLMs

The paper presents VLM-Pruner, a novel token pruning algorithm designed to enhance the efficiency of vision-language models (VLMs) by bal...

arXiv - Machine Learning · 4 min ·
[2511.07399] StreamDiffusionV2: A Streaming System for Dynamic and Interactive Video Generation
Machine Learning

StreamDiffusionV2 presents a novel system for dynamic and interactive video generation, enhancing live streaming capabilities through opt...

arXiv - Machine Learning · 4 min ·
[2510.22221] HPC-Driven Modeling with ML-Based Surrogates for Magnon-Photon Dynamics in Hybrid Quantum Systems
Machine Learning

This article presents a GPU-based simulation framework for modeling magnon-photon dynamics in hybrid quantum systems, utilizing machine l...

arXiv - Machine Learning · 3 min ·
[2510.06820] Efficient Discriminative Joint Encoders for Large Scale Vision-Language Reranking
Machine Learning

The paper presents EDJE, an Efficient Discriminative Joint Encoder designed to enhance vision-language reranking by precomputing visual t...

arXiv - Machine Learning · 3 min ·
[2509.16650] Safe and Near-Optimal Control with Online Dynamics Learning
AI Infrastructure

This article presents a novel approach to safe and near-optimal control in dynamic environments, utilizing online dynamics learning to en...

arXiv - Machine Learning · 4 min ·
[2508.19073] CARMA: Collocation-Aware Resource Manager
Machine Learning

CARMA is a collocation-aware resource manager designed to optimize GPU utilization for deep learning workloads while mitigating risks of ...

arXiv - Machine Learning · 4 min ·
[2510.09312] Verifying Chain-of-Thought Reasoning via Its Computational Graph
Machine Learning

The paper presents a novel method for verifying Chain-of-Thought (CoT) reasoning in AI models using Circuit-based Reasoning Verification ...

arXiv - Machine Learning · 4 min ·
[2506.10572] Probability Bounding: Post-Hoc Calibration via Box-Constrained Softmax
Machine Learning

The paper introduces Probability Bounding (PB), a novel post-hoc calibration method that uses Box-Constrained Softmax to improve the cali...

arXiv - Machine Learning · 3 min ·
[2506.01928] Esoteric Language Models: Bridging Autoregressive and Masked Diffusion LLMs
LLMs

The paper introduces Eso-LMs, a novel language model that integrates autoregressive and masked diffusion paradigms, enhancing inference e...

arXiv - Machine Learning · 4 min ·
[2505.17779] U2-BENCH: Benchmarking Large Vision-Language Models on Ultrasound Understanding
LLMs

The paper introduces U2-BENCH, a benchmark for evaluating large vision-language models (LVLMs) on ultrasound understanding, addressing ch...

arXiv - Machine Learning · 4 min ·
[2501.06336] MEt3R: Measuring Multi-View Consistency in Generated Images
Machine Learning

The paper presents MEt3R, a novel metric for assessing multi-view consistency in generated images, addressing limitations of traditional ...

arXiv - Machine Learning · 4 min ·
[2501.00339] GRASP: Replace Redundant Layers with Adaptive Singular Parameters for Efficient Model Compression
LLMs

The paper introduces GRASP, a novel framework for model compression that replaces redundant layers in large language models with adaptive...

arXiv - Machine Learning · 4 min ·
[2509.10544] ASL360: AI-Enabled Adaptive Streaming of Layered 360° Video over UAV-assisted Wireless Networks
Machine Learning

The paper presents ASL360, an AI-based system for adaptive streaming of layered 360° video over UAV-assisted wireless networks, enhancing...

arXiv - AI · 4 min ·
[2509.06326] AttestLLM: Efficient Attestation Framework for Billion-scale On-device LLMs
LLMs

The paper presents AttestLLM, a novel framework for efficiently attesting billion-scale on-device LLMs, ensuring model legitimacy and pro...

arXiv - AI · 3 min ·
[2410.02099] A Watermark for Black-Box Language Models
LLMs

The paper presents a novel watermarking scheme for black-box language models, enabling detection of model outputs without requiring white...

arXiv - Machine Learning · 3 min ·
[2508.07087] SQL-Exchange: Transforming SQL Queries Across Domains
AI Infrastructure

SQL-Exchange introduces a framework for transforming SQL queries across different database schemas while maintaining structural integrity...

arXiv - AI · 3 min ·
[2508.00017] Generative Logic: A New Computer Architecture for Deterministic Reasoning and Knowledge Generation
Machine Learning

The paper introduces Generative Logic (GL), a new computer architecture designed for deterministic reasoning and knowledge generation, ut...

arXiv - AI · 4 min ·
[2507.23465] Role-Aware Language Models for Secure and Contextualized Access Control in Organizations
LLMs

This article explores the development of role-aware language models designed to enhance access control in organizational settings, focusi...

arXiv - AI · 3 min ·