AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
LLMs

Seeking Critique on Research Approach to Open Set Recognition (Novelty Detection) [R]

Hey guys, I'm an independent researcher working on a project that tries to address a very specific failure mode in LLMs and embedding bas...

Reddit - Machine Learning · 1 min ·
Machine Learning

What if attention didn’t need matrix multiplication?

I built a cognitive architecture where all computation reduces to three bit operations: XOR, MAJ, POPCNT. No GEMM. No GPU. No floating-po...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2602.14743] LLMStructBench: Benchmarking Large Language Model Structured Data Extraction
LLMs

LLMStructBench introduces a benchmark for evaluating large language models on structured data extraction, emphasizing the impact of promp...

arXiv - Machine Learning · 3 min ·
[2602.14464] CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer
Machine Learning

The paper presents CoCoDiff, a novel framework for fine-grained style transfer in images, emphasizing semantic correspondence and achievi...

arXiv - AI · 3 min ·
[2602.14381] Adapting VACE for Real-Time Autoregressive Video Diffusion
Generative AI

This article presents an adaptation of VACE for real-time autoregressive video generation, enhancing video control while addressing laten...

arXiv - AI · 3 min ·
[2602.14374] Differentially Private Retrieval-Augmented Generation
LLMs

The paper presents DP-KSA, a novel algorithm that integrates differential privacy into retrieval-augmented generation (RAG) systems, addr...

arXiv - AI · 4 min ·
[2602.14471] Socially-Weighted Alignment: A Game-Theoretic Framework for Multi-Agent LLM Systems
LLMs

The paper presents a game-theoretic framework called Socially-Weighted Alignment (SWA) for managing multi-agent large language model (LLM...

arXiv - AI · 3 min ·
[2602.14397] LRD-MPC: Efficient MPC Inference through Low-rank Decomposition
Machine Learning

The paper presents LRD-MPC, a method that enhances the efficiency of secure multi-party computation (MPC) in machine learning by utilizin...

arXiv - Machine Learning · 4 min ·
[2602.14302] Floe: Federated Specialization for Real-Time LLM-SLM Inference
LLMs

The paper presents Floe, a federated learning framework that enhances real-time inference of large language models (LLMs) while addressin...

arXiv - Machine Learning · 3 min ·
[2602.14283] MILD: Multi-Intent Learning and Disambiguation for Proactive Failure Prediction in Intent-based Networking
Machine Learning

The paper presents MILD, a proactive framework for failure prediction in intent-based networking, enhancing root-cause intent disambiguat...

arXiv - Machine Learning · 3 min ·
[2602.14280] Fast Compute for ML Optimization
Machine Learning

The paper presents the Scale Mixture EM (SM-EM) algorithm for optimizing machine learning losses, demonstrating significant performance i...

arXiv - Machine Learning · 3 min ·
[2602.14250] Energy-Efficient Over-the-Air Federated Learning via Pinching Antenna Systems
Machine Learning

This article explores the use of Pinching Antenna Systems (PASSs) to enhance energy efficiency in over-the-air federated learning, presen...

arXiv - Machine Learning · 3 min ·
[2602.14265] STATe-of-Thoughts: Structured Action Templates for Tree-of-Thoughts
Machine Learning

The paper presents STATe-of-Thoughts, a new method for improving output diversity and interpretability in inference-time compute methods,...

arXiv - Machine Learning · 4 min ·
[2602.14244] Federated Ensemble Learning with Progressive Model Personalization
Machine Learning

This paper presents a novel framework for Federated Ensemble Learning that enhances model personalization while addressing statistical he...

arXiv - Machine Learning · 4 min ·
[2602.14236] Dual-Signal Adaptive KV-Cache Optimization for Long-Form Video Understanding in Vision-Language Models
LLMs

The paper presents Sali-Cache, a novel optimization framework for Vision-Language Models (VLMs) that addresses memory bottlenecks in long...

arXiv - AI · 3 min ·
[2602.14077] GTS: Inference-Time Scaling of Latent Reasoning with a Learnable Gaussian Thought Sampler
Machine Learning

The paper introduces the Gaussian Thought Sampler (GTS), a novel approach to inference-time scaling in latent reasoning models, enhancing...

arXiv - Machine Learning · 3 min ·
[2602.14039] Geometry-Preserving Aggregation for Mixture-of-Experts Embedding Models
Machine Learning

The paper presents Spherical Barycentric Aggregation (SBA), a new method for aggregating outputs in Mixture-of-Experts (MoE) embedding mo...

arXiv - Machine Learning · 3 min ·
[2602.14117] Toward Autonomous O-RAN: A Multi-Scale Agentic AI Framework for Real-Time Network Control and Management
Robotics

This article presents a multi-scale agentic AI framework for Open Radio Access Networks (O-RAN), enhancing real-time network control and ...

arXiv - AI · 4 min ·
[2602.14106] Anticipating Adversary Behavior in DevSecOps Scenarios through Large Language Models
LLMs

This paper explores the integration of Large Language Models (LLMs) in anticipating adversary behavior within DevSecOps environments, pro...

arXiv - AI · 4 min ·
[2602.14089] TabTracer: Monte Carlo Tree Search for Complex Table Reasoning with Large Language Models
LLMs

TabTracer introduces a novel Monte Carlo Tree Search framework for enhancing table reasoning in large language models, improving accuracy...

arXiv - AI · 4 min ·
[2602.13871] Ensemble-Conditional Gaussian Processes (Ens-CGP): Representation, Geometry, and Inference
Machine Learning

The paper presents Ensemble-Conditional Gaussian Processes (Ens-CGP), linking ensemble inference with conditional Gaussian laws, enhancin...

arXiv - Machine Learning · 4 min ·
[2602.14010] A Deployment-Friendly Foundational Framework for Efficient Computational Pathology
LLMs

This paper presents LitePath, a foundational framework for computational pathology that significantly reduces computational costs while m...

arXiv - AI · 4 min ·

