AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
LLMs

Seeking Critique on Research Approach to Open Set Recognition (Novelty Detection) [R]

Hey guys, I'm an independent researcher working on a project that tries to address a very specific failure mode in LLMs and embedding bas...

Reddit - Machine Learning · 1 min ·
Machine Learning

What if attention didn’t need matrix multiplication?

I built a cognitive architecture where all computation reduces to three bit operations: XOR, MAJ, POPCNT. No GEMM. No GPU. No floating-po...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2602.14743] LLMStructBench: Benchmarking Large Language Model Structured Data Extraction
LLMs

LLMStructBench introduces a benchmark for evaluating large language models on structured data extraction, emphasizing the impact of promp...

arXiv - Machine Learning · 3 min ·
[2602.14464] CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer
Machine Learning

The paper presents CoCoDiff, a novel framework for fine-grained style transfer in images, emphasizing semantic correspondence and achievi...

arXiv - AI · 3 min ·
[2602.14381] Adapting VACE for Real-Time Autoregressive Video Diffusion
Generative AI

This article presents an adaptation of VACE for real-time autoregressive video generation, enhancing video control while addressing laten...

arXiv - AI · 3 min ·
[2602.14374] Differentially Private Retrieval-Augmented Generation
LLMs

The paper presents DP-KSA, a novel algorithm that integrates differential privacy into retrieval-augmented generation (RAG) systems, addr...

arXiv - AI · 4 min ·
[2602.14471] Socially-Weighted Alignment: A Game-Theoretic Framework for Multi-Agent LLM Systems
LLMs

The paper presents a game-theoretic framework called Socially-Weighted Alignment (SWA) for managing multi-agent large language model (LLM...

arXiv - AI · 3 min ·
[2602.14397] LRD-MPC: Efficient MPC Inference through Low-rank Decomposition
Machine Learning

The paper presents LRD-MPC, a method that enhances the efficiency of secure multi-party computation (MPC) in machine learning by utilizin...

arXiv - Machine Learning · 4 min ·
[2602.14302] Floe: Federated Specialization for Real-Time LLM-SLM Inference
LLMs

The paper presents Floe, a federated learning framework that enhances real-time inference of large language models (LLMs) while addressin...

arXiv - Machine Learning · 3 min ·
[2602.14283] MILD: Multi-Intent Learning and Disambiguation for Proactive Failure Prediction in Intent-based Networking
Machine Learning

The paper presents MILD, a proactive framework for failure prediction in intent-based networking, enhancing root-cause intent disambiguat...

arXiv - Machine Learning · 3 min ·
[2602.14280] Fast Compute for ML Optimization
Machine Learning

The paper presents the Scale Mixture EM (SM-EM) algorithm for optimizing machine learning losses, demonstrating significant performance i...

arXiv - Machine Learning · 3 min ·
[2602.14250] Energy-Efficient Over-the-Air Federated Learning via Pinching Antenna Systems
Machine Learning

This article explores the use of Pinching Antenna Systems (PASSs) to enhance energy efficiency in over-the-air federated learning, presen...

arXiv - Machine Learning · 3 min ·
[2602.14265] STATe-of-Thoughts: Structured Action Templates for Tree-of-Thoughts
Machine Learning

The paper presents STATe-of-Thoughts, a new method for improving output diversity and interpretability in inference-time compute methods,...

arXiv - Machine Learning · 4 min ·
[2602.14244] Federated Ensemble Learning with Progressive Model Personalization
Machine Learning

This paper presents a novel framework for Federated Ensemble Learning that enhances model personalization while addressing statistical he...

arXiv - Machine Learning · 4 min ·
[2602.14236] Dual-Signal Adaptive KV-Cache Optimization for Long-Form Video Understanding in Vision-Language Models
LLMs

The paper presents Sali-Cache, a novel optimization framework for Vision-Language Models (VLMs) that addresses memory bottlenecks in long...

arXiv - AI · 3 min ·
[2602.14077] GTS: Inference-Time Scaling of Latent Reasoning with a Learnable Gaussian Thought Sampler
Machine Learning

The paper introduces the Gaussian Thought Sampler (GTS), a novel approach to inference-time scaling in latent reasoning models, enhancing...

arXiv - Machine Learning · 3 min ·
[2602.14039] Geometry-Preserving Aggregation for Mixture-of-Experts Embedding Models
Machine Learning

The paper presents Spherical Barycentric Aggregation (SBA), a new method for aggregating outputs in Mixture-of-Experts (MoE) embedding mo...

arXiv - Machine Learning · 3 min ·
[2602.14117] Toward Autonomous O-RAN: A Multi-Scale Agentic AI Framework for Real-Time Network Control and Management
Robotics

This article presents a multi-scale agentic AI framework for Open Radio Access Networks (O-RAN), enhancing real-time network control and ...

arXiv - AI · 4 min ·
[2602.14106] Anticipating Adversary Behavior in DevSecOps Scenarios through Large Language Models
LLMs

This paper explores the integration of Large Language Models (LLMs) in anticipating adversary behavior within DevSecOps environments, pro...

arXiv - AI · 4 min ·
[2602.14089] TabTracer: Monte Carlo Tree Search for Complex Table Reasoning with Large Language Models
LLMs

TabTracer introduces a novel Monte Carlo Tree Search framework for enhancing table reasoning in large language models, improving accuracy...

arXiv - AI · 4 min ·
[2602.13871] Ensemble-Conditional Gaussian Processes (Ens-CGP): Representation, Geometry, and Inference
Machine Learning

The paper presents Ensemble-Conditional Gaussian Processes (Ens-CGP), linking ensemble inference with conditional Gaussian laws, enhancin...

arXiv - Machine Learning · 4 min ·
[2602.14010] A Deployment-Friendly Foundational Framework for Efficient Computational Pathology
LLMs

This paper presents LitePath, a foundational framework for computational pathology that significantly reduces computational costs while m...

arXiv - AI · 4 min ·

