AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

[2505.16051] Flow-based Generative Modeling of Potential Outcomes and Counterfactuals
Machine Learning

arXiv - Machine Learning · 3 min ·
[2512.23748] A Review of Diffusion-based Simulation-Based Inference: Foundations and Applications in Non-Ideal Data Scenarios
Machine Learning

arXiv - Machine Learning · 4 min ·
[2512.15742] SHARe-KAN: Post-Training Vector Quantization for Cache-Resident KAN Inference
Machine Learning

arXiv - Machine Learning · 4 min ·

All Content

[2406.12844] Synergizing Foundation Models and Federated Learning: A Survey
LLMs

This survey explores the integration of Foundation Models (FMs) and Federated Learning (FL), termed Federated Foundation Models (FedFM), ...

arXiv - AI · 4 min ·
[2402.15751] Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
LLMs

The paper introduces Sparse MeZO, a novel optimization technique for fine-tuning large language models (LLMs) that reduces memory usage w...

arXiv - AI · 4 min ·
[2602.14760] Residual Connections and the Causal Shift: Uncovering a Structural Misalignment in Transformers
LLMs

This article explores a structural misalignment in Transformers, particularly regarding residual connections and their impact on next-tok...

arXiv - AI · 3 min ·
[2402.02644] Permutation-based Inference for Variational Learning of Directed Acyclic Graphs
Machine Learning

This paper presents PIVID, a novel method for inferring distributions over permutations and directed acyclic graphs (DAGs) using variatio...

arXiv - Machine Learning · 3 min ·
[2312.02355] When is Offline Policy Selection Sample Efficient for Reinforcement Learning?
LLMs

This paper explores the efficiency of offline policy selection (OPS) in reinforcement learning, connecting it to off-policy evaluation (O...

arXiv - AI · 4 min ·
[2112.06251] Learning with Subset Stacking
Machine Learning

The paper introduces a novel regression algorithm called Learning with Subset Stacking (LESS), which effectively learns from heterogeneou...

arXiv - Machine Learning · 3 min ·
[2602.14710] Orcheo: A Modular Full-Stack Platform for Conversational Search
AI Startups

Orcheo is an open-source platform designed to streamline conversational search by offering a modular architecture, production-ready infra...

arXiv - AI · 3 min ·
[2602.14699] Qute: Towards Quantum-Native Database
AI Infrastructure

The paper presents Qute, a quantum-native database that integrates quantum computation into database operations, enhancing performance ov...

arXiv - AI · 3 min ·
[2602.14743] LLMStructBench: Benchmarking Large Language Model Structured Data Extraction
LLMs

LLMStructBench introduces a benchmark for evaluating large language models on structured data extraction, emphasizing the impact of promp...

arXiv - Machine Learning · 3 min ·
[2602.14464] CoCoDiff: Correspondence-Consistent Diffusion Model for Fine-grained Style Transfer
Machine Learning

The paper presents CoCoDiff, a novel framework for fine-grained style transfer in images, emphasizing semantic correspondence and achievi...

arXiv - AI · 3 min ·
[2602.14381] Adapting VACE for Real-Time Autoregressive Video Diffusion
Generative AI

This article presents an adaptation of VACE for real-time autoregressive video generation, enhancing video control while addressing laten...

arXiv - AI · 3 min ·
[2602.14374] Differentially Private Retrieval-Augmented Generation
LLMs

The paper presents DP-KSA, a novel algorithm that integrates differential privacy into retrieval-augmented generation (RAG) systems, addr...

arXiv - AI · 4 min ·
[2602.14471] Socially-Weighted Alignment: A Game-Theoretic Framework for Multi-Agent LLM Systems
LLMs

The paper presents a game-theoretic framework called Socially-Weighted Alignment (SWA) for managing multi-agent large language model (LLM...

arXiv - AI · 3 min ·
[2602.14397] LRD-MPC: Efficient MPC Inference through Low-rank Decomposition
Machine Learning

The paper presents LRD-MPC, a method that enhances the efficiency of secure multi-party computation (MPC) in machine learning by utilizin...

arXiv - Machine Learning · 4 min ·
[2602.14302] Floe: Federated Specialization for Real-Time LLM-SLM Inference
LLMs

The paper presents Floe, a federated learning framework that enhances real-time inference of large language models (LLMs) while addressin...

arXiv - Machine Learning · 3 min ·
[2602.14283] MILD: Multi-Intent Learning and Disambiguation for Proactive Failure Prediction in Intent-based Networking
Machine Learning

The paper presents MILD, a proactive framework for failure prediction in intent-based networking, enhancing root-cause intent disambiguat...

arXiv - Machine Learning · 3 min ·
[2602.14280] Fast Compute for ML Optimization
Machine Learning

The paper presents the Scale Mixture EM (SM-EM) algorithm for optimizing machine learning losses, demonstrating significant performance i...

arXiv - Machine Learning · 3 min ·
[2602.14250] Energy-Efficient Over-the-Air Federated Learning via Pinching Antenna Systems
Machine Learning

This article explores the use of Pinching Antenna Systems (PASSs) to enhance energy efficiency in over-the-air federated learning, presen...

arXiv - Machine Learning · 3 min ·
[2602.14265] STATe-of-Thoughts: Structured Action Templates for Tree-of-Thoughts
Machine Learning

The paper presents STATe-of-Thoughts, a new method for improving output diversity and interpretability in inference-time compute methods,...

arXiv - Machine Learning · 4 min ·
[2602.14244] Federated Ensemble Learning with Progressive Model Personalization
Machine Learning

This paper presents a novel framework for Federated Ensemble Learning that enhances model personalization while addressing statistical he...

arXiv - Machine Learning · 4 min ·