AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

LLMs

Seeking Critique on Research Approach to Open Set Recognition (Novelty Detection) [R]

Hey guys, I'm an independent researcher working on a project that tries to address a very specific failure mode in LLMs and embedding bas...

Reddit - Machine Learning · 1 min ·
Machine Learning

What if attention didn’t need matrix multiplication?

I built a cognitive architecture where all computation reduces to three bit operations: XOR, MAJ, POPCNT. No GEMM. No GPU. No floating-po...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

WTF. It's real. Allbirds (the shoe company) is pivoting to inference.

I'm profoundly ambivalent re: how to feel about this; is it great -- what a scrappy, bold pivot! Or wildly dumb - it's so far from their c...

Reddit - Artificial Intelligence · 1 min ·

All Content

Machine Learning

[2505.08145] A Generalized Hierarchical Federated Learning Framework with Theoretical Guarantees

This article presents a novel Multi-Layer Hierarchical Federated Learning framework (QMLHFL) that enhances scalability and flexibility in...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2505.06795] Sparse Latent Factor Forecaster (SLFF) with Iterative Inference for Transparent Multi-Horizon Commodity Futures Prediction

The Sparse Latent Factor Forecaster (SLFF) proposes a new approach for predicting commodity futures by addressing forecast errors and enh...

arXiv - AI · 4 min ·
Machine Learning

[2411.06403] Mastering NIM and Impartial Games with Weak Neural Networks: An AlphaZero-inspired Multi-Frame Approach

This paper explores the application of weak neural networks in mastering impartial games like NIM, utilizing an AlphaZero-inspired multi-...

arXiv - AI · 4 min ·
LLMs

[2503.08796] Robust Multi-Objective Controlled Decoding of Large Language Models

This article presents Robust Multi-Objective Decoding (RMOD), an innovative algorithm designed to enhance the performance of Large Langua...

arXiv - AI · 3 min ·
LLMs

[2502.05376] LO-BCQ: Block Clustered Quantization for 4-bit (W4A4) LLM Inference

The paper presents LO-BCQ, a novel block clustered quantization method for 4-bit LLM inference, achieving less than 1% accuracy loss whil...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.14917] BFS-PO: Best-First Search for Large Reasoning Models

The paper proposes BFS-PO, a new reinforcement learning algorithm that enhances the performance of Large Reasoning Models by reducing com...

arXiv - AI · 3 min ·
LLMs

[2501.16178] SWIFT: Mapping Sub-series with Wavelet Decomposition Improves Time Series Forecasting

The paper presents SWIFT, a lightweight model that enhances time series forecasting using wavelet decomposition, achieving state-of-the-a...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2501.15889] Adaptive Width Neural Networks

The paper introduces Adaptive Width Neural Networks, a novel approach that optimizes the width of neural network layers during training, ...

arXiv - AI · 4 min ·
Machine Learning

[2501.05633] Regularized Top-$k$: A Bayesian Framework for Gradient Sparsification

The paper presents a Bayesian framework for gradient sparsification called Regularized Top-k (RegTop-k), which improves convergence in di...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2411.08982] Lynx: Enabling Efficient MoE Inference through Dynamic Batch-Aware Expert Selection

The paper introduces Lynx, a system designed to enhance the efficiency of Mixture-of-Expert (MoE) models by implementing dynamic batch-aw...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2411.16085] Cautious Optimizers: Improving Training with One Line of Code

This article presents a new approach to optimizing training in machine learning by introducing a simple one-line modification to existing...

arXiv - AI · 3 min ·
LLMs

[2410.10481] Model-based Large Language Model Customization as Service

The paper presents Llamdex, a framework for customizing large language models (LLMs) as a service, allowing clients to upload domain-spec...

arXiv - AI · 4 min ·
LLMs

[2406.12844] Synergizing Foundation Models and Federated Learning: A Survey

This survey explores the integration of Foundation Models (FMs) and Federated Learning (FL), termed Federated Foundation Models (FedFM), ...

arXiv - AI · 4 min ·
LLMs

[2402.15751] Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning

The paper introduces Sparse MeZO, a novel optimization technique for fine-tuning large language models (LLMs) that reduces memory usage w...

arXiv - AI · 4 min ·
LLMs

[2602.14760] Residual Connections and the Causal Shift: Uncovering a Structural Misalignment in Transformers

This article explores a structural misalignment in Transformers, particularly regarding residual connections and their impact on next-tok...

arXiv - AI · 3 min ·
Machine Learning

[2402.02644] Permutation-based Inference for Variational Learning of Directed Acyclic Graphs

This paper presents PIVID, a novel method for inferring distributions over permutations and directed acyclic graphs (DAGs) using variatio...

arXiv - Machine Learning · 3 min ·
LLMs

[2312.02355] When is Offline Policy Selection Sample Efficient for Reinforcement Learning?

This paper explores the efficiency of offline policy selection (OPS) in reinforcement learning, connecting it to off-policy evaluation (O...

arXiv - AI · 4 min ·
Machine Learning

[2112.06251] Learning with Subset Stacking

The paper introduces a novel regression algorithm called Learning with Subset Stacking (LESS), which effectively learns from heterogeneou...

arXiv - Machine Learning · 3 min ·
AI Startups

[2602.14710] Orcheo: A Modular Full-Stack Platform for Conversational Search

Orcheo is an open-source platform designed to streamline conversational search by offering a modular architecture, production-ready infra...

arXiv - AI · 3 min ·
AI Infrastructure

[2602.14699] Qute: Towards Quantum-Native Database

The paper presents Qute, a quantum-native database that integrates quantum computation into database operations, enhancing performance ov...

arXiv - AI · 3 min ·