AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

LLMs

MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU

https://arxiv.org/abs/2604.05091 Abstract: "We present MegaTrain, a memory-centric system that efficiently trains 100B+ parameter large l...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

"There's a green field." Five words, no system prompt, pure autocomplete. It figured out what it was.

No chat interface. No identity. No instructions. Just the API in raw autocomplete mode. The model receives text, predicts the next tokens...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Why would Anthropic keep a cyber model like Project Glasswing invite-only?

Anthropic’s Project Glasswing caught my attention less as a cybersecurity headline than as a signal about how frontier AI may be commerci...

Reddit - Artificial Intelligence · 1 min ·

All Content

[2602.17244] CounterFlowNet: From Minimal Changes to Meaningful Counterfactual Explanations
Machine Learning

CounterFlowNet introduces a novel generative approach for creating counterfactual explanations in machine learning, enhancing interpretab...

arXiv - Machine Learning · 3 min ·
[2602.17206] SoftDTW-CUDA-Torch: Memory-Efficient GPU-Accelerated Soft Dynamic Time Warping for PyTorch
AI Infrastructure

The paper presents SoftDTW-CUDA-Torch, an open-source PyTorch library that enhances Soft Dynamic Time Warping (SoftDTW) by improving memo...

arXiv - Machine Learning · 3 min ·
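For context, the core of soft dynamic time warping is a single dynamic-programming recurrence in which the hard `min` is replaced by a smooth log-sum-exp. Below is a minimal pure-Python sketch of that standard recurrence (squared-difference cost, smoothing parameter `gamma`); it is not the paper's memory-efficient CUDA kernel, and all names are illustrative.

```python
import math

def softmin(values, gamma):
    # Differentiable soft minimum: -gamma * log(sum(exp(-v / gamma))),
    # stabilized by shifting with the hard minimum before exponentiating.
    m = min(values)
    return m - gamma * math.log(sum(math.exp(-(v - m) / gamma) for v in values))

def soft_dtw(x, y, gamma=0.1):
    # x, y: 1-D sequences; pairwise cost is the squared difference.
    n, m = len(x), len(y)
    INF = float("inf")
    R = [[INF] * (m + 1) for _ in range(n + 1)]
    R[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            # Soft-min over the three DTW predecessors (match, insert, delete).
            R[i][j] = cost + softmin([R[i - 1][j], R[i][j - 1], R[i - 1][j - 1]], gamma)
    return R[n][m]
```

A GPU implementation typically parallelizes this table along anti-diagonals, since cells on the same anti-diagonal have no mutual dependencies; the memory savings come from not materializing intermediate buffers for the backward pass.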
[2602.17155] Powering Up Zeroth-Order Training via Subspace Gradient Orthogonalization
Machine Learning

The paper introduces ZO-Muon, a novel zeroth-order optimization method that enhances convergence speed and accuracy in training large-sca...

arXiv - Machine Learning · 4 min ·
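Zeroth-order methods replace backpropagated gradients with finite-difference probes along random directions, so only forward evaluations of the loss are needed. The sketch below shows the standard two-point Gaussian estimator that such methods build on; it is not the ZO-Muon algorithm itself (the paper's subspace orthogonalization is not reproduced here), and all names are illustrative.

```python
import random

def zo_gradient(f, x, eps=1e-3, n_samples=100):
    # Two-point zeroth-order estimate:
    #   g ~ E_u[ (f(x + eps*u) - f(x - eps*u)) / (2*eps) * u ],  u ~ N(0, I),
    # averaged over n_samples random probe directions.
    d = len(x)
    g = [0.0] * d
    for _ in range(n_samples):
        u = [random.gauss(0.0, 1.0) for _ in range(d)]
        xp = [xi + eps * ui for xi, ui in zip(x, u)]
        xm = [xi - eps * ui for xi, ui in zip(x, u)]
        scale = (f(xp) - f(xm)) / (2.0 * eps)
        for k in range(d):
            g[k] += scale * u[k] / n_samples
    return g
```

The estimator's variance grows with dimension, which is why recent work restricts or structures the probe directions (for instance in low-dimensional subspaces) when scaling to large models.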
[2602.17102] Operationalization of Machine Learning with Serverless Architecture: An Industrial Implementation for Harmonized System Code Prediction
Machine Learning

This paper presents a serverless MLOps framework for the complete ML lifecycle, focusing on Harmonized System code prediction, achieving ...

arXiv - Machine Learning · 4 min ·
[2602.17050] Multi-Probe Zero Collision Hash (MPZCH): Mitigating Embedding Collisions and Enhancing Model Freshness in Large-Scale Recommenders
Machine Learning

The paper presents the Multi-Probe Zero Collision Hash (MPZCH), a novel indexing method that mitigates embedding collisions in large-scal...

arXiv - Machine Learning · 4 min ·
[2602.17004] Arcee Trinity Large Technical Report
Machine Learning

The Arcee Trinity Large Technical Report presents a new sparse Mixture-of-Experts model with 400 billion parameters, detailing its archit...

arXiv - Machine Learning · 4 min ·
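In a sparse Mixture-of-Experts layer of the kind such reports describe, a learned gate scores every expert, only the top-k experts run on each token, and their outputs are mixed with renormalized gate weights; this is how a 400B-parameter model can activate only a fraction of its weights per token. A minimal, generic top-k routing sketch (not Arcee's implementation; all names are illustrative):

```python
import math

def top_k_moe(x, experts, gate_w, k=2):
    # experts: list of callables (the expert networks); gate_w: one score row per expert.
    # Score every expert with a linear gate.
    scores = [sum(wi * xi for wi, xi in zip(row, x)) for row in gate_w]
    # Keep only the k highest-scoring experts.
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over the selected scores only (renormalized gate weights).
    m = max(scores[i] for i in top)
    w = [math.exp(scores[i] - m) for i in top]
    z = sum(w)
    out = [0.0] * len(experts[top[0]](x))
    for wi, i in zip(w, top):
        y = experts[i](x)  # only the routed experts are evaluated
        out = [o + (wi / z) * yi for o, yi in zip(out, y)]
    return out
```

Production systems add load-balancing losses and capacity limits on top of this routing so that tokens spread evenly across experts, but the forward computation is essentially the weighted mixture above.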
[2602.16980] Discovering Universal Activation Directions for PII Leakage in Language Models
LLMs

The paper introduces UniLeak, a framework that identifies universal activation directions in language models, enhancing the understanding...

arXiv - Machine Learning · 3 min ·
[2602.16796] Efficient Tail-Aware Generative Optimization via Flow Model Fine-Tuning
Machine Learning

This article presents Tail-aware Flow Fine-Tuning (TFFT), a novel algorithm that optimizes generative models by controlling tail behavior...

arXiv - Machine Learning · 4 min ·
[2602.16793] Escaping the Cognitive Well: Efficient Competition Math with Off-the-Shelf Models
Machine Learning

The paper presents a novel inference pipeline that leverages off-the-shelf models to solve International Mathematical Olympiad problems e...

arXiv - Machine Learning · 4 min ·
[2602.10993] LoRA-Squeeze: Simple and Effective Post-Tuning and In-Tuning Compression of LoRA Modules
Machine Learning

The paper introduces LoRA-Squeeze, a method for improving Low-Rank Adaptation (LoRA) by allowing dynamic rank adjustments during training...

arXiv - AI · 4 min ·
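For context, LoRA freezes a pretrained weight matrix W and learns a low-rank update scaled by alpha/r, so compressing or resizing the adapter only ever touches the small A and B factors. A minimal sketch of the standard formulation (this shows vanilla LoRA, not LoRA-Squeeze's dynamic rank adjustment; all names are illustrative):

```python
def matvec(W, x):
    # Plain matrix-vector product over nested lists.
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in W]

class LoRALinear:
    # y = W x + (alpha / r) * B (A x); W is frozen, only A and B are trained.
    def __init__(self, W, A, B, alpha=1.0):
        self.W, self.A, self.B = W, A, B
        self.r = len(A)            # rank = number of rows of A (r x d_in)
        self.alpha = alpha

    def forward(self, x):
        base = matvec(self.W, x)
        low = matvec(self.A, x)    # down-project to rank r
        delta = matvec(self.B, low)  # up-project back to d_out
        s = self.alpha / self.r
        return [b + s * d for b, d in zip(base, delta)]

    def merge(self):
        # Fold the low-rank update into W: W + (alpha/r) * B A.
        s = self.alpha / self.r
        d_out, d_in = len(self.W), len(self.W[0])
        return [[self.W[i][j] + s * sum(self.B[i][k] * self.A[k][j] for k in range(self.r))
                 for j in range(d_in)] for i in range(d_out)]
```

`merge()` illustrates why post-tuning compression of the adapter is attractive: the low-rank update folds back into W, so a smaller A and B cost nothing at inference time.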
[2602.08351] The Chicken and Egg Dilemma: Co-optimizing Data and Model Configurations for LLMs
LLMs

The paper discusses the challenge of co-optimizing data and model configurations for training large language models (LLMs), introducing a...

arXiv - AI · 4 min ·
[2601.08697] Auditing Student-AI Collaboration: A Case Study of Online Graduate CS Students
Generative AI

This study audits the collaboration between online graduate CS students and AI, exploring preferences for automation in academic tasks an...

arXiv - AI · 3 min ·
[2511.19943] AI/ML based Joint Source and Channel Coding for HARQ-ACK Payload
Machine Learning

This paper presents a novel approach to joint source and channel coding for HARQ-ACK payloads using AI/ML techniques, demonstrating signi...

arXiv - Machine Learning · 4 min ·
[2511.15162] Multimodal Wireless Foundation Models
LLMs

The paper introduces Multimodal Wireless Foundation Models (WFMs) that integrate multiple data modalities, enhancing wireless function pe...

arXiv - Machine Learning · 4 min ·
[2510.14974] pi-Flow: Policy-Based Few-Step Generation via Imitation Distillation
Machine Learning

The paper presents pi-Flow, a novel approach to few-step generation in machine learning that utilizes imitation distillation to enhance m...

arXiv - AI · 4 min ·
[2510.03352] Inference-Time Search Using Side Information for Diffusion-Based Image Reconstruction
Machine Learning

This article presents a novel inference-time search algorithm that enhances diffusion-based image reconstruction by utilizing side inform...

arXiv - Machine Learning · 4 min ·
[2509.22075] CoSpaDi: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning
LLMs

The paper introduces CoSpaDi, a novel framework for compressing large language models (LLMs) using calibration-guided sparse dictionary l...

arXiv - AI · 4 min ·
[2507.19634] MCIF: Multimodal Crosslingual Instruction-Following Benchmark from Scientific Talks
LLMs

The MCIF benchmark introduces a novel framework for evaluating multimodal crosslingual instruction-following capabilities in large langua...

arXiv - AI · 4 min ·
[2505.02819] ReplaceMe: Network Simplification via Depth Pruning and Transformer Block Linearization
Machine Learning

The paper presents ReplaceMe, a novel method for network simplification that utilizes depth pruning and transformer block linearization, ...

arXiv - Machine Learning · 4 min ·
[2412.02039] Multi-View 3D Reconstruction using Knowledge Distillation
LLMs

This paper presents a knowledge distillation approach for Multi-View 3D reconstruction, utilizing a teacher-student model framework to en...

arXiv - Machine Learning · 4 min ·
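The teacher-student framework referenced here rests on standard knowledge distillation: the student is trained to match the teacher's temperature-softened output distribution. A minimal sketch of that classic soft-target loss (Hinton-style; this is not the paper's 3D-reconstruction pipeline, and all names are illustrative):

```python
import math

def softmax(logits, T=1.0):
    # Temperature-softened softmax, shifted by the max for numerical stability.
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) over temperature-T distributions, scaled by T^2
    # so its gradient magnitude matches the hard-label cross-entropy term.
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

In practice this term is combined with a ground-truth loss, e.g. `loss = a * ce(student, labels) + (1 - a) * distillation_loss(...)`, with the temperature controlling how much of the teacher's "dark knowledge" about non-target classes the student sees.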