MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU
https://arxiv.org/abs/2604.05091 Abstract: "We present MegaTrain, a memory-centric system that efficiently trains 100B+ parameter large l...
GPUs, training clusters, MLOps, and deployment
No chat interface. No identity. No instructions. Just the API in raw autocomplete mode. The model receives text, predicts the next tokens...
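For readers who want to try this setup themselves, here is a minimal sketch of raw next-token completion with a locally loaded base model via Hugging Face transformers. The model name is a placeholder and the article's actual API and model are not specified, so treat everything below as an assumption, not the author's setup.

```python
# Minimal sketch of "raw autocomplete": a causal LM receives plain text and
# predicts the next tokens. No chat template, no system prompt, no identity.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder base model, assumption only
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "The old lighthouse keeper opened the logbook and wrote:"
inputs = tokenizer(prompt, return_tensors="pt")

# Pure continuation sampling: the model just extends the text.
output_ids = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```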
Anthropic’s Project Glasswing caught my attention less as a cybersecurity headline than as a signal about how frontier AI may be commerci...
CounterFlowNet introduces a novel generative approach for creating counterfactual explanations in machine learning, enhancing interpretab...
The paper presents SoftDTW-CUDA-Torch, an open-source PyTorch library that enhances Soft Dynamic Time Warping (SoftDTW) by improving memo...
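For orientation, here is a minimal reference implementation of the standard soft-DTW recursion (Cuturi and Blondel, 2017) in PyTorch. It sketches only the underlying algorithm the library accelerates, not the library's memory-optimized CUDA kernels.

```python
import torch

def soft_dtw(x, y, gamma=1.0):
    """Reference soft-DTW between sequences x (n, d) and y (m, d).

    Recursion: R[i, j] = cost[i, j] + softmin(R[i-1, j], R[i, j-1], R[i-1, j-1])
    with softmin_gamma(a) = -gamma * logsumexp(-a / gamma).
    """
    n, m = x.shape[0], y.shape[0]
    cost = torch.cdist(x, y) ** 2  # squared Euclidean pairwise costs

    R = torch.full((n + 1, m + 1), float("inf"), dtype=x.dtype)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            neighbors = torch.stack([R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]])
            R[i, j] = cost[i - 1, j - 1] - gamma * torch.logsumexp(-neighbors / gamma, dim=0)
    return R[n, m]

x, y = torch.randn(10, 3), torch.randn(12, 3)
print(soft_dtw(x, y, gamma=0.1))
```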
The paper introduces ZO-Muon, a novel zeroth-order optimization method that enhances convergence speed and accuracy in training large-sca...
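ZO-Muon's precise update rule sits behind the truncation above, but the two-point zeroth-order gradient estimator that such methods build on is easy to sketch. The code below is a generic illustration, not the paper's algorithm.

```python
import torch

def zo_gradient(f, x, eps=1e-3, num_samples=8):
    """Generic two-point zeroth-order gradient estimate of f at x.

    For random directions u ~ N(0, I):
      g ≈ mean over u of (f(x + eps*u) - f(x - eps*u)) / (2*eps) * u
    Only function evaluations are needed; no backpropagation.
    """
    g = torch.zeros_like(x)
    for _ in range(num_samples):
        u = torch.randn_like(x)
        g += (f(x + eps * u) - f(x - eps * u)) / (2 * eps) * u
    return g / num_samples

# Sanity check on a quadratic, where the true gradient is 2*x.
f = lambda x: (x ** 2).sum()
x = torch.tensor([1.0, -2.0, 0.5])
print(zo_gradient(f, x, num_samples=256))  # approx [2.0, -4.0, 1.0]
```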
This paper presents a serverless MLOps framework for the complete ML lifecycle, focusing on Harmonized System code prediction, achieving ...
The paper presents the Multi-Probe Zero Collision Hash (MPZCH), a novel indexing method that mitigates embedding collisions in large-scal...
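The summary does not spell out MPZCH's probing scheme, so the sketch below is only a simplified illustration of the multi-probe idea: try a sequence of hash functions until a collision-free slot is found, falling back to a shared slot after a fixed probe budget. All names and parameters here are hypothetical.

```python
import hashlib

TABLE_SIZE = 1024
MAX_PROBES = 4
slot_owner = {}  # slot index -> feature id currently occupying it

def probe_hash(feature_id: str, probe: int) -> int:
    """i-th hash of a feature id (salted SHA-256, illustrative only)."""
    digest = hashlib.sha256(f"{feature_id}|{probe}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SIZE

def lookup_slot(feature_id: str) -> int:
    """Return a collision-free embedding slot if any probe finds one,
    else fall back to the first probe (a shared, colliding slot)."""
    for probe in range(MAX_PROBES):
        slot = probe_hash(feature_id, probe)
        owner = slot_owner.get(slot)
        if owner is None or owner == feature_id:
            slot_owner[slot] = feature_id
            return slot
    return probe_hash(feature_id, 0)

print(lookup_slot("user:42"), lookup_slot("ad:1337"))
```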
The Arcee Trinity Large Technical Report presents a new sparse Mixture-of-Experts model with 400 billion parameters, detailing its archit...
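As background for the sparse-MoE design, here is a minimal top-k token router, the standard building block of such models. The Trinity report's actual routing, expert count, and expert shapes are not described in the summary, so this is purely a generic sketch.

```python
import torch
import torch.nn as nn

class TopKRouter(nn.Module):
    """Minimal top-k gating for a sparse MoE layer: each token goes to its
    k highest-scoring experts, whose outputs are mixed with renormalized
    gate weights."""
    def __init__(self, d_model, num_experts, k=2):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(num_experts)
        )
        self.k = k

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.gate(x)                  # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)
        weights = weights.softmax(dim=-1)      # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e       # tokens routed to expert e here
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

layer = TopKRouter(d_model=64, num_experts=8, k=2)
print(layer(torch.randn(16, 64)).shape)  # torch.Size([16, 64])
```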
The paper introduces UniLeak, a framework that identifies universal activation directions in language models, enhancing the understanding...
This article presents Tail-aware Flow Fine-Tuning (TFFT), a novel algorithm that optimizes generative models by controlling tail behavior...
The paper presents a novel inference pipeline that leverages off-the-shelf models to solve International Mathematical Olympiad problems e...
The paper introduces LoRA-Squeeze, a method for improving Low-Rank Adaptation (LoRA) by allowing dynamic rank adjustments during training...
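The exact adjustment schedule is behind the truncation, but one generic way to shrink a LoRA adapter's rank mid-training is to SVD-truncate the low-rank product, sketched below. The `squeeze_rank` method and its placement are assumptions for illustration, not the paper's API.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight plus a rank-r update: y = x W^T + scale * x (BA)^T."""
    def __init__(self, d_in, d_out, r=16, scale=1.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in), requires_grad=False)
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))
        self.scale = scale

    def forward(self, x):
        return x @ self.weight.T + self.scale * (x @ self.A.T @ self.B.T)

    @torch.no_grad()
    def squeeze_rank(self, new_r):
        """Shrink the adapter to rank new_r via truncated SVD of B @ A."""
        U, S, Vh = torch.linalg.svd(self.B @ self.A, full_matrices=False)
        self.B = nn.Parameter(U[:, :new_r] * S[:new_r])   # (d_out, new_r)
        self.A = nn.Parameter(Vh[:new_r, :])              # (new_r, d_in)

layer = LoRALinear(128, 128, r=16)
layer.squeeze_rank(4)
print(layer.A.shape, layer.B.shape)  # torch.Size([4, 128]) torch.Size([128, 4])
```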
The paper discusses the challenge of co-optimizing data and model configurations for training large language models (LLMs), introducing a...
This study audits the collaboration between online graduate CS students and AI, exploring preferences for automation in academic tasks an...
This paper presents a novel approach to joint source and channel coding for HARQ-ACK payloads using AI/ML techniques, demonstrating signi...
The paper introduces Multimodal Wireless Foundation Models (WFMs) that integrate multiple data modalities, enhancing wireless function pe...
The paper presents pi-Flow, a novel approach to few-step generation in machine learning that utilizes imitation distillation to enhance m...
This article presents a novel inference-time search algorithm that enhances diffusion-based image reconstruction by utilizing side inform...
The paper introduces CoSpaDi, a novel framework for compressing large language models (LLMs) using calibration-guided sparse dictionary l...
The MCIF benchmark introduces a novel framework for evaluating multimodal crosslingual instruction-following capabilities in large langua...
The paper presents ReplaceMe, a novel method for network simplification that utilizes depth pruning and transformer block linearization, ...
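Assuming the linearization works roughly as the title suggests, replacing a pruned span of transformer blocks with a linear map fitted by least squares on calibration activations can be sketched in a few lines. The data below is a synthetic stand-in, and the estimator details are not taken from the paper.

```python
import torch

# X: hidden states entering the pruned span; Y: hidden states leaving it.
# Both would come from a calibration set; here they are synthetic.
num_tokens, d_model = 4096, 512
X = torch.randn(num_tokens, d_model)
Y = X @ torch.randn(d_model, d_model) * 0.1 + X  # stand-in calibration data

# Solve min_T ||X T - Y||_F^2 in closed form.
T = torch.linalg.lstsq(X, Y).solution  # (d_model, d_model)

def linearized_span(h):
    """Drop-in replacement for the removed blocks: one matrix multiply."""
    return h @ T

print(((X @ T - Y) ** 2).mean())  # fit error on the calibration data
```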
This paper presents a knowledge distillation approach for Multi-View 3D reconstruction, utilizing a teacher-student model framework to en...
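The multi-view 3D specifics are truncated above; below is the standard temperature-scaled distillation objective that teacher-student frameworks build on, shown in its plain classification form for brevity rather than as the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.5):
    """Standard KD objective: temperature-scaled KL to the teacher,
    blended with ordinary cross-entropy on ground-truth labels."""
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradients match the hard-label term
    ce = F.cross_entropy(student_logits, targets)
    return alpha * kd + (1 - alpha) * ce

student = torch.randn(8, 10)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student, teacher, labels))
```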