AI Infrastructure

GPUs, training clusters, MLOps, and deployment


Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 1 hour ago

Machine Learning

Trials and tribulations fine-tuning & deploying Gemma-4 [P]

Hey all, our ML team spent some time this week getting training and deployments working for Gemma-4, and wanted to document all the thing...

Reddit - Machine Learning · 1 min · about 2 hours ago

LLMs

GPT-4 vs Claude vs Gemini for coding — honest breakdown after 3 months of daily use

I am a solo developer who has been using all three seriously. Here is what I actually think: GPT-4o — Strengths: Large context window, st...

Reddit - Artificial Intelligence · 1 min · about 5 hours ago

All Content

LLMs

[2602.14234] REDSearcher: A Scalable and Cost-Efficient Framework for Long-Horizon Search Agents

The paper presents REDSearcher, a novel framework designed to optimize long-horizon search agents by addressing the challenges of task sy...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.14130] Algebraic Quantum Intelligence: A New Framework for Reproducible Machine Creativity

The paper introduces Algebraic Quantum Intelligence (AQI), a framework designed to enhance the creative capabilities of large language mo...

arXiv - Machine Learning · 4 min · 2 months ago

LLMs

[2602.14095] NEST: Nascent Encoded Steganographic Thoughts

The paper 'NEST: Nascent Encoded Steganographic Thoughts' explores the potential for large language models (LLMs) to conceal reasoning wi...

arXiv - AI · 3 min · 2 months ago

LLMs

[2602.14083] Plan-MCTS: Plan Exploration for Action Exploitation in Web Navigation

The article presents Plan-MCTS, a novel framework for enhancing web navigation through improved exploration and state perception, address...

arXiv - AI · 3 min · 2 months ago

Machine Learning

[2602.13710] HBVLA: Pushing 1-Bit Post-Training Quantization for Vision-Language-Action Models

The paper presents HBVLA, a framework for 1-bit post-training quantization of Vision-Language-Action models, enhancing efficiency while m...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2602.14003] Prompt-Driven Low-Altitude Edge Intelligence: Modular Agents and Generative Reasoning

The paper presents a novel framework for low-altitude edge intelligence, addressing limitations of large AI models through a prompt-to-ag...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.13980] Cognitive Chunking for Soft Prompts: Accelerating Compressor Learning via Block-wise Causal Masking

This article presents a novel method called Parallelized Iterative Compression (PIC) for enhancing soft prompt compression in Large Langu...

arXiv - Machine Learning · 4 min · 2 months ago

LLMs

[2602.13967] Neuromem: A Granular Decomposition of the Streaming Lifecycle in External Memory for LLMs

The paper presents Neuromem, a framework for evaluating external memory modules in large language models (LLMs) under a dynamic streaming...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.13933] HyMem: Hybrid Memory Architecture with Dynamic Retrieval Scheduling

The paper presents HyMem, a hybrid memory architecture designed to enhance the performance of large language models (LLMs) in extended di...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.13659] Zero-Order Optimization for LLM Fine-Tuning via Learnable Direction Sampling

This article presents a novel zero-order optimization framework for fine-tuning large language models (LLMs) using learnable direction sa...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2602.13557] Scenario-Adaptive MU-MIMO OFDM Semantic Communication With Asymmetric Neural Network

This paper presents a scenario-adaptive framework for MU-MIMO OFDM semantic communication, addressing challenges in multi-user environmen...

arXiv - Machine Learning · 4 min · 2 months ago

LLMs

[2602.13804] Attention in Constant Time: Vashista Sparse Attention for Long-Context Decoding with Exponential Guarantees

The paper presents Vashista Sparse Attention, a novel mechanism for efficient long-context decoding in large language models, ensuring co...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2602.13532] Fast Swap-Based Element Selection for Multiplication-Free Dimension Reduction

This paper presents a fast algorithm for element selection in dimension reduction, eliminating multiplication to enhance efficiency in re...

arXiv - Machine Learning · 4 min · 2 months ago

LLMs

[2602.13792] StackingNet: Collective Inference Across Independent AI Foundation Models

StackingNet introduces a meta-ensemble framework that enhances the coordination of independent AI foundation models, improving accuracy, ...

arXiv - AI · 3 min · 2 months ago

Machine Learning

[2602.13738] OneLatent: Single-Token Compression for Visual Latent Reasoning

The paper introduces OneLatent, a framework that compresses reasoning in visual tasks into a single token, significantly reducing output ...

arXiv - AI · 3 min · 2 months ago

LLMs

[2602.13697] No Need to Train Your RDB Foundation Model

The paper presents a novel approach to utilizing relational databases (RDBs) for predictive modeling without the need for retraining mode...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2602.13498] TrasMuon: Trust-Region Adaptive Scaling for Orthogonalized Momentum Optimizers

TrasMuon introduces a novel optimization technique that enhances the stability and efficiency of orthogonalized momentum optimizers, outp...

arXiv - AI · 3 min · 2 months ago

LLMs

[2602.13486] Preventing Rank Collapse in Federated Low-Rank Adaptation with Client Heterogeneity

This paper introduces raFLoRA, a method to prevent rank collapse in federated low-rank adaptation (FedLoRA) due to client heterogeneity, ...

arXiv - AI · 3 min · 2 months ago

LLMs

[2602.13695] Can a Lightweight Automated AI Pipeline Solve Research-Level Mathematical Problems?

This article explores the potential of a lightweight AI pipeline to solve complex mathematical problems, demonstrating its effectiveness ...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.13680] AllMem: A Memory-centric Recipe for Efficient Long-context Modeling

The paper presents AllMem, a memory-centric architecture designed to enhance the efficiency of long-context modeling in large language mo...

arXiv - AI · 4 min · 2 months ago

