AI Infrastructure

GPUs, training clusters, MLOps, and deployment


Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 5 hours ago

Machine Learning

Your prompts aren’t the problem — something else is

I keep seeing people focus heavily on prompt optimization. But in practice, a lot of failures I’ve observed don’t come from the prompt it...

Reddit - Artificial Intelligence · 1 min · about 9 hours ago

AI Infrastructure

[P] GPU friendly lossless 12-bit BF16 format with 0.03% escape rate and 1 integer ADD decode works for AMD & NVIDIA

Hi everyone :) I just released a new research prototype. It’s a lossless BF16 compression format that stores weights in 12 bits by replac...

Reddit - Machine Learning · 1 min · about 14 hours ago

All Content

LLMs

[2602.22217] RAGdb: A Zero-Dependency, Embeddable Architecture for Multimodal Retrieval-Augmented Generation on the Edge

The paper presents RAGdb, a novel architecture for Retrieval-Augmented Generation (RAG) that simplifies multimodal data processing by eli...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.23111] PRAC: Principal-Random Subspace for LLM Activation Compression and Memory-Efficient Training

The paper presents PRAC, a novel method for compressing activations in large language models, achieving significant memory savings while ...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.23330] Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks

This article presents a multi-agent LLM framework for financial trading, emphasizing fine-grained task decomposition to enhance decision-...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.23315] Invariant Transformation and Resampling based Epistemic-Uncertainty Reduction

This article presents a novel approach to reducing epistemic uncertainty in AI models through invariant transformation and resampling tec...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.23035] Learning Disease-Sensitive Latent Interaction Graphs From Noisy Cardiac Flow Measurements

This paper presents a novel framework for modeling cardiac blood flow patterns using disease-sensitive latent interaction graphs, enhanci...

arXiv - Machine Learning · 4 min · about 1 month ago

AI Infrastructure

[2602.23271] Evaluating Stochasticity in Deep Research Agents

This paper evaluates the stochasticity in Deep Research Agents (DRAs), highlighting how variability in their outputs can impact research ...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.23248] Mitigating Legibility Tax with Decoupled Prover-Verifier Games

This paper presents a novel approach to mitigate the 'legibility tax' in large language models by decoupling the prover-verifier game, al...

arXiv - AI · 3 min · about 1 month ago

Machine Learning

[2602.22936] Generalization Bounds of Stochastic Gradient Descent in Homogeneous Neural Networks

This paper explores generalization bounds for Stochastic Gradient Descent (SGD) in homogeneous neural networks, revealing that slower ste...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.22911] NoRA: Breaking the Linear Ceiling of Low-Rank Adaptation via Manifold Expansion

The paper introduces NoRA, a novel approach to Low-Rank Adaptation (LoRA) that overcomes the limitations of linear methods by utilizing m...

arXiv - Machine Learning · 3 min · about 1 month ago

AI Infrastructure

[2602.22882] Fair feature attribution for multi-output prediction: a Shapley-based perspective

This article presents a Shapley-based framework for fair feature attribution in multi-output prediction, addressing the limitations of ex...

arXiv - Machine Learning · 3 min · about 1 month ago

LLMs

[2602.23193] ESAA: Event Sourcing for Autonomous Agents in LLM-Based Software Engineering

The paper presents ESAA, an architecture for autonomous agents using event sourcing to enhance state management and execution in LLM-base...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22831] Moral Preferences of LLMs Under Directed Contextual Influence

This paper explores how contextual influences affect the moral decision-making of large language models (LLMs) in scenarios akin to troll...

arXiv - AI · 4 min · about 1 month ago

LLMs

[2602.22812] Accelerating Local LLMs on Resource-Constrained Edge Devices via Distributed Prompt Caching

The paper presents a method for enhancing the performance of local large language models (LLMs) on resource-constrained edge devices thro...

arXiv - Machine Learning · 3 min · about 1 month ago

Machine Learning

[2602.22973] Modeling Expert AI Diagnostic Alignment via Immutable Inference Snapshots

The paper presents a framework for improving AI diagnostic alignment in clinical settings by preserving AI-generated reports as immutable...

arXiv - AI · 4 min · about 1 month ago

Machine Learning

[2602.22968] Certified Circuits: Stability Guarantees for Mechanistic Circuits

The paper introduces Certified Circuits, a framework that enhances the stability and accuracy of circuit discovery in neural networks, ad...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.22963] FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning

FactGuard introduces an innovative framework for detecting video misinformation using reinforcement learning, enhancing the capabilities ...

arXiv - AI · 3 min · about 1 month ago

LLMs

[2602.22642] Compress the Easy, Explore the Hard: Difficulty-Aware Entropy Regularization for Efficient LLM Reasoning

This paper introduces a novel approach called CEEH, which combines difficulty-aware entropy regularization with reinforcement learning to...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.22633] Tackling Privacy Heterogeneity in Differentially Private Federated Learning

This article presents a novel approach to address privacy heterogeneity in differentially private federated learning (DP-FL), proposing a...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.22822] FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics

FlexMS is a new framework designed for benchmarking deep learning models used in mass spectrum prediction within metabolomics, addressing...

arXiv - Machine Learning · 4 min · about 1 month ago

Machine Learning

[2602.22611] Mitigating Membership Inference in Intermediate Representations via Layer-wise MIA-risk-aware DP-SGD

This paper presents Layer-wise MIA-risk-aware DP-SGD, a method to reduce Membership Inference Attack risks in machine learning models by ...

arXiv - Machine Learning · 4 min · about 1 month ago
