AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Machine Learning

Your prompts aren’t the problem — something else is

I keep seeing people focus heavily on prompt optimization. But in practice, a lot of failures I’ve observed don’t come from the prompt it...

Reddit - Artificial Intelligence · 1 min ·
AI Infrastructure

[P] GPU friendly lossless 12-bit BF16 format with 0.03% escape rate and 1 integer ADD decode works for AMD & NVIDIA

Hi everyone :) I just released a new research prototype. It’s a lossless BF16 compression format that stores weights in 12 bits by replac...

Reddit - Machine Learning · 1 min ·

All Content

LLMs

[2602.22217] RAGdb: A Zero-Dependency, Embeddable Architecture for Multimodal Retrieval-Augmented Generation on the Edge

The paper presents RAGdb, a novel architecture for Retrieval-Augmented Generation (RAG) that simplifies multimodal data processing by eli...

arXiv - AI · 4 min ·
LLMs

[2602.23111] PRAC: Principal-Random Subspace for LLM Activation Compression and Memory-Efficient Training

The paper presents PRAC, a novel method for compressing activations in large language models, achieving significant memory savings while ...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.23330] Toward Expert Investment Teams: A Multi-Agent LLM System with Fine-Grained Trading Tasks

This article presents a multi-agent LLM framework for financial trading, emphasizing fine-grained task decomposition to enhance decision-...

arXiv - AI · 4 min ·
Machine Learning

[2602.23315] Invariant Transformation and Resampling based Epistemic-Uncertainty Reduction

This article presents a novel approach to reducing epistemic uncertainty in AI models through invariant transformation and resampling tec...

arXiv - AI · 3 min ·
Machine Learning

[2602.23035] Learning Disease-Sensitive Latent Interaction Graphs From Noisy Cardiac Flow Measurements

This paper presents a novel framework for modeling cardiac blood flow patterns using disease-sensitive latent interaction graphs, enhanci...

arXiv - Machine Learning · 4 min ·
AI Infrastructure

[2602.23271] Evaluating Stochasticity in Deep Research Agents

This paper evaluates the stochasticity in Deep Research Agents (DRAs), highlighting how variability in their outputs can impact research ...

arXiv - AI · 4 min ·
LLMs

[2602.23248] Mitigating Legibility Tax with Decoupled Prover-Verifier Games

This paper presents a novel approach to mitigate the 'legibility tax' in large language models by decoupling the prover-verifier game, al...

arXiv - AI · 3 min ·
Machine Learning

[2602.22936] Generalization Bounds of Stochastic Gradient Descent in Homogeneous Neural Networks

This paper explores generalization bounds for Stochastic Gradient Descent (SGD) in homogeneous neural networks, revealing that slower ste...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.22911] NoRA: Breaking the Linear Ceiling of Low-Rank Adaptation via Manifold Expansion

The paper introduces NoRA, a novel approach to Low-Rank Adaptation (LoRA) that overcomes the limitations of linear methods by utilizing m...

arXiv - Machine Learning · 3 min ·
AI Infrastructure

[2602.22882] Fair feature attribution for multi-output prediction: a Shapley-based perspective

This article presents a Shapley-based framework for fair feature attribution in multi-output prediction, addressing the limitations of ex...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.23193] ESAA: Event Sourcing for Autonomous Agents in LLM-Based Software Engineering

The paper presents ESAA, an architecture for autonomous agents using event sourcing to enhance state management and execution in LLM-base...

arXiv - AI · 4 min ·
LLMs

[2602.22831] Moral Preferences of LLMs Under Directed Contextual Influence

This paper explores how contextual influences affect the moral decision-making of large language models (LLMs) in scenarios akin to troll...

arXiv - AI · 4 min ·
LLMs

[2602.22812] Accelerating Local LLMs on Resource-Constrained Edge Devices via Distributed Prompt Caching

The paper presents a method for enhancing the performance of local large language models (LLMs) on resource-constrained edge devices thro...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.22973] Modeling Expert AI Diagnostic Alignment via Immutable Inference Snapshots

The paper presents a framework for improving AI diagnostic alignment in clinical settings by preserving AI-generated reports as immutable...

arXiv - AI · 4 min ·
Machine Learning

[2602.22968] Certified Circuits: Stability Guarantees for Mechanistic Circuits

The paper introduces Certified Circuits, a framework that enhances the stability and accuracy of circuit discovery in neural networks, ad...

arXiv - AI · 3 min ·
LLMs

[2602.22963] FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning

FactGuard introduces an innovative framework for detecting video misinformation using reinforcement learning, enhancing the capabilities ...

arXiv - AI · 3 min ·
LLMs

[2602.22642] Compress the Easy, Explore the Hard: Difficulty-Aware Entropy Regularization for Efficient LLM Reasoning

This paper introduces a novel approach called CEEH, which combines difficulty-aware entropy regularization with reinforcement learning to...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.22633] Tackling Privacy Heterogeneity in Differentially Private Federated Learning

This article presents a novel approach to address privacy heterogeneity in differentially private federated learning (DP-FL), proposing a...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.22822] FlexMS is a flexible framework for benchmarking deep learning-based mass spectrum prediction tools in metabolomics

FlexMS is a new framework designed for benchmarking deep learning models used in mass spectrum prediction within metabolomics, addressing...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.22611] Mitigating Membership Inference in Intermediate Representations via Layer-wise MIA-risk-aware DP-SGD

This paper presents Layer-wise MIA-risk-aware DP-SGD, a method to reduce Membership Inference Attack risks in machine learning models by ...

arXiv - Machine Learning · 4 min ·