AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

UMKC Announces New Master of Science in Artificial Intelligence
AI Infrastructure

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Machine Learning

Your prompts aren’t the problem — something else is

I keep seeing people focus heavily on prompt optimization. But in practice, a lot of failures I’ve observed don’t come from the prompt it...

Reddit - Artificial Intelligence · 1 min ·
AI Infrastructure

[P] GPU friendly lossless 12-bit BF16 format with 0.03% escape rate and 1 integer ADD decode works for AMD & NVIDIA

Hi everyone : ) I just released a new research prototype It’s a lossless BF16 compression format that stores weights in 12 bits by replac...

Reddit - Machine Learning · 1 min ·

All Content

[2602.23312] Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction
LLMs

This paper evaluates the effectiveness of small language models (SLMs) in leader-follower interactions, comparing zero-shot and one-shot ...

arXiv - Machine Learning · 4 min ·
[2602.23197] Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models
LLMs

This paper explores the impact of fine-tuning on in-context learning in linear attention models, revealing conditions that can enhance or...

arXiv - Machine Learning · 3 min ·
[2602.23167] SettleFL: Trustless and Scalable Reward Settlement Protocol for Federated Learning on Permissionless Blockchains (Extended version)
Machine Learning

SettleFL introduces a scalable and trustless reward settlement protocol for federated learning on permissionless blockchains, addressing ...

arXiv - Machine Learning · 4 min ·
[2602.22760] Distributed LLM Pretraining During Renewable Curtailment Windows: A Feasibility Study
LLMs

This study explores the feasibility of pretraining large language models (LLMs) during renewable energy curtailment periods, aiming to re...

arXiv - AI · 3 min ·
[2602.23079] Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent
LLMs

This article introduces a novel LLM agent designed to assess and mitigate deanonymization risks in textual data using a method called SAL...

arXiv - Machine Learning · 3 min ·
[2602.22752] Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction
LLMs

This article presents a study on the operational validity of using Large Language Models (LLMs) to simulate social media user behavior th...

arXiv - AI · 4 min ·
[2602.22724] AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification
LLMs

AgentSentry introduces a novel framework to mitigate indirect prompt injection (IPI) in LLM agents, enhancing their security while mainta...

arXiv - AI · 4 min ·
[2602.22700] IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation
LLMs

The paper presents IMMACULATE, a framework for auditing large language models (LLMs) using verifiable computation to detect economic devi...

arXiv - AI · 3 min ·
[2602.22925] Beyond NNGP: Large Deviations and Feature Learning in Bayesian Neural Networks
Machine Learning

This paper explores the behavior of wide Bayesian neural networks, focusing on rare fluctuations that influence posterior concentration b...

arXiv - Machine Learning · 3 min ·
[2602.22895] SPD Learn: A Geometric Deep Learning Python Library for Neural Decoding Through Trivialization
Machine Learning

SPD Learn is a new Python library designed for geometric deep learning, specifically for neural decoding using symmetric positive definit...

arXiv - Machine Learning · 3 min ·
[2602.22884] Unsupervised Continual Learning for Amortized Bayesian Inference
Machine Learning

This article presents a novel framework for Unsupervised Continual Learning in Amortized Bayesian Inference, addressing performance issue...

arXiv - Machine Learning · 3 min ·
[2602.22596] BetterScene: 3D Scene Synthesis with Representation-Aligned Generative Model
Machine Learning

BetterScene introduces an innovative approach to 3D scene synthesis, enhancing novel view synthesis quality using sparse photos and a rep...

arXiv - AI · 4 min ·
[2602.22732] Generative Recommendation for Large-Scale Advertising
LLMs

This paper introduces GR4AD, a generative recommendation system designed for large-scale advertising, enhancing ad revenue through innova...

arXiv - Machine Learning · 4 min ·
[2602.22699] DPSQL+: A Differentially Private SQL Library with a Minimum Frequency Rule
Machine Learning

DPSQL+ is a new SQL library designed to enhance data privacy by enforcing differential privacy and a minimum frequency rule, ensuring sen...

arXiv - Machine Learning · 4 min ·
[2602.22647] Vectorizing the Trie: Efficient Constrained Decoding for LLM-based Generative Retrieval on Accelerators
LLMs

The paper presents STATIC, a novel approach for efficient constrained decoding in LLM-based generative retrieval, significantly enhancing...

arXiv - Machine Learning · 4 min ·
[2602.22543] Ruyi2 Technical Report
LLMs

The Ruyi2 Technical Report presents advancements in adaptive computing strategies for Large Language Models (LLMs), focusing on efficienc...

arXiv - AI · 3 min ·
[2602.22547] Towards Dynamic Dense Retrieval with Routing Strategy
Machine Learning

The paper presents a novel approach to dense retrieval called Dynamic Dense Retrieval (DDR), which addresses limitations in adapting mode...

arXiv - Machine Learning · 4 min ·
[2602.22544] HARU-Net: Hybrid Attention Residual U-Net for Edge-Preserving Denoising in Cone-Beam Computed Tomography
Machine Learning

HARU-Net introduces a novel deep learning architecture for denoising cone-beam computed tomography (CBCT) images, enhancing edge preserva...

arXiv - Machine Learning · 4 min ·
[2602.22492] From Shallow Bayesian Neural Networks to Gaussian Processes: General Convergence, Identifiability and Scalable Inference
Machine Learning

This paper explores the convergence of shallow Bayesian neural networks to Gaussian processes, focusing on statistical modeling, identifi...

arXiv - Machine Learning · 3 min ·
[2602.22488] Explainability-Aware Evaluation of Transfer Learning Models for IoT DDoS Detection Under Resource Constraints
Machine Learning

This article evaluates transfer learning models for IoT DDoS detection, focusing on explainability and resource constraints. It analyzes ...

arXiv - AI · 3 min ·