Top AI Infrastructure This Month

1

Exclusive | Nvidia Plans New Chip to Speed AI Processing, Shake Up Computing Market

AI News - General · 27 days ago

2

AI is gobbling up the world’s memory chips, sending smartphone prices to record highs, report says

A global shortage of memory chips, driven by AI demand, is causing smartphone prices to soar to record highs, with a predicted 14% increase in 2026.

AI News - General · 27 days ago

3

[2601.22669] Beyond Fixed Rounds: Data-Free Early Stopping for Practical Federated Learning

This paper introduces a data-free early stopping framework for federated learning, enhancing efficiency and privacy by eliminating the need for validation data during training.

arXiv - Machine Learning · 28 days ago

4

[2603.22376] AI Co-Scientist for Ranking: Discovering Novel Search Ranking Models alongside LLM-based AI Agents with Cloud Computing Access

arXiv - AI · 2 days ago

5

[2510.26905] Cognition Envelopes for Bounded Decision Making in Autonomous UAS Operations

arXiv - AI · 22 days ago

6

[2602.10195] Versor: A Geometric Sequence Architecture

The paper introduces Versor, a novel geometric sequence architecture that leverages Conformal Geometric Algebra for enhanced performance and interpretability in machine learning tasks.

arXiv - Machine Learning · 28 days ago

7

[D] Edge AI Projects on Jetson Orin – Ideas?

A Reddit user seeks innovative project ideas for deploying AI on NVIDIA Jetson Orin devices, leveraging their experience in machine learning and real-time systems.

Reddit - Machine Learning · 28 days ago

8

NVIDIA stagnant for consumer AI cards... will any company ever compete?

The post discusses NVIDIA's lack of focus on consumer AI GPUs and the implications of its pricing strategy, raising questions about future competition in the market.

Reddit - Artificial Intelligence · 28 days ago

9

[2602.22812] Accelerating Local LLMs on Resource-Constrained Edge Devices via Distributed Prompt Caching

The paper presents a method for enhancing the performance of local large language models (LLMs) on resource-constrained edge devices through distributed prompt caching, significantly reducing infer...

arXiv - Machine Learning · 28 days ago

10

[2602.22936] Generalization Bounds of Stochastic Gradient Descent in Homogeneous Neural Networks

This paper explores generalization bounds for Stochastic Gradient Descent (SGD) in homogeneous neural networks, revealing that slower stepsize decay can enhance optimization under certain conditions.

arXiv - Machine Learning · 28 days ago

11

[2602.22352] GRAU: Generic Reconfigurable Activation Unit Design for Neural Network Hardware Accelerators

The paper presents GRAU, a Generic Reconfigurable Activation Unit designed for neural network hardware accelerators, which significantly reduces hardware costs and enhances efficiency through innov...

arXiv - AI · 28 days ago

12

[2602.22402] Contextual Memory Virtualisation: DAG-Based State Management and Structurally Lossless Trimming for LLM Agents

The paper presents Contextual Memory Virtualisation (CMV), a novel system for managing state in large language models (LLMs) using a Directed Acyclic Graph (DAG) structure to enhance context reuse ...

arXiv - AI · 28 days ago

13

[2602.22895] SPD Learn: A Geometric Deep Learning Python Library for Neural Decoding Through Trivialization

SPD Learn is a new Python library designed for geometric deep learning, specifically for neural decoding using symmetric positive definite matrices, enhancing reproducibility and integration in mac...

arXiv - Machine Learning · 28 days ago

14

[2602.22925] Beyond NNGP: Large Deviations and Feature Learning in Bayesian Neural Networks

This paper explores the behavior of wide Bayesian neural networks, focusing on rare fluctuations that influence posterior concentration beyond Gaussian-process limits. It introduces large-deviation...

arXiv - Machine Learning · 28 days ago

15

[2602.22700] IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation

The paper presents IMMACULATE, a framework for auditing large language models (LLMs) using verifiable computation to detect economic deviations without needing trusted hardware.

arXiv - AI · 28 days ago

16

[2602.22724] AgentSentry: Mitigating Indirect Prompt Injection in LLM Agents via Temporal Causal Diagnostics and Context Purification

AgentSentry introduces a novel framework to mitigate indirect prompt injection (IPI) in LLM agents, enhancing their security while maintaining task performance.

arXiv - AI · 28 days ago

17

[2602.22752] Towards Simulating Social Media Users with LLMs: Evaluating the Operational Validity of Conditioned Comment Prediction

This article presents a study on the operational validity of using Large Language Models (LLMs) to simulate social media user behavior through Conditioned Comment Prediction (CCP).

arXiv - AI · 28 days ago

18

[2602.23079] Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent

This article introduces a novel LLM agent designed to assess and mitigate deanonymization risks in textual data using a method called SALA, which combines stylometric features with LLM reasoning.

arXiv - Machine Learning · 28 days ago

19

[2602.22760] Distributed LLM Pretraining During Renewable Curtailment Windows: A Feasibility Study

This study explores the feasibility of pretraining large language models (LLMs) during renewable energy curtailment periods, aiming to reduce operational emissions and utilize excess clean energy.

arXiv - AI · 28 days ago

20

[2602.23167] SettleFL: Trustless and Scalable Reward Settlement Protocol for Federated Learning on Permissionless Blockchains (Extended version)

SettleFL introduces a scalable and trustless reward settlement protocol for federated learning on permissionless blockchains, addressing cost and efficiency challenges.

arXiv - Machine Learning · 28 days ago

21

[2602.23197] Fine-Tuning Without Forgetting In-Context Learning: A Theoretical Analysis of Linear Attention Models

This paper explores the impact of fine-tuning on in-context learning in linear attention models, revealing conditions that can enhance or degrade performance on downstream tasks.

arXiv - Machine Learning · 28 days ago

22

[2602.23036] LLMServingSim 2.0: A Unified Simulator for Heterogeneous and Disaggregated LLM Serving Infrastructure

LLMServingSim 2.0 introduces a unified simulator for heterogeneous and disaggregated large language model (LLM) serving infrastructures, enhancing performance analysis and system design.

arXiv - AI · 28 days ago

23

[2602.23057] Affine-Scaled Attention: Towards Flexible and Stable Transformer Attention

The paper introduces Affine-Scaled Attention, a novel approach to Transformer attention that enhances flexibility and stability by modifying the normalization process, leading to improved training ...

arXiv - AI · 28 days ago

24

[2506.14261] RL-Obfuscation: Can Language Models Learn to Evade Latent-Space Monitors?

This article explores RL-Obfuscation, a method for training language models to evade latent-space monitors that detect undesirable behaviors, highlighting the vulnerabilities of current monitoring ...

arXiv - Machine Learning · 28 days ago

25

[2507.03772] Skewed Score: A statistical framework to assess autograders

The paper presents a statistical framework for assessing autograders used in evaluating LLM outputs, addressing reliability and bias issues through Bayesian generalized linear models.

arXiv - Machine Learning · 28 days ago

26

[2511.07885] Intelligence per Watt: Measuring Intelligence Efficiency of Local AI

The paper presents a metric called Intelligence per Watt (IPW) to evaluate the efficiency of local AI models compared to centralized cloud systems, highlighting significant improvements in local in...

arXiv - Machine Learning · 28 days ago

27

[2509.26238] Beyond Linear Probes: Dynamic Safety Monitoring for Language Models

This paper presents Truncated Polynomial Classifiers (TPCs) for dynamic safety monitoring in large language models, enhancing efficiency and interpretability in assessing model outputs.

arXiv - Machine Learning · 28 days ago

28

[2602.23334] Bitwise Systolic Array Architecture for Runtime-Reconfigurable Multi-precision Quantized Multiplication on Hardware Accelerators

This paper presents a novel bitwise systolic array architecture designed for runtime-reconfigurable multi-precision quantized multiplication, enhancing performance in neural network accelerators.

arXiv - AI · 28 days ago

29

[2504.13359] Cost-of-Pass: An Economic Framework for Evaluating Language Models

The paper presents an economic framework for evaluating language models by analyzing the tradeoff between performance and inference costs, introducing the concept of cost-of-pass.

arXiv - AI · 28 days ago

30

[2511.05854] Can a Small Model Learn to Look Before It Leaps? Dynamic Learning and Proactive Correction for Hallucination Detection

arXiv - AI · 22 days ago
