Built a demo where an agent can provision 2 GPUs, then gets hard-blocked on the 3rd call
Policy:
- budget = 1000
- each `provision_gpu(a100)` call = 500

Result:
- call 1 -> ALLOW
- call 2 -> ALLOW
- call 3 -> DENY (`B...
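The policy above can be sketched as a pre-call budget gate. This is a minimal illustration, not the demo's actual code; the names `PolicyEngine` and `BudgetExceeded` are assumptions for the sketch.

```python
# Minimal sketch of a hard-block budget policy for agent tool calls.
# PolicyEngine and its API are hypothetical, illustrating the ALLOW/DENY
# behavior described above, not the demo's real implementation.

class PolicyEngine:
    def __init__(self, budget: int):
        self.budget = budget
        self.spent = 0

    def check(self, cost: int) -> str:
        # Hard-block: the verdict is computed *before* the tool call runs,
        # so the third call never executes.
        if self.spent + cost > self.budget:
            return "DENY"
        self.spent += cost
        return "ALLOW"


policy = PolicyEngine(budget=1000)
GPU_COST = 500  # cost of one provision_gpu("a100") call

for call in range(1, 4):
    print(f"call {call} -> {policy.check(GPU_COST)}")
# call 1 -> ALLOW
# call 2 -> ALLOW
# call 3 -> DENY
```

The key design choice is that the check mutates the spend counter only on ALLOW, so a denied call leaves the budget untouched and later, cheaper calls could still succeed.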
We're releasing a paper on a new framework for reading and interpreting the internal cognitive states of large language models: "The Lyra...
Hi all, I made a small tool that I've been using for my own literature reviews and figured I'd share in case it's useful to anyone else. ...
The MCIF benchmark introduces a novel framework for evaluating multimodal crosslingual instruction-following capabilities in large langua...
The paper presents ReplaceMe, a novel method for network simplification that utilizes depth pruning and transformer block linearization, ...
This paper presents a knowledge distillation approach for Multi-View 3D reconstruction, utilizing a teacher-student model framework to en...
This paper presents an adaptive differentially private federated learning framework that addresses challenges in model efficiency and sta...
The paper presents AUTOBUS, an Autonomous Business System that integrates LLM-based AI agents with predicate-logic programming to enhance...
This paper explores causal explanations in image classification, demonstrating their formal properties and computability, while introduci...
The paper presents $\texttt{SPECS}$, a novel method for latency-aware test-time scaling in large language models, achieving improved accur...
The paper discusses a method for embodied AI agents to infer user goals from open-ended dialogues using Large Language Models (LLMs), emp...
The paper presents Sink-Aware Pruning, a novel method for optimizing Diffusion Language Models (DLMs) by identifying and removing unstabl...
The paper discusses the balance between weak and strong verification methods in reasoning with large language models (LLMs), emphasizing ...
This paper outlines a vision for fully autonomous, AI-native particle accelerators, emphasizing AI co-design for optimal performance and ...
The paper presents LORA-CRAFT, a novel parameter-efficient fine-tuning method that utilizes Tucker tensor decomposition on pre-trained at...
Jolt Atlas introduces a zero-knowledge machine learning framework that enhances inference verification through lookup arguments, optimizi...
This paper explores the evolution of web research through generative-retrieval architectures, highlighting the transformative impact of l...
This study presents a taxonomy for fine-grained uncertainty quantification in long-form language model outputs, highlighting effective me...
This paper explores the convergence of two-layer neural networks trained with Gaussian masked inputs, demonstrating linear convergence th...
This paper explores vulnerabilities in embodied AI systems, highlighting the inadequacy of existing analyses focused solely on LLMs or CP...
The paper presents SubQuad, an innovative pipeline for analyzing adaptive immune repertoires, addressing challenges of high computational...
WebFAQ 2.0 introduces a multilingual QA dataset with 198 million FAQ-based question-answer pairs across 108 languages, enhancing multilin...
This paper presents a novel approach to crystal structure prediction by utilizing large language models for fine-grained symmetry inferen...