AI Infrastructure

GPUs, training clusters, MLOps, and deployment


Top This Week

AI Infrastructure

Built a demo where an agent can provision 2 GPUs, then gets hard-blocked on the 3rd call

Policy:
- budget = 1000
- each `provision_gpu(a100)` call = 500

Result:
- call 1 -> ALLOW
- call 2 -> ALLOW
- call 3 -> DENY (`B...

Reddit - Artificial Intelligence · 1 min · about 2 hours ago
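The arithmetic behind the budget demo above can be sketched in a few lines. This is a minimal illustration, not the post author's implementation: the `BudgetPolicy` class, its `authorize` method, and the `costs` mapping are hypothetical names introduced here, and only the budget (1000), the per-call cost (500), and the ALLOW/ALLOW/DENY sequence come from the teaser. The denial reason is truncated in the source, so the sketch returns a bare `DENY`.

```python
from dataclasses import dataclass


@dataclass
class BudgetPolicy:
    """Hypothetical budget gate: deny any call that would exceed the budget."""
    budget: int
    costs: dict[str, int]  # cost per action name (assumed structure)
    spent: int = 0

    def authorize(self, action: str) -> str:
        cost = self.costs[action]
        # Hard block: the call is rejected before any spend is recorded.
        if self.spent + cost > self.budget:
            return "DENY"
        self.spent += cost
        return "ALLOW"


# Values taken from the teaser: budget = 1000, each call = 500.
policy = BudgetPolicy(budget=1000, costs={"provision_gpu(a100)": 500})
decisions = [policy.authorize("provision_gpu(a100)") for _ in range(3)]
print(decisions)  # ['ALLOW', 'ALLOW', 'DENY']
```

The second call is allowed because it brings spend to exactly 1000, which does not exceed the budget; the third would bring it to 1500 and is denied without changing the recorded spend.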

LLMs

[R] The Lyra Technique — A framework for interpreting internal cognitive states in LLMs (Zenodo, open access)

We're releasing a paper on a new framework for reading and interpreting the internal cognitive states of large language models: "The Lyra...

Reddit - Machine Learning · 1 min · about 3 hours ago

Machine Learning

[P] citracer: a small CLI tool to trace where a concept comes from in a citation graph

Hi all, I made a small tool that I've been using for my own literature reviews and figured I'd share in case it's useful to anyone else. ...

Reddit - Machine Learning · 1 min · about 3 hours ago

All Content

Machine Learning

[2602.17149] TimeOmni-VL: Unified Models for Time Series Understanding and Generation

TimeOmni-VL introduces a unified framework for time series understanding and generation, overcoming limitations of existing models by int...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.17095] FLoRG: Federated Fine-tuning with Low-rank Gram Matrices and Procrustes Alignment

The paper presents FLoRG, a federated fine-tuning framework that utilizes low-rank Gram matrices and Procrustes alignment to enhance the ...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.17063] Sign Lock-In: Randomly Initialized Weight Signs Persist and Bottleneck Sub-Bit Model Compression

The paper discusses 'Sign Lock-In,' a phenomenon in machine learning where randomly initialized weight signs persist during model trainin...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.16968] DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers

The paper presents DDiT, a novel approach for dynamic patch scheduling in diffusion transformers, enhancing efficiency in image and video...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.16918] Xray-Visual Models: Scaling Vision models on Industry Scale Data

The paper presents Xray-Visual, a novel vision model architecture designed for large-scale image and video understanding, utilizing exten...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.16833] VAM: Verbalized Action Masking for Controllable Exploration in RL Post-Training -- A Chess Case Study

The paper presents Verbalized Action Masking (VAM), a novel method for enhancing exploration in reinforcement learning (RL) post-training...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.16802] References Improve LLM Alignment in Non-Verifiable Domains

This paper explores how reference-guided evaluators can enhance LLM alignment in non-verifiable domains, demonstrating significant improv...

arXiv - Machine Learning · 4 min · about 2 months ago

Machine Learning

[2602.16736] The Compute ICE-AGE: Invariant Compute Envelope under Addressable Graph Evolution

The paper presents a deterministic semantic state substrate for AI, demonstrating a novel compute envelope that maintains performance acr...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.16719] GPU-Accelerated Algorithms for Graph Vector Search: Taxonomy, Empirical Study, and Research Directions

This paper presents a comprehensive survey of GPU-accelerated algorithms for graph vector search, detailing optimization strategies and e...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.17560] ODESteer: A Unified ODE-Based Steering Framework for LLM Alignment

The paper presents ODESteer, a novel ODE-based framework for aligning large language models (LLMs) by addressing limitations in existing ...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.17529] Enhancing Large Language Models (LLMs) for Telecom using Dynamic Knowledge Graphs and Explainable Retrieval-Augmented Generation

This article presents a novel framework, KG-RAG, that enhances large language models (LLMs) for telecom applications by integrating dynam...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.17508] Pareto Optimal Benchmarking of AI Models on ARM Cortex Processors for Sustainable Embedded Systems

This article presents a benchmarking framework for optimizing AI models on ARM Cortex processors, focusing on energy efficiency and perfo...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.17442] WarpRec: Unifying Academic Rigor and Industrial Scale for Responsible, Reproducible, and Efficient Recommendation

WarpRec presents a high-performance framework for recommender systems, merging academic rigor with industrial scalability, while promotin...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.17385] Dataless Weight Disentanglement in Task Arithmetic via Kronecker-Factored Approximate Curvature

This paper presents a novel dataless approach to disentangling task vectors in task arithmetic using Kronecker-Factored Approximate Curva...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.17386] Visual Model Checking: Graph-Based Inference of Visual Routines for Image Retrieval

The paper presents a novel framework integrating formal verification with deep learning for improved image retrieval, addressing the limi...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.17245] Web Verbs: Typed Abstractions for Reliable Task Composition on the Agentic Web

The paper introduces 'Web Verbs', a set of typed abstractions designed to improve task composition on the Agentic Web, enhancing reliabil...

arXiv - AI · 4 min · about 2 months ago

LLMs

[2602.17288] ArXiv-to-Model: A Practical Study of Scientific LM Training

This article presents a detailed study on training a 1.36B-parameter scientific language model from raw arXiv LaTeX sources, focusing on ...

arXiv - AI · 3 min · about 2 months ago

LLMs

[2602.17221] From Labor to Collaboration: A Methodological Experiment Using AI Agents to Augment Research Perspectives in Taiwan's Humanities and Social Sciences

This article explores a methodological experiment using AI agents to enhance research in Taiwan's humanities and social sciences, proposi...

arXiv - AI · 4 min · about 2 months ago

Machine Learning

[2602.17189] Texo: Formula Recognition within 20M Parameters

The paper presents Texo, a compact formula recognition model with 20 million parameters, achieving high performance comparable to larger ...

arXiv - AI · 3 min · about 2 months ago

Machine Learning

[2602.17145] Bonsai: A Framework for Convolutional Neural Network Acceleration Using Criterion-Based Pruning

The paper introduces Bonsai, a framework for accelerating Convolutional Neural Networks (CNNs) through criterion-based pruning, demonstra...

arXiv - AI · 3 min · about 2 months ago

