AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

Machine Learning

Compile English function descriptions into 22 MB neural programs that run locally [P]

We built a system, ProgramAsWeights (PAW), where a neural compiler takes a plain-English function description and produces a "neural prog...

Reddit - Machine Learning · 1 min ·
AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Machine Learning

ML/AI Engineer laid off from big tech, need your help!

I recently left a very toxic company that was taking a serious toll on my mental and physical health. I gave everything I had and it cost...

Reddit - ML Jobs · 1 min ·

All Content

NLP

[2502.07971] Hierarchical Retrieval at Scale: Bridging Transparency and Efficiency

The paper presents Retreever, a tree-based hierarchical retrieval method that enhances efficiency and transparency in information retriev...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2406.04112] Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

This paper explores compressible dynamics in deep overparameterized low-rank learning, presenting methods to enhance training efficiency ...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2404.08567] CATP: Cross-Attention Token Pruning for Accuracy Preserved Multimodal Model Inference

The paper introduces Cross-Attention Token Pruning (CATP), a method designed to enhance the accuracy of multimodal models by effectively ...

arXiv - AI · 3 min ·
LLMs

[2601.22977] Quantifying Model Uniqueness in Heterogeneous AI Ecosystems

This paper presents a statistical framework for quantifying model uniqueness in heterogeneous AI ecosystems, addressing the challenge of ...

arXiv - AI · 4 min ·
AI Startups

[2601.16909] Preventing the Collapse of Peer Review Requires Verification-First AI

The paper discusses the need for a verification-first approach in AI-assisted peer review to prevent the collapse of the review process, ...

arXiv - AI · 3 min ·
AI Infrastructure

[2512.07841] Impact of Data-Oriented and Object-Oriented Design on Performance and Cache Utilization with Artificial Intelligence Algorithms in Multi-Threaded CPUs

This article analyzes the performance and cache utilization of Data-Oriented Design (DOD) versus Object-Oriented Design (OOD) in multi-th...

arXiv - AI · 4 min ·
LLMs

[2510.19698] RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models

The paper presents RLIE, a framework that integrates large language models (LLMs) with probabilistic rule learning to enhance rule genera...

arXiv - AI · 4 min ·
Machine Learning

[2510.00664] Batch-CAM: Introduction to better reasoning in convolutional deep learning models

The paper introduces Batch-CAM, a training framework for convolutional deep learning models that enhances interpretability by aligning mo...

arXiv - AI · 4 min ·
AI Infrastructure

[2508.07388] Invert4TVG: A Temporal Video Grounding Framework with Inversion Tasks Preserving Action Understanding Ability

The paper presents Invert4TVG, a novel framework for Temporal Video Grounding (TVG) that enhances action understanding through inversion ...

arXiv - AI · 4 min ·
LLMs

[2602.13191] CoPE-VideoLM: Codec Primitives For Efficient Video Language Models

The paper presents CoPE-VideoLM, a novel approach that utilizes codec primitives to enhance the efficiency of video language models, sign...

arXiv - AI · 4 min ·
LLMs

[2602.13165] Asynchronous Verified Semantic Caching for Tiered LLM Architectures

The paper introduces Krites, an asynchronous caching policy for large language models (LLMs) that enhances semantic caching efficiency wh...

arXiv - AI · 4 min ·
Machine Learning

[2602.13061] Diverging Flows: Detecting Extrapolations in Conditional Generation

The paper introduces Diverging Flows, a method for detecting extrapolations in conditional generation models, enhancing safety in applica...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.13035] Look Inward to Explore Outward: Learning Temperature Policy from LLM Internal States via Hierarchical RL

This paper introduces Introspective LLM, a hierarchical reinforcement learning framework that optimizes sampling temperature in large lan...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.13033] Buy versus Build an LLM: A Decision Framework for Governments

This paper presents a strategic framework for governments to decide between buying or building large language models (LLMs) for public se...

arXiv - AI · 4 min ·
LLMs

[2602.12968] RGAlign-Rec: Ranking-Guided Alignment for Latent Query Reasoning in Recommendation Systems

The RGAlign-Rec framework enhances proactive intent prediction in e-commerce chatbots by aligning latent query reasoning with ranking obj...

arXiv - AI · 4 min ·
LLMs

[2602.12962] TriGen: NPU Architecture for End-to-End Acceleration of Large Language Models based on SW-HW Co-Design

The paper presents TriGen, a novel NPU architecture designed for accelerating large language models (LLMs) through software-hardware co-d...

arXiv - AI · 4 min ·
Machine Learning

[2602.12952] Transporting Task Vectors across Different Architectures without Training

The paper introduces 'Theseus,' a novel method for transferring task-specific updates across different model architectures without retrai...

arXiv - Machine Learning · 3 min ·
AI Infrastructure

[2602.12933] Deep-Learning Atlas Registration for Melanoma Brain Metastases: Preserving Pathology While Enabling Cohort-Level Analyses

This article presents a deep-learning framework for registering melanoma brain metastases (MBM) to a common atlas, enhancing cohort-level...

arXiv - AI · 4 min ·
Robotics

[2602.12924] Never say never: Exploring the effects of available knowledge on agent persuasiveness in controlled physiotherapy motivation dialogues

This article examines how the availability of knowledge influences the persuasiveness of generative social agents (GSAs) in physiotherapy...

arXiv - AI · 4 min ·
AI Infrastructure

[2602.12875] A Microservice-Based Platform for Sustainable and Intelligent SLO Fulfilment and Service Management

This article presents CASCA, an open-source microservice-based platform designed to enhance sustainable SLO fulfillment and service manag...

arXiv - AI · 4 min ·