AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

Machine Learning

Compile English function descriptions into 22 MB neural programs that run locally [P]

We built a system, ProgramAsWeights (PAW), where a neural compiler takes a plain-English function description and produces a "neural prog...

Reddit - Machine Learning · 1 min · 15 minutes ago

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 3 hours ago

Machine Learning

ML/AI Engineer laid off from big tech, need your help!

I recently left a very toxic company that was taking a serious toll on my mental and physical health. I gave everything I had and it cost...

Reddit - ML Jobs · 1 min · about 8 hours ago

All Content

NLP

[2502.07971] Hierarchical Retrieval at Scale: Bridging Transparency and Efficiency

The paper presents Retreever, a tree-based hierarchical retrieval method that enhances efficiency and transparency in information retriev...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2406.04112] Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

This paper explores compressible dynamics in deep overparameterized low-rank learning, presenting methods to enhance training efficiency ...

arXiv - Machine Learning · 4 min · 2 months ago

Machine Learning

[2404.08567] CATP: Cross-Attention Token Pruning for Accuracy Preserved Multimodal Model Inference

The paper introduces Cross-Attention Token Pruning (CATP), a method designed to enhance the accuracy of multimodal models by effectively ...

arXiv - AI · 3 min · 2 months ago

LLMs

[2601.22977] Quantifying Model Uniqueness in Heterogeneous AI Ecosystems

This paper presents a statistical framework for quantifying model uniqueness in heterogeneous AI ecosystems, addressing the challenge of ...

arXiv - AI · 4 min · 2 months ago

AI Startups

[2601.16909] Preventing the Collapse of Peer Review Requires Verification-First AI

The paper discusses the need for a verification-first approach in AI-assisted peer review to prevent the collapse of the review process, ...

arXiv - AI · 3 min · 2 months ago

AI Infrastructure

[2512.07841] Impact of Data-Oriented and Object-Oriented Design on Performance and Cache Utilization with Artificial Intelligence Algorithms in Multi-Threaded CPUs

This article analyzes the performance and cache utilization of Data-Oriented Design (DOD) versus Object-Oriented Design (OOD) in multi-th...

arXiv - AI · 4 min · 2 months ago

LLMs

[2510.19698] RLIE: Rule Generation with Logistic Regression, Iterative Refinement, and Evaluation for Large Language Models

The paper presents RLIE, a framework that integrates large language models (LLMs) with probabilistic rule learning to enhance rule genera...

arXiv - AI · 4 min · 2 months ago

Machine Learning

[2510.00664] Batch-CAM: Introduction to better reasoning in convolutional deep learning models

The paper introduces Batch-CAM, a training framework for convolutional deep learning models that enhances interpretability by aligning mo...

arXiv - AI · 4 min · 2 months ago

AI Infrastructure

[2508.07388] Invert4TVG: A Temporal Video Grounding Framework with Inversion Tasks Preserving Action Understanding Ability

The paper presents Invert4TVG, a novel framework for Temporal Video Grounding (TVG) that enhances action understanding through inversion ...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.13191] CoPE-VideoLM: Codec Primitives For Efficient Video Language Models

The paper presents CoPE-VideoLM, a novel approach that utilizes codec primitives to enhance the efficiency of video language models, sign...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.13165] Asynchronous Verified Semantic Caching for Tiered LLM Architectures

The paper introduces Krites, an asynchronous caching policy for large language models (LLMs) that enhances semantic caching efficiency wh...

arXiv - AI · 4 min · 2 months ago

Machine Learning

[2602.13061] Diverging Flows: Detecting Extrapolations in Conditional Generation

The paper introduces Diverging Flows, a method for detecting extrapolations in conditional generation models, enhancing safety in applica...

arXiv - Machine Learning · 3 min · 2 months ago

LLMs

[2602.13035] Look Inward to Explore Outward: Learning Temperature Policy from LLM Internal States via Hierarchical RL

This paper introduces Introspective LLM, a hierarchical reinforcement learning framework that optimizes sampling temperature in large lan...

arXiv - Machine Learning · 3 min · 2 months ago

LLMs

[2602.13033] Buy versus Build an LLM: A Decision Framework for Governments

This paper presents a strategic framework for governments to decide between buying or building large language models (LLMs) for public se...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.12968] RGAlign-Rec: Ranking-Guided Alignment for Latent Query Reasoning in Recommendation Systems

The RGAlign-Rec framework enhances proactive intent prediction in e-commerce chatbots by aligning latent query reasoning with ranking obj...

arXiv - AI · 4 min · 2 months ago

LLMs

[2602.12962] TriGen: NPU Architecture for End-to-End Acceleration of Large Language Models based on SW-HW Co-Design

The paper presents TriGen, a novel NPU architecture designed for accelerating large language models (LLMs) through software-hardware co-d...

arXiv - AI · 4 min · 2 months ago

Machine Learning

[2602.12952] Transporting Task Vectors across Different Architectures without Training

The paper introduces 'Theseus,' a novel method for transferring task-specific updates across different model architectures without retrai...

arXiv - Machine Learning · 3 min · 2 months ago

AI Infrastructure

[2602.12933] Deep-Learning Atlas Registration for Melanoma Brain Metastases: Preserving Pathology While Enabling Cohort-Level Analyses

This article presents a deep-learning framework for registering melanoma brain metastases (MBM) to a common atlas, enhancing cohort-level...

arXiv - AI · 4 min · 2 months ago

Robotics

[2602.12924] Never say never: Exploring the effects of available knowledge on agent persuasiveness in controlled physiotherapy motivation dialogues

This article examines how the availability of knowledge influences the persuasiveness of generative social agents (GSAs) in physiotherapy...

arXiv - AI · 4 min · 2 months ago

AI Infrastructure

[2602.12875] A Microservice-Based Platform for Sustainable and Intelligent SLO Fulfilment and Service Management

This article presents CASCA, an open-source microservice-based platform designed to enhance sustainable SLO fulfillment and service manag...

arXiv - AI · 4 min · 2 months ago
