AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

Machine Learning

[P] Fused MoE Dispatch in Pure Triton: Beating CUDA-Optimized Megablocks at Inference Batch Sizes

I built a fused MoE dispatch kernel in pure Triton that handles the full forward pass for Mixture-of-Experts models. No CUDA, no vendor-s...

Reddit - Machine Learning · 1 min ·
Robotics

In Japan, the robot isn't coming for your job; it's filling the one nobody wants | TechCrunch

Driven by labor shortages, Japan is pushing physical AI from pilot projects into real-world deployment.

TechCrunch - AI · 9 min ·
Machine Learning

[P] bitnet-edge: Ternary-weight CNNs ({-1,0,+1}) on MNIST and CIFAR-10, deployed to ESP32-S3 with zero multiplications

I built a pipeline that takes ternary-quantized CNNs from PyTorch training all the way to bare-metal inference on an ESP32-S3 microcontro...

Reddit - Machine Learning · 1 min ·

All Content

Machine Learning

[2602.21647] Mitigating Structural Noise in Low-Resource S2TT: An Optimized Cascaded Nepali-English Pipeline with Punctuation Restoration

This paper presents an optimized cascaded Nepali-English speech-to-text translation system that mitigates structural noise from ASR, enha...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.21452] Adversarial Robustness of Deep Learning-Based Thyroid Nodule Segmentation in Ultrasound

This article evaluates the adversarial robustness of deep learning models for thyroid nodule segmentation in ultrasound images, highlight...

arXiv - AI · 4 min ·
Machine Learning

[2602.21447] Adversarial Intent is a Latent Variable: Stateful Trust Inference for Securing Multimodal Agentic RAG

The paper presents a novel framework, MMA-RAG^T, for enhancing the security of multimodal agentic retrieval-augmented generation systems ...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.21429] Provably Safe Generative Sampling with Constricting Barrier Functions

This paper presents a safety filtering framework for generative models, ensuring generated samples meet hard constraints while minimizing...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.21399] FedVG: Gradient-Guided Aggregation for Enhanced Federated Learning

The paper presents FedVG, a novel gradient-guided aggregation framework for federated learning that enhances model performance by address...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.21374] Small Language Models for Privacy-Preserving Clinical Information Extraction in Low-Resource Languages

This study explores the use of small language models for extracting clinical information from low-resource languages, focusing on a priva...

arXiv - Machine Learning · 4 min ·
Machine Learning

[2602.21379] MrBERT: Modern Multilingual Encoders via Vocabulary, Domain, and Dimensional Adaptation

MrBERT introduces a family of multilingual encoders optimized for various domains, achieving state-of-the-art results in specific tasks w...

arXiv - Machine Learning · 3 min ·
AI Infrastructure

[2602.21368] Black-Box Reliability Certification for AI Agents via Self-Consistency Sampling and Conformal Calibration

This paper presents a method for certifying the reliability of black-box AI systems using self-consistency sampling and conformal calibra...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.21255] A General Equilibrium Theory of Orchestrated AI Agent Systems

This paper presents a general equilibrium theory for orchestrated AI agent systems, modeling large language model (LLM) agents within a p...

arXiv - AI · 4 min ·
AI Safety

[2602.21267] A Systematic Review of Algorithmic Red Teaming Methodologies for Assurance and Security of AI Applications

This systematic review explores automated red teaming methodologies for enhancing the security of AI applications, addressing the limitat...

arXiv - AI · 3 min ·
LLMs

[2602.21251] AgenticTyper: Automated Typing of Legacy Software Projects Using Agentic AI

AgenticTyper is a novel AI-driven tool that automates the typing of legacy JavaScript projects, significantly reducing manual effort and ...

arXiv - AI · 3 min ·
Machine Learning

[2602.21233] AngelSlim: A more accessible, comprehensive, and efficient toolkit for large model compression

AngelSlim introduces a versatile toolkit for large model compression, integrating advanced algorithms for efficient deployment and improv...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.21227] Budget-Aware Agentic Routing via Boundary-Guided Training

The paper presents Budget-Aware Agentic Routing, a method for optimizing the use of large language models in autonomous agents by balanci...

arXiv - AI · 4 min ·
Machine Learning

[2602.21225] Architecture-Agnostic Curriculum Learning for Document Understanding: Empirical Evidence from Text-Only and Multimodal

This paper explores architecture-agnostic curriculum learning for document understanding, demonstrating efficiency gains in training time...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.21224] Make Every Draft Count: Hidden State based Speculative Decoding

The paper presents a novel approach to speculative decoding in large language models (LLMs), focusing on reusing discarded draft tokens t...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.21222] Task-Aware LoRA Adapter Composition via Similarity Retrieval in Vector Databases

This paper presents a novel framework for dynamic LoRA adapter composition using similarity retrieval in vector databases, enabling effic...

arXiv - Machine Learning · 4 min ·
LLMs

[2602.21221] Latent Context Compilation: Distilling Long Context into Compact Portable Memory

The paper introduces Latent Context Compilation, a novel framework that enhances long-context LLM deployment by distilling long contexts ...

arXiv - Machine Learning · 3 min ·
LLMs

[2602.21215] Inference-time Alignment via Sparse Junction Steering

This paper presents Sparse Inference-time Alignment (SIA), a novel approach to enhance alignment in large language models by intervening ...

arXiv - AI · 4 min ·
Machine Learning

[2602.21889] 2-Step Agent: A Framework for the Interaction of a Decision Maker with AI Decision Support

The paper presents the 2-Step Agent framework, which models the interaction between decision makers and AI decision support systems, high...

arXiv - Machine Learning · 3 min ·
Machine Learning

[2602.21746] fEDM+: A Risk-Based Fuzzy Ethical Decision Making Framework with Principle-Level Explainability and Pluralistic Validation

The paper presents fEDM+, an enhanced fuzzy ethical decision-making framework that improves explainability and validation by integrating ...

arXiv - AI · 4 min ·