AI Infrastructure

GPUs, training clusters, MLOps, and deployment

Top This Week

AI Infrastructure

[D] thoughts on the controversy about Google's new paper?

OpenReview: https://openreview.net/forum?id=tO3ASKZlok It's sad to see almost no one mention this on Reddit and people are being mean to ...

Reddit - Machine Learning · 1 min · 18 minutes ago

Machine Learning

[D] MXFP8 GEMM: Up to 99% of cuBLAS performance using CUDA + PTX

New blog post by Daniel Vega-Myhre (Meta/PyTorch) illustrating GEMM design for FP8, including deep-dives into all the constraints and des...

Reddit - Machine Learning · 1 min · about 3 hours ago

AI Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min · about 3 hours ago

All Content

LLMs

[2603.19375] Automated Membership Inference Attacks: Discovering MIA Signal Computations using LLM Agents

Abstract page for arXiv paper 2603.19375: Automated Membership Inference Attacks: Discovering MIA Signal Computations using LLM Agents

arXiv - Machine Learning · 3 min · 7 days ago

AI Infrastructure

[2603.19285] Beam-aware Kernelized Contextual Bandits for User Association and Beamforming in mmWave Vehicular Networks

Abstract page for arXiv paper 2603.19285: Beam-aware Kernelized Contextual Bandits for User Association and Beamforming in mmWave Vehicul...

arXiv - Machine Learning · 3 min · 7 days ago

AI Infrastructure

[2603.19277] MOSAIC: Modular Opinion Summarization using Aspect Identification and Clustering

Abstract page for arXiv paper 2603.19277: MOSAIC: Modular Opinion Summarization using Aspect Identification and Clustering

arXiv - Machine Learning · 3 min · 7 days ago

LLMs

[2603.19261] Significance-Gain Pair Encoding for LLMs: A Statistical Alternative to Frequency-Based Subword Merging

Abstract page for arXiv paper 2603.19261: Significance-Gain Pair Encoding for LLMs: A Statistical Alternative to Frequency-Based Subword ...

arXiv - Machine Learning · 3 min · 7 days ago

NLP

[2603.20037] Federated Hyperdimensional Computing for Resource-Constrained Industrial IoT

Abstract page for arXiv paper 2603.20037: Federated Hyperdimensional Computing for Resource-Constrained Industrial IoT

arXiv - Machine Learning · 3 min · 7 days ago

AI Infrastructure

[2603.20036] Continual Learning as Shared-Manifold Continuation Under Compatible Shift

Abstract page for arXiv paper 2603.20036: Continual Learning as Shared-Manifold Continuation Under Compatible Shift

arXiv - Machine Learning · 3 min · 7 days ago

Machine Learning

[2603.20014] AgenticRS-EnsNAS: Ensemble-Decoupled Self-Evolving Architecture Search

Abstract page for arXiv paper 2603.20014: AgenticRS-EnsNAS: Ensemble-Decoupled Self-Evolving Architecture Search

arXiv - Machine Learning · 4 min · 7 days ago

NLP

[2603.20009] A Super Fast K-means for Indexing Vector Embeddings

Abstract page for arXiv paper 2603.20009: A Super Fast K-means for Indexing Vector Embeddings

arXiv - Machine Learning · 3 min · 7 days ago

Machine Learning

[2603.19864] NASimJax: GPU-Accelerated Policy Learning Framework for Penetration Testing

Abstract page for arXiv paper 2603.19864: NASimJax: GPU-Accelerated Policy Learning Framework for Penetration Testing

arXiv - Machine Learning · 4 min · 7 days ago

LLMs

[2603.19742] Dual Path Attribution: Efficient Attribution for SwiGLU-Transformers through Layer-Wise Target Propagation

Abstract page for arXiv paper 2603.19742: Dual Path Attribution: Efficient Attribution for SwiGLU-Transformers through Layer-Wise Target ...

arXiv - Machine Learning · 3 min · 7 days ago

LLMs

[2603.19611] Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL

Abstract page for arXiv paper 2603.19611: Demonstrations, CoT, and Prompting: A Theoretical Analysis of ICL

arXiv - Machine Learning · 4 min · 7 days ago

LLMs

[2603.19360] Warm-Start Flow Matching for Guaranteed Fast Text/Image Generation

Abstract page for arXiv paper 2603.19360: Warm-Start Flow Matching for Guaranteed Fast Text/Image Generation

arXiv - Machine Learning · 4 min · 7 days ago

Machine Learning

[2603.19338] DAPA: Distribution Aware Piecewise Activation Functions for On-Device Transformer Inference and Training

Abstract page for arXiv paper 2603.19338: DAPA: Distribution Aware Piecewise Activation Functions for On-Device Transformer Inference and...

arXiv - Machine Learning · 3 min · 7 days ago

Machine Learning

[2603.19331] FalconBC: Flow matching for Amortized inference of Latent-CONditioned physiologic Boundary Conditions

Abstract page for arXiv paper 2603.19331: FalconBC: Flow matching for Amortized inference of Latent-CONditioned physiologic Boundary Cond...

arXiv - Machine Learning · 3 min · 7 days ago

LLMs

[2603.19296] TTQ: Activation-Aware Test-Time Quantization to Accelerate LLM Inference On The Fly

Abstract page for arXiv paper 2603.19296: TTQ: Activation-Aware Test-Time Quantization to Accelerate LLM Inference On The Fly

arXiv - Machine Learning · 3 min · 7 days ago

LLMs

[2603.18377] PlanTwin: Privacy-Preserving Planning Abstractions for Cloud-Assisted LLM Agents

Abstract page for arXiv paper 2603.18377: PlanTwin: Privacy-Preserving Planning Abstractions for Cloud-Assisted LLM Agents

arXiv - AI · 4 min · 7 days ago

Machine Learning

[2603.18062] S3T-Former: A Purely Spike-Driven State-Space Topology Transformer for Skeleton Action Recognition

Abstract page for arXiv paper 2603.18062: S3T-Former: A Purely Spike-Driven State-Space Topology Transformer for Skeleton Action Recognition

arXiv - AI · 4 min · 7 days ago

LLMs

[2504.09775] Understanding and Optimizing Multi-Stage AI Inference Pipelines

Abstract page for arXiv paper 2504.09775: Understanding and Optimizing Multi-Stage AI Inference Pipelines

arXiv - Machine Learning · 4 min · 7 days ago

Machine Learning

[2502.19095] Cross-site scripting adversarial attacks based on deep reinforcement learning: Evaluation and extension study

Abstract page for arXiv paper 2502.19095: Cross-site scripting adversarial attacks based on deep reinforcement learning: Evaluation and e...

arXiv - AI · 4 min · 7 days ago

Machine Learning

[1709.09051] Exact MAP inference in general higher-order graphical models using linear programming

Abstract page for arXiv paper 1709.09051: Exact MAP inference in general higher-order graphical models using linear programming

arXiv - AI · 4 min · 7 days ago
