Machine Learning

ML algorithms, training, and inference

Top This Week

Machine Learning

[P] Fused MoE Dispatch in Pure Triton: Beating CUDA-Optimized Megablocks at Inference Batch Sizes

I built a fused MoE dispatch kernel in pure Triton that handles the full forward pass for Mixture-of-Experts models. No CUDA, no vendor-s...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] ICML Rebuttal Question

I am currently working on my response to the rebuttal acknowledgments for ICML and I am doubting how to handle the strawman argument of that...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] ML researcher looking to switch to a product company.

Hey, I am an AI researcher currently working in a deep tech company as a data scientist. Prior to this, I was doing my PhD. My current ro...

Reddit - Machine Learning · 1 min ·

All Content

[2603.17112] Cascade-Aware Multi-Agent Routing: Spatio-Temporal Sidecars and Geometry-Switching
Machine Learning

Abstract page for arXiv paper 2603.17112: Cascade-Aware Multi-Agent Routing: Spatio-Temporal Sidecars and Geometry-Switching

arXiv - AI · 4 min ·
[2603.18596] Elastic Weight Consolidation Done Right for Continual Learning
Machine Learning

Abstract page for arXiv paper 2603.18596: Elastic Weight Consolidation Done Right for Continual Learning

arXiv - AI · 4 min ·
[2603.14824] Planning as Goal Recognition: Deriving Heuristics from Intention Models -- Extended Version
Machine Learning

Abstract page for arXiv paper 2603.14824: Planning as Goal Recognition: Deriving Heuristics from Intention Models -- Extended Version

arXiv - AI · 3 min ·
[2603.15033] Rethinking Machine Unlearning: Models Designed to Forget via Key Deletion
Machine Learning

Abstract page for arXiv paper 2603.15033: Rethinking Machine Unlearning: Models Designed to Forget via Key Deletion

arXiv - Machine Learning · 4 min ·
[2603.07990] MJ1: Multimodal Judgment via Grounded Verification
Machine Learning

Abstract page for arXiv paper 2603.07990: MJ1: Multimodal Judgment via Grounded Verification

arXiv - Machine Learning · 3 min ·
[2601.12138] DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants
LLMs

Abstract page for arXiv paper 2601.12138: DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants

arXiv - AI · 3 min ·
[2511.22076] Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in Internet of Agents
LLMs

Abstract page for arXiv paper 2511.22076: Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in ...

arXiv - AI · 4 min ·
[2511.07719] Operational machine learning for remote spectroscopic detection of CH$_{4}$ point sources
Machine Learning

Abstract page for arXiv paper 2511.07719: Operational machine learning for remote spectroscopic detection of CH$_{4}$ point sources

arXiv - AI · 4 min ·
[2602.01976] FlyPrompt: Brain-Inspired Random-Expanded Routing with Temporal-Ensemble Experts for General Continual Learning
Machine Learning

Abstract page for arXiv paper 2602.01976: FlyPrompt: Brain-Inspired Random-Expanded Routing with Temporal-Ensemble Experts for General Co...

arXiv - AI · 4 min ·
[2601.18858] Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer Language Model
LLMs

Abstract page for arXiv paper 2601.18858: Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer...

arXiv - AI · 4 min ·
[2510.05318] BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions
LLMs

Abstract page for arXiv paper 2510.05318: BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynami...

arXiv - AI · 4 min ·
[2601.14026] Universal Approximation Theorem for Input-Connected Multilayer Perceptrons
Machine Learning

Abstract page for arXiv paper 2601.14026: Universal Approximation Theorem for Input-Connected Multilayer Perceptrons

arXiv - Machine Learning · 3 min ·
[2510.00415] Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm
LLMs

Abstract page for arXiv paper 2510.00415: Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration und...

arXiv - AI · 4 min ·
[2601.13698] Does Privacy Always Harm Fairness? Data-Dependent Trade-offs via Chernoff Information Neural Estimation
Machine Learning

Abstract page for arXiv paper 2601.13698: Does Privacy Always Harm Fairness? Data-Dependent Trade-offs via Chernoff Information Neural Es...

arXiv - AI · 4 min ·
[2601.09220] From Hawkes Processes to Attention: Time-Modulated Mechanisms for Event Sequences
Machine Learning

Abstract page for arXiv paper 2601.09220: From Hawkes Processes to Attention: Time-Modulated Mechanisms for Event Sequences

arXiv - Machine Learning · 3 min ·
[2410.22492] RealCQA-V2: A Diagnostic Benchmark for Structured Visual Entailment over Scientific Charts
Machine Learning

Abstract page for arXiv paper 2410.22492: RealCQA-V2: A Diagnostic Benchmark for Structured Visual Entailment over Scientific Charts

arXiv - AI · 4 min ·
[2302.10426] An Accurate and Interpretable Framework for Trustworthy Process Monitoring
Machine Learning

Abstract page for arXiv paper 2302.10426: An Accurate and Interpretable Framework for Trustworthy Process Monitoring

arXiv - AI · 4 min ·
[2601.09166] DP-FedSOFIM: Differentially Private Federated Stochastic Optimization using Regularized Fisher Information Matrix
Machine Learning

Abstract page for arXiv paper 2601.09166: DP-FedSOFIM: Differentially Private Federated Stochastic Optimization using Regularized Fisher ...

arXiv - Machine Learning · 4 min ·
[2603.23501] MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage
LLMs

Abstract page for arXiv paper 2603.23501: MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage

arXiv - AI · 4 min ·
[2512.06737] Arc Gradient Descent: A Geometrically Motivated Gradient Descent-based Optimiser with Phase-Aware, User-Controlled Step Dynamics (proof-of-concept)
Machine Learning

Abstract page for arXiv paper 2512.06737: Arc Gradient Descent: A Geometrically Motivated Gradient Descent-based Optimiser with Phase-Awa...

arXiv - AI · 4 min ·