Machine Learning

ML algorithms, training, and inference

Top This Week

Machine Learning

[P] Fused MoE Dispatch in Pure Triton: Beating CUDA-Optimized Megablocks at Inference Batch Sizes

I built a fused MoE dispatch kernel in pure Triton that handles the full forward pass for Mixture-of-Experts models. No CUDA, no vendor-s...

Reddit - Machine Learning · 1 min · about 1 hour ago

Machine Learning

[D] ICML Rebuttal Question

I am currently working on my response to the rebuttal acknowledgments for ICML and I am unsure how to handle the strawman argument of that...

Reddit - Machine Learning · 1 min · about 3 hours ago

Machine Learning

[D] ML researcher looking to switch to a product company.

Hey, I am an AI researcher currently working in a deep tech company as a data scientist. Prior to this, I was doing my PhD. My current ro...

Reddit - Machine Learning · 1 min · about 5 hours ago

All Content

Machine Learning

[2603.17112] Cascade-Aware Multi-Agent Routing: Spatio-Temporal Sidecars and Geometry-Switching

Abstract page for arXiv paper 2603.17112: Cascade-Aware Multi-Agent Routing: Spatio-Temporal Sidecars and Geometry-Switching

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2603.18596] Elastic Weight Consolidation Done Right for Continual Learning

Abstract page for arXiv paper 2603.18596: Elastic Weight Consolidation Done Right for Continual Learning

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2603.14824] Planning as Goal Recognition: Deriving Heuristics from Intention Models -- Extended Version

Abstract page for arXiv paper 2603.14824: Planning as Goal Recognition: Deriving Heuristics from Intention Models -- Extended Version

arXiv - AI · 3 min · 12 days ago

Machine Learning

[2603.15033] Rethinking Machine Unlearning: Models Designed to Forget via Key Deletion

Abstract page for arXiv paper 2603.15033: Rethinking Machine Unlearning: Models Designed to Forget via Key Deletion

arXiv - Machine Learning · 4 min · 12 days ago

Machine Learning

[2603.07990] MJ1: Multimodal Judgment via Grounded Verification

Abstract page for arXiv paper 2603.07990: MJ1: Multimodal Judgment via Grounded Verification

arXiv - Machine Learning · 3 min · 12 days ago

LLMs

[2601.12138] DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants

Abstract page for arXiv paper 2601.12138: DriveSafe: A Hierarchical Risk Taxonomy for Safety-Critical LLM-Based Driving Assistants

arXiv - AI · 3 min · 12 days ago

LLMs

[2511.22076] Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in Internet of Agents

Abstract page for arXiv paper 2511.22076: Hybrid Stackelberg Game and Diffusion-based Auction for Two-tier Agentic AI Task Offloading in ...

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2511.07719] Operational machine learning for remote spectroscopic detection of CH$_{4}$ point sources

Abstract page for arXiv paper 2511.07719: Operational machine learning for remote spectroscopic detection of CH$_{4}$ point sources

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2602.01976] FlyPrompt: Brain-Inspired Random-Expanded Routing with Temporal-Ensemble Experts for General Continual Learning

Abstract page for arXiv paper 2602.01976: FlyPrompt: Brain-Inspired Random-Expanded Routing with Temporal-Ensemble Experts for General Co...

arXiv - AI · 4 min · 12 days ago

LLMs

[2601.18858] Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer Language Model

Abstract page for arXiv paper 2601.18858: Representational Homomorphism Predicts and Improves Compositional Generalization In Transformer...

arXiv - AI · 4 min · 12 days ago

LLMs

[2510.05318] BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynamic Interactions

Abstract page for arXiv paper 2510.05318: BIRD-INTERACT: Re-imagining Text-to-SQL Evaluation for Large Language Models via Lens of Dynami...

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2601.14026] Universal Approximation Theorem for Input-Connected Multilayer Perceptrons

Abstract page for arXiv paper 2601.14026: Universal Approximation Theorem for Input-Connected Multilayer Perceptrons

arXiv - Machine Learning · 3 min · 12 days ago

LLMs

[2510.00415] Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration under Validate-by-Reproduce Paradigm

Abstract page for arXiv paper 2510.00415: Towards Self-Evolving Benchmarks: Synthesizing Agent Trajectories via Test-Time Exploration und...

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2601.13698] Does Privacy Always Harm Fairness? Data-Dependent Trade-offs via Chernoff Information Neural Estimation

Abstract page for arXiv paper 2601.13698: Does Privacy Always Harm Fairness? Data-Dependent Trade-offs via Chernoff Information Neural Es...

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2601.09220] From Hawkes Processes to Attention: Time-Modulated Mechanisms for Event Sequences

Abstract page for arXiv paper 2601.09220: From Hawkes Processes to Attention: Time-Modulated Mechanisms for Event Sequences

arXiv - Machine Learning · 3 min · 12 days ago

Machine Learning

[2410.22492] RealCQA-V2: A Diagnostic Benchmark for Structured Visual Entailment over Scientific Charts

Abstract page for arXiv paper 2410.22492: RealCQA-V2: A Diagnostic Benchmark for Structured Visual Entailment over Scientific Charts

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2302.10426] An Accurate and Interpretable Framework for Trustworthy Process Monitoring

Abstract page for arXiv paper 2302.10426: An Accurate and Interpretable Framework for Trustworthy Process Monitoring

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2601.09166] DP-FedSOFIM: Differentially Private Federated Stochastic Optimization using Regularized Fisher Information Matrix

Abstract page for arXiv paper 2601.09166: DP-FedSOFIM: Differentially Private Federated Stochastic Optimization using Regularized Fisher ...

arXiv - Machine Learning · 4 min · 12 days ago

LLMs

[2603.23501] MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage

Abstract page for arXiv paper 2603.23501: MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage

arXiv - AI · 4 min · 12 days ago

Machine Learning

[2512.06737] Arc Gradient Descent: A Geometrically Motivated Gradient Descent-based Optimiser with Phase-Aware, User-Controlled Step Dynamics (proof-of-concept)

Abstract page for arXiv paper 2512.06737: Arc Gradient Descent: A Geometrically Motivated Gradient Descent-based Optimiser with Phase-Awa...

arXiv - AI · 4 min · 12 days ago
