[2602.19025] Routing-Aware Explanations for Mixture of Experts Graph Models in Malware Detection


Summary

This paper presents an approach to malware detection built on Mixture-of-Experts (MoE) graph models, using routing-aware explanations to improve transparency alongside detection accuracy.

Why It Matters

As malware threats evolve, effective detection methods are crucial for cybersecurity. This research enhances the interpretability of machine learning models in malware detection, addressing the need for transparency in AI decision-making processes.

Key Takeaways

  • Introduces a Mixture-of-Experts (MoE) model for malware detection using control flow graphs (CFGs).
  • Demonstrates improved detection accuracy and transparency through routing-aware explanations.
  • Combines multiple neighborhood statistics to enhance node representation in graphs.
  • Evaluates against established GNN baselines (GCN, GIN, GAT), reporting superior performance.
  • Highlights the importance of expert-level diversity in AI model decision-making.
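The "multiple neighborhood statistics" idea can be made concrete with a small sketch. This is an illustrative reconstruction, not the authors' code: `neighborhood_view`, the dense `adj` matrix, and the exact degree-scaling form are assumptions based only on the abstract's description of a pooling choice lambda in {mean, std, max} and a degree-reweighting factor rho.

```python
import numpy as np

def neighborhood_view(adj, x, rho=0.5, lam="mean"):
    """One hypothetical (rho, lambda) view of a graph.

    adj: (N, N) dense adjacency matrix of the CFG.
    x:   (N, F) node feature matrix.
    Pools each node's neighbor features with the statistic `lam`,
    then reweights by degree**rho (assumed scaling form).
    """
    deg = adj.sum(axis=1)                        # node degrees
    pooled = np.zeros_like(x, dtype=float)
    for i in range(adj.shape[0]):
        nbrs = np.flatnonzero(adj[i])            # indices of neighbors
        if nbrs.size == 0:
            continue                             # isolated node: keep zeros
        feats = x[nbrs]
        if lam == "mean":
            pooled[i] = feats.mean(axis=0)
        elif lam == "std":
            pooled[i] = feats.std(axis=0)
        elif lam == "max":
            pooled[i] = feats.max(axis=0)
        else:
            raise ValueError(f"unknown pooling {lam!r}")
    scale = np.power(np.maximum(deg, 1.0), rho)  # degree reweighting
    return pooled * scale[:, None]
```

In the paper's architecture, several such views (one per (rho, lambda) pair) would feed an MLP that fuses them into a single node representation; the fusion step is omitted here.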

Computer Science > Cryptography and Security — arXiv:2602.19025 (cs)
[Submitted on 22 Feb 2026]
Title: Routing-Aware Explanations for Mixture of Experts Graph Models in Malware Detection
Authors: Hossein Shokouhinejad, Roozbeh Razavi-Far, Griffin Higgins, Ali A. Ghorbani

Abstract: Mixture-of-Experts (MoE) offers flexible graph reasoning by combining multiple views of a graph through a learned router. We investigate routing-aware explanations for MoE graph models in malware detection using control flow graphs (CFGs). Our architecture builds diversity at two levels. At the node level, each layer computes multiple neighborhood statistics and fuses them with an MLP, guided by a degree-reweighting factor rho and a pooling choice lambda in {mean, std, max}, producing distinct node representations that capture complementary structural cues in CFGs. At the readout level, six experts, each tied to a specific (rho, lambda) view, output graph-level logits that the router weights into a final prediction. Post-hoc explanations are generated with edge-level attributions per expert and aggregated using the router gates, so the rationale reflects both what each expert highlights and how strongly it is selected. Evaluated against single-expert GNN baselines such as GCN, GIN, and GAT on the same CFG dataset, th...
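The router-gated aggregation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `route_and_explain`, the dense array shapes, and the softmax router are hypothetical; the paper only specifies that router gates weight both the six experts' graph-level logits and their per-expert edge attributions.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    z = np.exp(x - x.max())
    return z / z.sum()

def route_and_explain(expert_logits, expert_edge_attr, router_scores):
    """Gate expert predictions and explanations with router weights.

    expert_logits:    (E, C) graph-level class logits, one row per expert.
    expert_edge_attr: (E, M) edge-level attributions, one row per expert.
    router_scores:    (E,)   raw router scores for this graph.
    """
    gates = softmax(router_scores)         # (E,) routing weights, sum to 1
    logits = gates @ expert_logits         # (C,) gate-weighted prediction
    explanation = gates @ expert_edge_attr # (M,) gate-weighted edge rationale
    return logits, explanation, gates

# Toy example: six experts, binary malware/benign logits, 10 CFG edges.
rng = np.random.default_rng(0)
logits, explanation, gates = route_and_explain(
    rng.normal(size=(6, 2)),
    rng.normal(size=(6, 10)),
    rng.normal(size=6),
)
```

The design point the paper emphasizes is that the same gates serve both outputs, so an edge scores highly in the final rationale only if an expert highlights it *and* the router actually selects that expert.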
