[2602.12681] Fool Me If You Can: On the Robustness of Binary Code Similarity Detection Models against Semantics-preserving Transformations

arXiv - Machine Learning · 4 min read

Summary

This paper evaluates the robustness of binary code similarity detection models against semantics-preserving transformations, introducing asmFooler to assess model resilience.

Why It Matters

As cybersecurity threats evolve, understanding the limitations of binary code analysis models is crucial. This research highlights vulnerabilities in machine learning approaches, which can inform future developments in secure coding practices and model design.

Key Takeaways

  • Model robustness is influenced by the processing pipeline and feature selection.
  • Adversarial transformations can effectively mislead models with minimal changes.
  • A diverse dataset of binary variants enhances the evaluation of model resilience.

arXiv:2602.12681 (cs.CR, Cryptography and Security) — Submitted on 13 Feb 2026

Title: Fool Me If You Can: On the Robustness of Binary Code Similarity Detection Models against Semantics-preserving Transformations

Authors: Jiyong Uhm, Minseok Kim, Michalis Polychronakis, Hyungjoon Koo

Abstract: Binary code analysis plays an essential role in cybersecurity, facilitating reverse engineering to reveal the inner workings of programs in the absence of source code. Traditional approaches, such as static and dynamic analysis, extract valuable insights from stripped binaries, but often demand substantial expertise and manual effort. Recent advances in deep learning have opened promising opportunities to enhance binary analysis by capturing latent features and disclosing underlying code semantics. Despite the growing number of binary analysis models based on machine learning, their robustness to adversarial code transformations at the binary level remains underexplored. We evaluate the robustness of deep learning models for the task of binary code similarity detection (BCSD) under semantics-preserving transformations. The unique nature of machine instructions presents distinct challenges compared to the typical input perturbations found in other domains. ...
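To make the core idea concrete, the sketch below illustrates what a semantics-preserving transformation looks like and why it can mislead a similarity score. Note this is a toy illustration under stated assumptions: the substitution table and the Jaccard token-overlap metric are hypothetical stand-ins chosen for clarity, not asmFooler's actual algorithm or a real BCSD model.

```python
# Toy illustration: each pair of x86 instructions below computes the same
# result, so swapping one for the other preserves program semantics while
# changing the surface form the model sees. (Illustrative table, not the
# paper's transformation set.)
EQUIVALENT = {
    "xor eax, eax": "mov eax, 0",    # both zero out eax
    "add eax, 1":   "sub eax, -1",   # both increment eax
    "shl eax, 1":   "add eax, eax",  # both double eax
}

def transform(instructions):
    """Apply a semantics-preserving substitution wherever a rule exists."""
    return [EQUIVALENT.get(ins, ins) for ins in instructions]

def jaccard(a, b):
    """Naive token-set similarity, standing in for a BCSD model's score."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)

original = ["push ebp", "xor eax, eax", "add eax, 1", "pop ebp"]
variant = transform(original)

print(jaccard(original, original))  # 1.0 for identical code
print(jaccard(original, variant))   # score drops, yet semantics are unchanged
```

A real BCSD model embeds instruction sequences rather than comparing token sets, but the failure mode is analogous: small, behavior-preserving rewrites shift the representation enough to lower the similarity score, which is exactly the robustness gap the paper evaluates.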
