[2602.22611] Mitigating Membership Inference in Intermediate Representations via Layer-wise MIA-risk-aware DP-SGD
Summary
This paper presents Layer-wise MIA-risk-aware DP-SGD (LM-DP-SGD), a method that reduces Membership Inference Attack risk in machine learning models by adapting the strength of privacy protection across layers according to each layer's vulnerability.
Why It Matters
As machine learning models are increasingly deployed in sensitive applications, protecting against Membership Inference Attacks (MIAs) is crucial. This research introduces an approach that improves the privacy-utility trade-off over uniform protection, addressing a gap in existing DP-SGD implementations.
Key Takeaways
- Introduces LM-DP-SGD, which allocates privacy protection based on layer-specific MIA risks.
- Demonstrates improved privacy-utility trade-off compared to traditional DP-SGD methods.
- Establishes theoretical guarantees for both privacy and convergence of the proposed method.
- Utilizes shadow models to estimate MIA risks effectively.
- Extensive experiments validate the effectiveness of LM-DP-SGD in reducing MIA risks.
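The core allocation idea behind these takeaways can be sketched briefly. Note that everything below (the proportional scaling rule, the `allocate_noise_multipliers` name, and the `base_sigma` parameter) is an illustrative assumption; the summary does not specify the paper's actual allocation scheme:

```python
import numpy as np

def allocate_noise_multipliers(mia_risks, base_sigma=1.0):
    """Hypothetical layer-wise allocation: scale a base DP-SGD noise
    multiplier per layer in proportion to estimated MIA risk.

    mia_risks: per-layer vulnerability scores (higher means the layer's
    intermediate representations are judged to leak more membership
    signal). The normalization keeps the mean multiplier at base_sigma,
    so more vulnerable layers receive more noise and less vulnerable
    layers receive less.
    """
    risks = np.asarray(mia_risks, dtype=float)
    weights = risks / risks.mean()
    return base_sigma * weights

# Example: layer 2 is judged most vulnerable, so it gets the most noise.
sigmas = allocate_noise_multipliers([0.2, 0.5, 0.8], base_sigma=1.0)
# sigmas -> array([0.4, 1.0, 1.6]); mean stays at base_sigma = 1.0
```

In a real DP-SGD training loop these per-layer multipliers would replace the single layer-agnostic noise multiplier when Gaussian noise is added to each layer's clipped gradients.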
Computer Science > Machine Learning
arXiv:2602.22611 (cs)
[Submitted on 26 Feb 2026]
Title: Mitigating Membership Inference in Intermediate Representations via Layer-wise MIA-risk-aware DP-SGD
Authors: Jiayang Meng, Tao Huang, Chen Hou, Guolong Zheng, Hong Chen
Abstract: In Embedding-as-an-Interface (EaaI) settings, pre-trained models are queried for Intermediate Representations (IRs). The distributional properties of IRs can leak training-set membership signals, enabling Membership Inference Attacks (MIAs) whose strength varies across layers. Although Differentially Private Stochastic Gradient Descent (DP-SGD) mitigates such leakage, existing implementations employ per-example gradient clipping and a uniform, layer-agnostic noise multiplier, ignoring heterogeneous layer-wise MIA vulnerability. This paper introduces Layer-wise MIA-risk-aware DP-SGD (LM-DP-SGD), which adaptively allocates privacy protection across layers in proportion to their MIA risk. Specifically, LM-DP-SGD trains a shadow model on a public shadow dataset, extracts per-layer IRs from its train/test splits, and fits layer-specific MIA adversaries, using their attack error rates as MIA-risk estimates. Leveraging the cross-dataset transferability of MIAs, these estimates are then used to reweight each layer's contri...
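The abstract's risk-estimation step (fit a per-layer MIA adversary on shadow train/test IRs and use its error rate as the risk estimate) can be sketched as follows. The threshold-on-norm attacker and all function names here are deliberate simplifications, not the paper's layer-specific adversaries:

```python
import numpy as np

def layer_attack_error(member_irs, nonmember_irs):
    """Crude membership-inference attacker for one layer: threshold on
    the L2 norm of the intermediate representation, with the threshold
    at the midpoint between the member and non-member mean norms.
    Returns the attack error rate, which the abstract uses as the
    layer's MIA-risk estimate (lower error = stronger attack = more
    vulnerable layer)."""
    m = np.linalg.norm(member_irs, axis=1)
    n = np.linalg.norm(nonmember_irs, axis=1)
    thresh = (m.mean() + n.mean()) / 2.0
    # Orient the decision rule toward whichever side the members fall on.
    if m.mean() >= n.mean():
        pred_member = lambda x: x > thresh
    else:
        pred_member = lambda x: x <= thresh
    # Errors: members predicted non-member, non-members predicted member.
    errors = np.concatenate([~pred_member(m), pred_member(n)])
    return float(errors.mean())

def estimate_layer_mia_risks(train_irs_per_layer, test_irs_per_layer):
    """One attack error rate per layer, from shadow train (member) and
    shadow test (non-member) IRs; these estimates would then reweight
    each layer's protection during training."""
    return [layer_attack_error(m, n)
            for m, n in zip(train_irs_per_layer, test_irs_per_layer)]

# Toy example: one layer whose member/non-member IRs separate cleanly,
# so the attacker makes no errors (error rate 0.0 -> highly vulnerable).
members = 2.0 * np.ones((8, 4))
nonmembers = np.ones((8, 4))
risks = estimate_layer_mia_risks([members], [nonmembers])
```

A real pipeline would train the shadow model first and extract IRs from every layer; the cross-dataset transferability claimed in the abstract is what justifies reusing these shadow-derived estimates on the private training set.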