[2604.00419] G-Drift MIA: Membership Inference via Gradient-Induced Feature Drift in LLMs

Computer Science > Machine Learning
arXiv:2604.00419 (cs)
Submitted on 1 Apr 2026

Authors: Ravi Ranjan, Utkarsh Grover, Xiaomin Lin, Agoritsa Polyzou

Abstract: Large language models (LLMs) are trained on massive web-scale corpora, raising growing concerns about privacy and copyright. Membership inference attacks (MIAs) aim to determine whether a given example was used during training. Existing LLM MIAs largely rely on output probabilities or loss values and often perform only marginally better than random guessing when members and non-members are drawn from the same distribution.

We introduce G-Drift MIA, a white-box membership inference method based on gradient-induced feature drift. Given a candidate (x,y), we apply a single targeted gradient-ascent step that increases its loss and measure the resulting changes in internal representations, including logits, hidden-layer activations, and projections onto fixed feature directions, before and after the update. These drift signals are used to train a lightweight logistic classifier that effectively separates members from non-members.

Across multiple transformer-based LLMs and datasets derived from realistic MIA benchmarks, G-Drift substantially outperforms confidence-based,...
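To make the drift measurement concrete, here is a minimal sketch of the idea on a toy model. It is not the authors' implementation: a tiny tanh network stands in for the LLM, and the function names (`forward`, `ascent_step`, `drift_features`), the step size `eta`, the choice to update all weights, and the single projection `direction` are all illustrative assumptions, since the abstract does not specify them.

```python
import math
import random

def forward(W1, W2, x):
    """Toy stand-in for an LLM: hidden h = tanh(W1 x), logits z = W2 h."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    z = [sum(w * hi for w, hi in zip(row, h)) for row in W2]
    return h, z

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def loss(z, y):
    """Cross-entropy loss of logits z for target class y."""
    return -math.log(softmax(z)[y])

def ascent_step(W1, W2, x, y, eta):
    """Return new weights after one gradient-ASCENT step on the loss of (x, y).

    Assumption: every weight is updated; the inputs are not mutated.
    """
    h, z = forward(W1, W2, x)
    p = softmax(z)
    dz = [p[k] - (1.0 if k == y else 0.0) for k in range(len(z))]  # dL/dz
    dh = [sum(dz[k] * W2[k][j] for k in range(len(z))) for j in range(len(h))]
    da = [dh[j] * (1.0 - h[j] ** 2) for j in range(len(h))]  # through tanh
    W1_new = [[W1[j][i] + eta * da[j] * x[i] for i in range(len(x))]
              for j in range(len(h))]
    W2_new = [[W2[k][j] + eta * dz[k] * h[j] for j in range(len(h))]
              for k in range(len(z))]
    return W1_new, W2_new

def drift_features(W1, W2, x, y, direction, eta=0.05):
    """Compare representations of x before and after the ascent step."""
    h0, z0 = forward(W1, W2, x)
    W1_new, W2_new = ascent_step(W1, W2, x, y, eta)
    h1, z1 = forward(W1_new, W2_new, x)
    dh = [b - a for a, b in zip(h0, h1)]
    dz = [b - a for a, b in zip(z0, z1)]
    return {
        "loss_drift": loss(z1, y) - loss(z0, y),
        "logit_drift": math.sqrt(sum(v * v for v in dz)),
        "hidden_drift": math.sqrt(sum(v * v for v in dh)),
        # Projection of the hidden-state change onto one fixed direction.
        "proj_drift": sum(d * v for d, v in zip(direction, dh)),
    }

# Hypothetical usage with random weights (3 inputs, 4 hidden units, 2 classes).
rng = random.Random(0)
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(4)]
W2 = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(2)]
feats = drift_features(W1, W2, x=[0.2, -0.4, 0.7], y=1,
                       direction=[1.0, 0.0, 0.0, 0.0])
print(feats)
```

In the setting the abstract describes, one such feature vector would be computed per candidate, and the vectors for known members and non-members would then be used to fit a logistic classifier. That fitting step is left out here to keep the sketch short.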

Originally published on April 02, 2026. Curated by AI News.
