[2505.03530] A Multi-Level Causal Intervention Framework for Mechanistic Interpretability in Variational Autoencoders

[2505.03530] A Multi-Level Causal Intervention Framework for Mechanistic Interpretability in Variational Autoencoders

arXiv - Machine Learning 4 min read

About this article

Abstract page for arXiv paper 2505.03530: A Multi-Level Causal Intervention Framework for Mechanistic Interpretability in Variational Autoencoders

Computer Science > Machine Learning arXiv:2505.03530 (cs) [Submitted on 6 May 2025 (v1), last revised 5 Apr 2026 (this version, v3)] Title:A Multi-Level Causal Intervention Framework for Mechanistic Interpretability in Variational Autoencoders Authors:Dip Roy, Rajiv Misra, Sanjay Kumar Singh, Anisha Roy View a PDF of the paper titled A Multi-Level Causal Intervention Framework for Mechanistic Interpretability in Variational Autoencoders, by Dip Roy and 3 other authors View PDF Abstract:Understanding how generative models represent and transform data is a foundational problem in deep learning interpretability. While mechanistic interpretability of discriminative architectures has yielded substantial insights, relatively little work has addressed variational autoencoders (VAEs). This paper presents the first general-purpose multilevel causal intervention framework for mechanistic interpretability of VAEs. The framework comprises four manipulation types: input manipulation, latent-space perturbation, activation patching, and causal mediation analysis. We also define three new quantitative metrics capturing properties not measured by existing disentanglement metrics alone: Causal Effect Strength (CES), intervention specificity, and circuit modularity. We conduct the largest empirical study to date of VAE causal mechanisms across six architectures (standard VAE, beta-VAE, FactorVAE, beta-TC-VAE, DIP-VAE-II, and VQ-VAE) and five benchmarks (dSprites, 3DShapes, MPI3D, CelebA, and...

Originally published on April 07, 2026. Curated by AI News.

Related Articles

Llms

Qwen3 4B outperforms cloud agents on code tasks—with Mahoraga research [R]

Hey everyone in ML. I've been working on Mahoraga, an open-source orchestrator that routes tasks across local and cloud AI agents using a...

Reddit - Machine Learning · 1 min ·
Machine Learning

Auroch - The Future of AI Memory

Auroch Engine is an external memory layer for AI assistants — designed to give models better long-term recall, personalization, and conte...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Project Aurelia — A 3-model architecture (80B + 13B + 9B) that physically reacts to my real-time heart rate via mmWave radar, spatial awareness via Lidar, and Vibration via Accelerometer. All on a Framework Desktop + eGPU

Hey everyone, I’ve been building a multi-agent system in my spare time, and I just open-sourced the repository. I was getting tired of th...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

Help needed [D]

Heyy guyss... I had made the image dataset and was currently working on its training using the srnet model... I made it train on batches ...

Reddit - Machine Learning · 1 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime