[2602.18250] Variational Distributional Neuron
Summary
The paper introduces the concept of a Variational Distributional Neuron, a compute unit that incorporates uncertainty in its operations, challenging traditional deterministic neuron models.
Why It Matters
This research proposes a novel approach to neural computation by integrating probabilistic elements directly into neuron functionality. This could enhance the interpretability and stability of machine learning models, particularly in sequential generation tasks, where uncertainty plays a crucial role.
Key Takeaways
- Proposes a compute unit that operates as a distribution rather than a deterministic scalar.
- Addresses the need for explicit representation of uncertainty in neural computations.
- Explores the implications of probabilistic constraints on neural network architecture.
- Suggests that neurons can carry contextual information and temporal persistence through local constraints.
- Challenges existing paradigms by questioning the deterministic nature of traditional neuron models.
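The takeaways above can be made concrete with a per-unit objective. The paper does not spell out its equations in the abstract, so the following is a sketch in standard VAE notation (the symbols $q_{\phi_i}$, $p_\theta$, and the weight $\beta_i$ are our assumptions, not the paper's): each neuron $i$ carries its own amortized posterior and is regularized by the KL term of a local ELBO,

```latex
\mathcal{L}_i(x) =
  \underbrace{\mathbb{E}_{z_i \sim q_{\phi_i}(z_i \mid x)}
    \bigl[ \log p_{\theta}(x \mid z_i) \bigr]}_{\text{local reconstruction term}}
  \;-\;
  \underbrace{\beta_i \, D_{\mathrm{KL}}\!\bigl( q_{\phi_i}(z_i \mid x) \,\big\|\, p(z_i) \bigr)}_{\text{local KL constraint}}
```

Under this reading, tuning $\beta_i$ per unit would control how much contextual information the neuron retains versus how strongly it is contracted toward its prior, matching the "locally tuned by distinct constraints" claim.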
Computer Science > Machine Learning
arXiv:2602.18250 (cs)
[Submitted on 20 Feb 2026]
Title: Variational Distributional Neuron
Authors: Yves Ruffenach
Abstract: We propose a proof of concept for a variational distributional neuron: a compute unit formulated as a VAE brick, explicitly carrying a prior, an amortized posterior, and a local ELBO. The unit is no longer a deterministic scalar but a distribution: computing is no longer about propagating values, but about contracting a continuous space of possibilities under constraints. Each neuron parameterizes a posterior, propagates a reparameterized sample, and is regularized by the KL term of a local ELBO; hence, the activation is distributional. This "contraction" becomes testable through local constraints and can be monitored via internal measures. The amount of contextual information carried by the unit, as well as the temporal persistence of this information, are locally tuned by distinct constraints. This proposal addresses a structural tension: in sequential generation, causality is predominantly organized in the symbolic space and, even when latents exist, they often remain auxiliary, while the effective dynamics are carried by a largely deterministic decoder. In parallel, probabilistic latent models capture factors of variation and uncertainty, but that uncertainty typically remains borne by global or parametri...
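The abstract's mechanism — a neuron that parameterizes a posterior, emits a reparameterized sample, and is regularized by a KL term against its prior — can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the class name, the Gaussian posterior with linear mean/log-variance heads, and the standard-normal prior are all our assumptions.

```python
import math
import random

class VariationalNeuron:
    """Sketch of a 'variational distributional neuron': the unit
    parameterizes a Gaussian posterior q(z|x) = N(mu(x), sigma(x)^2),
    propagates a reparameterized sample, and exposes the KL term of a
    local ELBO against a standard-normal prior p(z) = N(0, 1)."""

    def __init__(self, n_inputs, seed=0):
        rng = random.Random(seed)
        # Separate linear heads for the posterior mean and log-variance.
        self.w_mu = [rng.gauss(0.0, 0.1) for _ in range(n_inputs)]
        self.w_logvar = [rng.gauss(0.0, 0.1) for _ in range(n_inputs)]
        self.b_mu = 0.0
        self.b_logvar = 0.0

    def posterior(self, x):
        """Amortized posterior parameters (mu, log sigma^2) for input x."""
        mu = sum(w * xi for w, xi in zip(self.w_mu, x)) + self.b_mu
        log_var = sum(w * xi for w, xi in zip(self.w_logvar, x)) + self.b_logvar
        return mu, log_var

    def forward(self, x, eps=None):
        """Reparameterized sample z = mu + sigma * eps with eps ~ N(0, 1),
        so the activation is a draw from a distribution, not a fixed scalar."""
        mu, log_var = self.posterior(x)
        if eps is None:
            eps = random.gauss(0.0, 1.0)
        return mu + math.exp(0.5 * log_var) * eps

    def kl_to_prior(self, x):
        """Closed-form KL(q(z|x) || N(0,1)) = 0.5*(mu^2 + sigma^2 - log sigma^2 - 1);
        this is the local constraint that 'contracts' the unit toward its prior."""
        mu, log_var = self.posterior(x)
        return 0.5 * (mu * mu + math.exp(log_var) - log_var - 1.0)
```

In use, `forward` replaces a deterministic activation with a stochastic one, and `kl_to_prior` supplies the per-neuron regularizer that a local ELBO would sum with a reconstruction term.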