[2502.01247] Polynomial, trigonometric, and tropical activations

arXiv - Machine Learning March 03, 2026 4 min read

About this article

Abstract page for arXiv paper 2502.01247: Polynomial, trigonometric, and tropical activations

Computer Science > Machine Learning, arXiv:2502.01247 (cs)

Submitted on 3 Feb 2025 (v1), last revised 2 Mar 2026 (this version, v3)

Title: Polynomial, trigonometric, and tropical activations

Authors: Ismail Khalfaoui-Hassani, Stefan Kesselheim

Abstract: Which functions can be used as activations in deep neural networks? This article explores families of functions based on orthonormal bases, including the Hermite polynomial basis and the Fourier trigonometric basis, as well as a basis resulting from the tropicalization of a polynomial basis. Our study shows that, through simple variance-preserving initialization and without additional clamping mechanisms, these activations can successfully be used to train deep models, such as GPT-2 for next-token prediction on OpenWebText and ConvNeXt for image classification on ImageNet. Our work addresses the issue of exploding and vanishing activations and gradients, particularly prevalent with polynomial activations, and opens the door for improving the efficiency of large-scale learning tasks. Furthermore, our approach provides insight into the structure of neural networks, revealing that networks with polynomial activations can be interpreted as multivariate polynomial mappings. Finally, using Hermite interpolation, we show that our activations can closely approximate classical ones in pre-train...
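To make the idea of a variance-preserving activation in an orthonormal basis more concrete, here is a minimal sketch using NumPy. It is not taken from the paper. It relies only on the standard fact that the probabilists' Hermite polynomials He_n, divided by sqrt(n!), are orthonormal under the standard Gaussian, so an activation f(x) = sum_n c_n He_n(x) / sqrt(n!) has second moment sum_n c_n^2 when x ~ N(0, 1). The function names `hermite_activation` and `unit_variance_coeffs`, and the choice of equal weights across degrees, are hypothetical and chosen for illustration only; the authors' actual initialization may differ.

```python
import math

import numpy as np
from numpy.polynomial import hermite_e as He


def hermite_activation(x, coeffs):
    """Evaluate f(x) = sum_n coeffs[n] * He_n(x) / sqrt(n!).

    He_n are the probabilists' Hermite polynomials; dividing by sqrt(n!)
    makes the basis orthonormal under the standard Gaussian measure.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    norms = np.array([math.sqrt(math.factorial(n)) for n in range(len(coeffs))])
    return He.hermeval(x, coeffs / norms)


def unit_variance_coeffs(degree):
    """Hypothetical initialization (an assumption, not the paper's scheme).

    Zero constant term gives zero mean under N(0, 1); equal weights on
    degrees 1..degree with squared sum 1 give unit second moment.
    """
    c = np.zeros(degree + 1)
    c[1:] = 1.0 / math.sqrt(degree)
    return c


if __name__ == "__main__":
    c = unit_variance_coeffs(3)
    # Gauss-Hermite quadrature computes Gaussian expectations of polynomials exactly.
    nodes, weights = He.hermegauss(20)
    weights = weights / weights.sum()
    f = hermite_activation(nodes, c)
    print("mean:", float((weights * f).sum()))
    print("second moment:", float((weights * f * f).sum()))
```

Under these assumptions, a standard-normal input yields an output with zero mean and unit second moment, which is the sense in which such a coefficient choice keeps activations from growing or shrinking at initialization.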

Originally published on March 03, 2026. Curated by AI News.



