[2508.01277] Foundation Models for Bioacoustics -- a Comparative Review
Computer Science > Sound
arXiv:2508.01277 (cs)
[Submitted on 2 Aug 2025 (v1), last revised 29 Mar 2026 (this version, v2)]

Title: Foundation Models for Bioacoustics -- a Comparative Review
Authors: Raphael Schwinger, Paria Vali Zadeh, Lukas Rauch, Mats Kurz, Tom Hauschild, Sam Lapp, Sven Tomforde

Abstract: Automated bioacoustic analysis is essential for biodiversity monitoring and conservation, requiring advanced deep learning models that can adapt to diverse bioacoustic tasks. This article presents a comprehensive review of large-scale pretrained bioacoustic foundation models and systematically investigates their transferability across multiple bioacoustic classification tasks. We overview bioacoustic representation learning by analysing pretraining data sources and benchmarks. On this basis, we review bioacoustic foundation models, dissecting the models' training data, preprocessing, augmentations, architecture, and training paradigm. Additionally, we conduct an extensive empirical study of selected models on the BEANS and BirdSet benchmarks, evaluating generalisability under linear and attentive probing. Our experimental analysis reveals that Perch 2.0 achieves the highest BirdSet score (restricted evaluation) and the strongest linear probing result on BEANS, building on diverse multi-taxa supervised pretraining; that BirdMAE is the b...
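The linear-probing protocol mentioned in the abstract keeps the pretrained backbone frozen and trains only a linear classifier on its embeddings. A minimal sketch of this idea, using synthetic data in place of a real bioacoustic model's embeddings (the cluster layout and dimensions here are illustrative assumptions, not from the paper):

```python
# Linear probing sketch: the foundation-model backbone is frozen, so each
# audio clip is reduced to a fixed embedding vector; only a linear
# classifier (the "probe") is fitted on top of those embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for frozen-model embeddings of labelled clips: two Gaussian
# clusters in a hypothetical 128-d embedding space (illustrative only).
n_clips, dim = 200, 128
labels = rng.integers(0, 2, size=n_clips)
embeddings = rng.normal(size=(n_clips, dim)) + 3.0 * labels[:, None]

# The probe is the only trainable component; its accuracy measures how
# linearly separable the task classes are in the frozen embedding space.
probe = LogisticRegression(max_iter=1000).fit(embeddings, labels)
accuracy = probe.score(embeddings, labels)
```

Attentive probing follows the same frozen-backbone recipe but replaces the linear head with a small attention-based pooling layer, which can weight informative frames before classification.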