[2511.19299] Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning
Computer Science > Machine Learning

arXiv:2511.19299 (cs)

[Submitted on 24 Nov 2025 (v1), last revised 23 Mar 2026 (this version, v2)]

Title: Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning

Authors: James R. M. Black, Moritz S. Hanke, Aaron Maiwald, Tina Hernandez-Boussard, Oliver M. Crook, Jaspreet Pannu

Abstract: Novel deep learning architectures are increasingly being applied to biological data, including genetic sequences. These models, referred to as genomic language models (gLMs), have demonstrated impressive predictive and generative capabilities, raising concerns that such models may also enable misuse, for instance via the generation of genomes for human-infecting viruses. These concerns have catalyzed calls for risk mitigation measures. The de facto mitigation of choice is filtering of pretraining data (i.e., removing viral genomic sequences from training datasets) in order to limit gLM performance on virus-related tasks. However ...
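To make the filtering mitigation concrete, below is a minimal Python sketch of the kind of pretraining-data filter the abstract describes: dropping genomic records whose FASTA headers match viral keywords. The file names, keyword list, and header-matching heuristic are illustrative assumptions, not the authors' actual pipeline.

```python
# Sketch of pretraining-data filtering as described in the abstract:
# drop records whose FASTA headers match a (hypothetical) viral keyword list.
# Paths, keywords, and the matching heuristic are illustrative assumptions.

VIRAL_KEYWORDS = ("virus", "viral", "phage", "coronavirus")  # hypothetical

def read_fasta(path):
    """Yield (header, sequence) records from a FASTA file."""
    header, chunks = None, []
    with open(path) as fh:
        for line in fh:
            line = line.rstrip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
        if header is not None:
            yield header, "".join(chunks)

def filter_viral(records):
    """Keep only records whose headers contain no viral keyword."""
    for header, seq in records:
        lowered = header.lower()
        if not any(kw in lowered for kw in VIRAL_KEYWORDS):
            yield header, seq

if __name__ == "__main__":
    # "pretraining_corpus.fasta" is a hypothetical input path.
    kept = filter_viral(read_fasta("pretraining_corpus.fasta"))
    with open("filtered_corpus.fasta", "w") as out:
        for header, seq in kept:
            out.write(f">{header}\n{seq}\n")
```

In practice, such filters often rely on taxonomy labels or sequence-similarity search rather than header keywords; the keyword match above is only the simplest stand-in.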
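The assessment method named in the title, adversarial fine-tuning, can be sketched as resuming next-token training of an open-weight gLM on viral sequences to probe whether filtered-out capability returns. The checkpoint name, data file, and hyperparameters below are hypothetical placeholders under that assumption; the paper's actual experimental setup is not reproduced here.

```python
# Hedged sketch of adversarial fine-tuning on an open-weight causal LM.
# Checkpoint, data file, and hyperparameters are hypothetical placeholders.

import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "example/open-weight-glm"  # hypothetical checkpoint
tok = AutoTokenizer.from_pretrained(model_name)
if tok.pad_token is None:
    tok.pad_token = tok.eos_token  # ensure padding is defined
model = AutoModelForCausalLM.from_pretrained(model_name)

# Viral genomic sequences, one per line (hypothetical file).
with open("viral_sequences.txt") as fh:
    texts = [line.strip() for line in fh if line.strip()]

enc = tok(texts, truncation=True, max_length=512, padding=True,
          return_tensors="pt")
pairs = list(zip(enc["input_ids"], enc["attention_mask"]))
loader = DataLoader(pairs, batch_size=8, shuffle=True)

opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for epoch in range(3):
    for input_ids, attention_mask in loader:
        # Standard causal-LM objective: labels are the inputs,
        # with padding positions masked out of the loss.
        labels = input_ids.clone()
        labels[attention_mask == 0] = -100
        out = model(input_ids=input_ids,
                    attention_mask=attention_mask,
                    labels=labels)
        out.loss.backward()
        opt.step()
        opt.zero_grad()
```

Tracking held-out loss on virus-related tasks before and after this loop is one way to quantify how robust the data-filtering safeguard is to such an attack.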