[2602.17747] AgriVariant: Variant Effect Prediction using DeepChem-Variant for Precision Breeding in Rice
Summary
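The paper presents AgriVariant, an end-to-end pipeline for predicting the effects of genetic variants in rice (Oryza sativa). It combines deep learning-based variant calling (DeepChem-Variant) with RAP-DB gene annotation and deleteriousness scoring, with the aim of reducing the time and cost of variant analysis in precision breeding.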
Why It Matters
This research addresses a critical bottleneck in crop genetics by providing a computational tool for variant effect prediction, which can accelerate the development of climate-resilient rice varieties. It highlights the potential of integrating machine learning with genomics to improve agricultural outcomes.
Key Takeaways
- AgriVariant combines deep learning-based variant calling (DeepChem-Variant) with RAP-DB annotation and Grantham/BLOSUM62 deleteriousness scoring to predict variant effects in rice.
- In an exhaustive mutagenesis study of OsMT-3a, the pipeline analyzed all 1,509 possible single-nucleotide variants in 10 days, compared to years with traditional methods.
- It enables breeders to prioritize variants for validation, reducing costs and time.
- The approach can be extended to any crop species with an available reference genome and gene annotations.
- The study demonstrates the potential of computational tools in enhancing precision agriculture.
arXiv:2602.17747 (q-bio), Quantitative Biology > Genomics
Submitted on 19 Feb 2026
Authors: Ankita Vaishnobi Bisoi, Bharath Ramsundar
Abstract
Predicting functional consequences of genetic variants in crop genes remains a critical bottleneck for precision breeding programs. We present AgriVariant, an end-to-end pipeline for variant-effect prediction in rice (Oryza sativa) that addresses the lack of crop-specific variant-interpretation tools and can be extended to any crop species with available reference genomes and gene annotations. Our approach integrates deep learning-based variant calling (DeepChem-Variant) with custom plant genomics annotation using RAP-DB gene models and database-independent deleteriousness scoring that combines the Grantham distance and the BLOSUM62 substitution matrix. We validate the pipeline through targeted mutations in stress-response genes (OsDREB2a, OsDREB1F, SKC1), demonstrating correct classification of stop-gained, missense, and synonymous variants with appropriate HIGH / MODERATE / LOW impact assignments. An exhaustive mutagenesis study of OsMT-3a analyzed all 1,509 possible single-nucleotide variants in 10 days, identifying 353 high-impact, 447...
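The consequence classes named in the abstract (stop-gained, missense, synonymous) and the idea of exhaustive single-nucleotide mutagenesis can be illustrated with a minimal Python sketch. This is not the AgriVariant or DeepChem-Variant code: the function names `classify_snv` and `all_snvs` are hypothetical, the input is assumed to be an uppercase, in-frame coding sequence, and the fixed mapping from consequence to HIGH / MODERATE / LOW (plus the extra stop-lost class) is an assumption that follows common annotation conventions. The sketch also leaves out start-lost and splice variants, as well as the Grantham/BLOSUM62 scoring the paper applies.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumed impact tier for each consequence class (not taken from the paper).
IMPACT = {
    "stop_gained": "HIGH",
    "stop_lost": "HIGH",
    "missense": "MODERATE",
    "synonymous": "LOW",
}


def classify_snv(cds, pos, alt):
    """Return (consequence, impact) for substituting `alt` at 0-based `pos`."""
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of 3")
    if alt not in BASES or alt == cds[pos]:
        raise ValueError("alt must be one of A, C, G, T and differ from the reference")
    offset = pos % 3
    start = pos - offset
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        consequence = "synonymous"
    elif alt_aa == "*":
        consequence = "stop_gained"
    elif ref_aa == "*":
        consequence = "stop_lost"
    else:
        consequence = "missense"
    return consequence, IMPACT[consequence]


def all_snvs(cds):
    """Yield (pos, ref, alt, consequence, impact) for every possible SNV."""
    for pos, ref in enumerate(cds):
        for alt in BASES:
            if alt != ref:
                yield (pos, ref, alt, *classify_snv(cds, pos, alt))
```

Each position has three alternate bases, so a sequence of n bases yields 3n candidate variants. Tallying the impact field of `all_snvs(cds)`, for example with `collections.Counter`, gives per-tier counts of the kind the abstract reports for OsMT-3a.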