[2603.00160] DINOv3 Meets YOLO26 for Weed Detection in Vegetable Crops
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.00160 (cs)
[Submitted on 25 Feb 2026]

Title: DINOv3 Meets YOLO26 for Weed Detection in Vegetable Crops
Authors: Boyang Deng, Yuzhen Lu

Abstract: Developing robust models for precision vegetable weeding is currently constrained by the scarcity of large-scale, annotated weed-crop datasets. To address this limitation, this study proposes a foundational crop-weed detection model by integrating heterogeneous datasets and leveraging self-supervised learning. A total of 618,642 crop-weed images were initially collected and subsequently refined to 199,388 filtered images for fine-tuning a DINOv3 vision transformer (ViT-small) through a sequential curation strategy. The fine-tuned DINOv3 backbone was then integrated into YOLO26, serving either as a primary backbone or part of a dual-backbone architecture. A feature alignment loss was introduced in the dual backbone framework to enhance feature fusion with minimal computational overhead. Experimental results show that the proposed DINOv3-finetuned ViT-small-based YOLO26-large achieved up to a +5.4% mAP50 gain on in-domain images collected in the 2025 season. Moreover, it demonstrated strong cross-domain generalization with mAP50 improvements of +14.0% on the 2021-2023 season dataset and +11.9% on the 2024 season dataset, ...
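
The abstract mentions a feature alignment loss for the dual-backbone variant but does not give its form. As a rough illustration only, the sketch below assumes a cosine-distance formulation between paired feature vectors from the two backbones; the function name `cosine_alignment_loss` and the choice of cosine distance are hypothetical and are not taken from the paper.

```python
import math


def cosine_alignment_loss(cnn_feats, vit_feats, eps=1e-12):
    """Mean (1 - cosine similarity) over paired feature vectors.

    ASSUMPTION: cosine distance is an illustrative choice; the paper's
    actual alignment loss is not specified in the abstract.
    cnn_feats, vit_feats: equal-length lists of equal-dimension float lists,
    e.g. one vector per spatial location after projecting both backbones
    to a shared channel dimension.
    """
    if len(cnn_feats) != len(vit_feats) or not cnn_feats:
        raise ValueError("feature lists must be non-empty and equal in length")

    total = 0.0
    for a, b in zip(cnn_feats, vit_feats):
        if len(a) != len(b):
            raise ValueError("paired vectors must share a dimension")
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        # eps guards against division by zero for all-zero vectors
        total += 1.0 - dot / (norm_a * norm_b + eps)
    return total / len(cnn_feats)
```

Under this assumption the loss is close to 0 when the two backbones produce features pointing in the same direction and grows toward 2 as they point in opposite directions, so adding it to the detection loss would nudge the two feature streams toward agreement before fusion.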