[2602.13712] Fine-tuned Vision Language Model for Localization of Parasitic Eggs in Microscopic Images

Summary

This paper presents a Vision Language Model (VLM), Microsoft Florence, fine-tuned to localize parasitic eggs in microscopic images. Its preliminary results show better performance than other object detection methods such as EfficientDet.

Why It Matters

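The research addresses the diagnosis of soil-transmitted helminth (STH) infections, which affect a large proportion of the global population, particularly in tropical and sub-tropical regions. Manual microscopic examination remains the diagnostic gold standard, but it is labour-intensive, time-consuming, and prone to human error. Automating the localization of parasitic eggs could make diagnosis more accurate and efficient, which in turn could improve public health outcomes.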

Key Takeaways

  • In preliminary results, the fine-tuned VLM reaches a mean Intersection over Union (mIOU) of 0.94, outperforming other object detection methods such as EfficientDet.
  • Automating parasitic egg localization can reduce human error and increase diagnostic efficiency.
  • This model has potential applications in regions with limited access to specialized diagnostic expertise.
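
For readers unfamiliar with the metric, here is a minimal sketch of how a mean IoU over bounding boxes can be computed. The summary does not state how the authors pair predictions with ground truth or which box format they use, so this assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max) and a one-to-one pairing that has already been made. The names `box_iou` and `mean_iou` are illustrative, not taken from the paper.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(a[0], b[0])
    iy_min = max(a[1], b[1])
    ix_max = min(a[2], b[2])
    iy_max = min(a[3], b[3])

    # Width and height are clamped to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_iou(preds: Sequence[Box], targets: Sequence[Box]) -> float:
    """Average IoU over already-matched (prediction, ground truth) pairs."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    if not preds:
        return 0.0
    return sum(box_iou(p, t) for p, t in zip(preds, targets)) / len(preds)


if __name__ == "__main__":
    # Made-up boxes for illustration only, not data from the paper.
    predicted = [(10, 10, 50, 50), (60, 60, 100, 100)]
    truth = [(10, 10, 50, 50), (60, 60, 100, 110)]
    print(f"mIoU = {mean_iou(predicted, truth):.2f}")
```

An IoU of 1.0 means a predicted box coincides exactly with its ground-truth box, and 0.0 means no overlap, so a mean of 0.94 indicates that predicted boxes overlap the annotated eggs very closely on average.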

Computer Science > Computer Vision and Pattern Recognition

arXiv:2602.13712 (cs) [Submitted on 14 Feb 2026]

Title: Fine-tuned Vision Language Model for Localization of Parasitic Eggs in Microscopic Images

Authors: Chan Hao Sien, Hezerul Abdul Karim, Nouar AlDahoul

Abstract: Soil-transmitted helminth (STH) infections continuously affect a large proportion of the global population, particularly in tropical and sub-tropical regions, where access to specialized diagnostic expertise is limited. Although manual microscopic diagnosis of parasitic eggs remains the diagnostic gold standard, the approach can be labour-intensive, time-consuming, and prone to human error. This paper aims to utilize a vision language model (VLM) such as Microsoft Florence that was fine-tuned to localize all parasitic eggs within microscopic images. The preliminary results show that our localization VLM performs comparatively better than the other object detection methods, such as EfficientDet, with an mIOU of 0.94. This finding demonstrates the potential of the proposed VLM to serve as a core component of an automated framework, offering a scalable engineering solution for intelligent parasitological diagnosis.

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Cite as: arXiv:2602.13712 [cs.CV] (or arXiv:2...
