[2507.19418] DEFNet: Multitasks-based Deep Evidential Fusion Network for Blind Image Quality Assessment

Summary

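The paper introduces DEFNet, a multitask-based deep evidential fusion network for blind image quality assessment (BIQA). It trains quality prediction jointly with two auxiliary tasks, scene classification and distortion type classification, and estimates uncertainty with an evidential learning approach built on a normal-inverse gamma distribution mixture.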

Why It Matters

Existing BIQA methods that use auxiliary tasks integrate them insufficiently and lack flexible uncertainty estimation, which the authors say leads to suboptimal performance. DEFNet targets both gaps through multitask optimization and evidential uncertainty estimation, so that a predicted quality score comes with an indication of how far it can be trusted.

Key Takeaways

  • DEFNet optimizes quality assessment jointly with scene and distortion type classification.
  • A trustworthy information fusion strategy first combines features across sub-regions, then fuses local (fine-grained) and global (coarse-grained) information.
  • Uncertainty is estimated with evidential learning based on a normal-inverse gamma distribution mixture.
  • Extensive experiments validate DEFNet's effectiveness on various datasets.
  • The model shows strong generalization capabilities for unseen scenarios.
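
The multitask optimization mentioned in the takeaways can be sketched as a weighted sum of one quality-regression loss and two classification losses. This is a minimal illustration, not the paper's objective: the squared-error term, the weights `w_scene` and `w_dist`, and all function names are assumptions introduced here.

```python
import math


def cross_entropy(logits, target_index):
    """Cross-entropy of one sample, computed from raw logits.

    Uses the log-sum-exp trick so large logits do not overflow.
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target_index]


def multitask_loss(pred_score, true_score,
                   scene_logits, scene_label,
                   dist_logits, dist_label,
                   w_scene=0.1, w_dist=0.1):
    """Hypothetical combined loss: quality regression plus two auxiliary tasks.

    The squared error and the default weights are illustrative assumptions,
    not values taken from the paper.
    """
    quality_term = (pred_score - true_score) ** 2
    scene_term = cross_entropy(scene_logits, scene_label)
    dist_term = cross_entropy(dist_logits, dist_label)
    return quality_term + w_scene * scene_term + w_dist * dist_term


if __name__ == "__main__":
    loss = multitask_loss(0.62, 0.70, [2.0, 0.5, -1.0], 0, [0.1, 1.3], 1)
    print(f"combined loss: {loss:.4f}")
```

Keeping the auxiliary weights small relative to the quality term is one common design choice, so that classification supports the main task without dominating it.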

arXiv:2507.19418 (cs), Computer Science > Computer Vision and Pattern Recognition
Submitted on 25 Jul 2025

Authors: Yiwei Lou, Yuanpeng He, Rongchao Zhang, Yongzhi Cao, Hanpin Wang, Yu Huang

Abstract

Blind image quality assessment (BIQA) methods often incorporate auxiliary tasks to improve performance. However, existing approaches face limitations due to insufficient integration and a lack of flexible uncertainty estimation, leading to suboptimal performance. To address these challenges, we propose a multitasks-based Deep Evidential Fusion Network (DEFNet) for BIQA, which performs multitask optimization with the assistance of scene and distortion type classification tasks.

To achieve a more robust and reliable representation, we design a novel trustworthy information fusion strategy. It first combines diverse features and patterns across sub-regions to enhance information richness, and then performs local-global information fusion by balancing fine-grained details with coarse-grained context.

Moreover, DEFNet exploits an advanced uncertainty estimation technique inspired by evidential learning with the help of a normal-inverse gamma distribution mixture. Extensive experiments on both synthetic and aut...
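
To make the evidential idea concrete, the sketch below shows how a single normal-inverse gamma (NIG) output can be turned into a prediction and two uncertainty values, following the standard deep evidential regression formulation. It covers one NIG component only, since the abstract does not describe how DEFNet builds or combines its mixture. The class `NIG` and the function `nig_uncertainty` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NIG:
    """Parameters of a normal-inverse gamma distribution (illustrative)."""
    gamma: float  # predicted mean of the target (e.g. a quality score)
    nu: float     # evidence for the mean, must be > 0
    alpha: float  # shape of the variance prior, must be > 1 here
    beta: float   # scale of the variance prior, must be > 0


def nig_uncertainty(p: NIG):
    """Return (prediction, aleatoric, epistemic) for one NIG output.

    aleatoric = E[sigma^2] = beta / (alpha - 1)
    epistemic = Var[mu]    = beta / (nu * (alpha - 1))
    """
    if p.nu <= 0 or p.alpha <= 1 or p.beta <= 0:
        raise ValueError("requires nu > 0, alpha > 1, beta > 0")
    aleatoric = p.beta / (p.alpha - 1.0)
    epistemic = p.beta / (p.nu * (p.alpha - 1.0))
    return p.gamma, aleatoric, epistemic


if __name__ == "__main__":
    score, data_noise, model_doubt = nig_uncertainty(
        NIG(gamma=3.0, nu=2.0, alpha=3.0, beta=4.0)
    )
    print(score, data_noise, model_doubt)
```

With gamma=3, nu=2, alpha=3, beta=4 the call returns a prediction of 3.0, aleatoric uncertainty 2.0, and epistemic uncertainty 1.0. Raising nu (more evidence) shrinks only the epistemic term, which is what lets this family of methods separate noise in the data from doubt in the model.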
