
[2602.18880] FOCA: Frequency-Oriented Cross-Domain Forgery Detection, Localization and Explanation via Multi-Modal Large Language Model

arXiv - AI, February 24, 2026

Summary

The paper presents FOCA, a framework for detecting, localizing, and explaining image forgery. It is built on a multimodal large language model that fuses features from the RGB spatial domain and the frequency domain.

Why It Matters

As image tampering techniques evolve, particularly those based on generative models, the integrity of digital media becomes critical for public trust and security. FOCA targets two limitations the authors identify in existing forgery detectors: over-reliance on semantic content at the expense of textural cues, and limited interpretability of subtle low-level tampering traces. Both matter for digital forensics and media verification.

Key Takeaways

FOCA integrates RGB spatial and frequency-domain features through a cross-attention fusion module for forgery detection and localization.
The framework produces explicit, human-interpretable explanations of tampering traces across both domains.
FSE-Set, a new large-scale dataset, provides diverse authentic and tampered images with pixel-level masks and dual-domain annotations.
According to the authors, FOCA outperforms existing methods in both detection performance and interpretability.
The research highlights the value of combining spatial and frequency-domain analysis in image forensics.
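
To make "frequency domain features" concrete, here is a minimal sketch of why a frequency view can expose edits that look unremarkable in pixel space. It is not FOCA's feature extractor, which this summary does not describe. The helper names (dft2, high_freq_ratio), the 8x8 toy patches, and the cutoff value are all assumptions chosen for illustration.

```python
import cmath
import math

def dft2(patch):
    """Naive 2D discrete Fourier transform of a square grayscale patch."""
    n = len(patch)
    out = [[0j] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0j
            for x in range(n):
                for y in range(n):
                    angle = -2 * math.pi * (u * x + v * y) / n
                    total += patch[x][y] * cmath.exp(1j * angle)
            out[u][v] = total
    return out

def high_freq_ratio(patch, cutoff):
    """Fraction of non-DC spectral energy at frequency radius >= cutoff."""
    n = len(patch)
    spec = dft2(patch)
    high = 0.0
    total = 0.0
    for u in range(n):
        for v in range(n):
            if u == 0 and v == 0:
                continue  # skip the DC term (mean brightness)
            # Fold indices so that n - 1 counts as frequency 1, and so on.
            fu, fv = min(u, n - u), min(v, n - v)
            energy = abs(spec[u][v]) ** 2
            total += energy
            if math.hypot(fu, fv) >= cutoff:
                high += energy
    return high / total if total else 0.0

N = 8
# Hypothetical "authentic" patch: a single smooth, low-frequency wave.
smooth = [[math.cos(2 * math.pi * x / N) for y in range(N)] for x in range(N)]
# Hypothetical "tampered" patch: the same wave with a fine checkerboard
# texture pasted into the central 4x4 region.
tampered = [
    [smooth[x][y] + (0.5 * (-1) ** (x + y) if 2 <= x < 6 and 2 <= y < 6 else 0.0)
     for y in range(N)]
    for x in range(N)
]

print(high_freq_ratio(smooth, 3), high_freq_ratio(tampered, 3))
```

The pasted texture shifts a share of the spectral energy toward high frequencies, which is the kind of low-level textural cue a purely semantic detector can miss.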

Computer Science > Computer Vision and Pattern Recognition

arXiv:2602.18880 (cs) [Submitted on 21 Feb 2026]

Title: FOCA: Frequency-Oriented Cross-Domain Forgery Detection, Localization and Explanation via Multi-Modal Large Language Model

Authors: Zhou Liu, Tonghua Su, Hongshi Zhang, Fuxiang Yang, Donglin Di, Yang Song, Lei Fan

Abstract: Advances in image tampering techniques, particularly generative models, pose significant challenges to media verification, digital forensics, and public trust. Existing image forgery detection and localization (IFDL) methods suffer from two key limitations: over-reliance on semantic content while neglecting textural cues, and limited interpretability of subtle low-level tampering traces. To address these issues, we propose FOCA, a multimodal large language model-based framework that integrates discriminative features from both the RGB spatial and frequency domains via a cross-attention fusion module. This design enables accurate forgery detection and localization while providing explicit, human-interpretable cross-domain explanations. We further introduce FSE-Set, a large-scale dataset with diverse authentic and tampered images, pixel-level masks, and dual-domain annotations. Extensive experiments show that FOCA outperforms state-o...
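
The abstract names a cross-attention fusion module but gives no architectural details here, so the following is only a generic sketch of scaled dot-product cross-attention, in which RGB-domain tokens act as queries over frequency-domain tokens. The function names (softmax, cross_attention, fuse), the residual addition, and the absence of learned projection matrices are simplifying assumptions, not the authors' design.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all key/value pairs."""
    d = len(keys[0])
    width = len(values[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(width)]
        )
    return attended

def fuse(rgb_tokens, freq_tokens):
    """Hypothetical fusion step: RGB tokens query frequency tokens, then add residually."""
    attended = cross_attention(rgb_tokens, freq_tokens, freq_tokens)
    return [[r + a for r, a in zip(rt, at)] for rt, at in zip(rgb_tokens, attended)]

# Toy inputs: two RGB-domain tokens and three frequency-domain tokens, width 2.
rgb_tokens = [[1.0, 0.0], [0.0, 1.0]]
freq_tokens = [[0.2, 0.1], [0.9, 0.3], [0.0, 0.8]]
fused = fuse(rgb_tokens, freq_tokens)
print(fused)
```

Each fused token keeps its spatial content and adds a weighted mix of frequency tokens, with the weights set by similarity. A real module would typically apply learned query, key, and value projections and use multiple heads.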

