[2602.18880] FOCA: Frequency-Oriented Cross-Domain Forgery Detection, Localization and Explanation via Multi-Modal Large Language Model

[2602.18880] FOCA: Frequency-Oriented Cross-Domain Forgery Detection, Localization and Explanation via Multi-Modal Large Language Model

arXiv - AI 3 min read Article

Summary

The paper presents FOCA, a novel framework for detecting and localizing image forgery using a multi-modal large language model that integrates spatial and frequency domain features.

Why It Matters

As image tampering techniques evolve, ensuring the integrity of digital media becomes critical for public trust and security. FOCA addresses current limitations in forgery detection by enhancing interpretability and accuracy, which is vital for applications in digital forensics and media verification.

Key Takeaways

  • FOCA integrates RGB spatial and frequency domain features for improved forgery detection.
  • The framework enhances interpretability of tampering traces, making it user-friendly.
  • FSE-Set, a new dataset, supports diverse image analysis for training models.
  • FOCA outperforms existing methods in both detection performance and interpretability.
  • The research highlights the importance of cross-domain analysis in AI applications.

Computer Science > Computer Vision and Pattern Recognition arXiv:2602.18880 (cs) [Submitted on 21 Feb 2026] Title:FOCA: Frequency-Oriented Cross-Domain Forgery Detection, Localization and Explanation via Multi-Modal Large Language Model Authors:Zhou Liu, Tonghua Su, Hongshi Zhang, Fuxiang Yang, Donglin Di, Yang Song, Lei Fan View a PDF of the paper titled FOCA: Frequency-Oriented Cross-Domain Forgery Detection, Localization and Explanation via Multi-Modal Large Language Model, by Zhou Liu and 6 other authors View PDF HTML (experimental) Abstract:Advances in image tampering techniques, particularly generative models, pose significant challenges to media verification, digital forensics, and public trust. Existing image forgery detection and localization (IFDL) methods suffer from two key limitations: over-reliance on semantic content while neglecting textural cues, and limited interpretability of subtle low-level tampering traces. To address these issues, we propose FOCA, a multimodal large language model-based framework that integrates discriminative features from both the RGB spatial and frequency domains via a cross-attention fusion module. This design enables accurate forgery detection and localization while providing explicit, human-interpretable cross-domain explanations. We further introduce FSE-Set, a large-scale dataset with diverse authentic and tampered images, pixel-level masks, and dual-domain annotations. Extensive experiments show that FOCA outperforms state-o...

Related Articles

Llms

TRACER: Learn-to-Defer for LLM Classification with Formal Teacher-Agreement Guarantees

I'm releasing TRACER (Trace-Based Adaptive Cost-Efficient Routing), a library for learning cost-efficient routing policies from LLM trace...

Reddit - Machine Learning · 1 min ·
Mistral AI raises $830M in debt to set up a data center near Paris | TechCrunch
Llms

Mistral AI raises $830M in debt to set up a data center near Paris | TechCrunch

Mistral aims to start operating the data center by the second quarter of 2026.

TechCrunch - AI · 4 min ·
Llms

The Rationing: AI companies are using the "subsidize, addict, extract" playbook — and developers are the product

Anthropic just ran the classic platform playbook on developers: offer generous limits to build dependency, then tighten the screws once t...

Reddit - Artificial Intelligence · 1 min ·
Llms

CLI for Google AI Search (gai.google) — run AI-powered code/tech searches headlessly from your terminal

Google AI (gai.google) gives Gemini-powered answers for technical queries — think AI-enhanced search with code understanding. I built a C...

Reddit - Artificial Intelligence · 1 min ·
More in Llms: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime