[2602.22903] PSQE: A Theoretical-Practical Approach to Pseudo Seed Quality Enhancement for Unsupervised MMEA
Summary
The paper presents PSQE (Pseudo-Seed Quality Enhancement), a method for improving pseudo seed quality in unsupervised multimodal entity alignment (MMEA), a structured data integration task that supports large language model applications.
Why It Matters
Unsupervised MMEA removes the need for labeled seed pairs, which are difficult to obtain, but it depends on pseudo seeds whose coverage of the knowledge graph tends to become imbalanced once multimodal information is incorporated. Improving the precision and coverage balance of those seeds strengthens the structured data integration that large language model applications build on.
Key Takeaways
- PSQE improves the precision and graph coverage balance of pseudo seeds, using multimodal information and clustering-resampling.
- The method addresses imbalances in graph coverage that hinder learning in sparse regions.
- Experimental results show significant performance improvements over baseline models.
- Theoretical analysis shows that pseudo seeds influence both the attraction and the repulsion terms in contrastive learning at once.
- PSQE can be integrated as a plug-and-play module in existing systems.
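The coverage-balancing idea in the takeaways can be illustrated with a minimal sketch. This is not the paper's procedure, which this summary does not describe in detail; it only shows one simple way to cap how many pseudo seeds any single graph region contributes. The function name `balanced_resample`, the `cluster_of` mapping (assumed to be precomputed by some clustering step), and the toy data are all hypothetical.

```python
from collections import defaultdict


def balanced_resample(seeds, cluster_of, per_cluster):
    """Keep at most `per_cluster` highest-scoring pseudo seeds in each cluster.

    seeds: list of (source_entity, target_entity, similarity_score) tuples
    cluster_of: dict mapping source_entity -> cluster id (assumed precomputed)
    per_cluster: cap on the number of seeds kept per cluster
    """
    # Group pseudo seeds by the cluster of their source entity.
    by_cluster = defaultdict(list)
    for seed in seeds:
        by_cluster[cluster_of[seed[0]]].append(seed)

    # Within each cluster, keep only the most confident pairs.
    kept = []
    for cluster_id in sorted(by_cluster):
        ranked = sorted(by_cluster[cluster_id], key=lambda s: s[2], reverse=True)
        kept.extend(ranked[:per_cluster])
    return kept


# Toy data (hypothetical): cluster 0 is dense in pseudo seeds, cluster 1 is sparse.
seeds = [
    ("a1", "b1", 0.95), ("a2", "b2", 0.90), ("a3", "b3", 0.85),
    ("a4", "b4", 0.80), ("a5", "b5", 0.60),
]
cluster_of = {"a1": 0, "a2": 0, "a3": 0, "a4": 0, "a5": 1}
balanced = balanced_resample(seeds, cluster_of, per_cluster=2)
```

In this toy run the dense cluster is trimmed to its two most confident pairs while the single seed in the sparse cluster is retained, so the sparse region is no longer outnumbered four to one. Because the function only filters a seed list, a step like this could sit in front of an existing alignment model without changing the model itself, which is the sense in which such a module is plug-and-play.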
Computer Science > Information Retrieval
arXiv:2602.22903 (cs)
[Submitted on 26 Feb 2026]
Title: PSQE: A Theoretical-Practical Approach to Pseudo Seed Quality Enhancement for Unsupervised MMEA
Authors: Yunpeng Hong, Chenyang Bu, Jie Zhang, Yi He, Di Wu, Xindong Wu
Abstract: Multimodal Entity Alignment (MMEA) aims to identify equivalent entities across different data modalities, enabling structural data integration that in turn improves the performance of various large language model applications. To lift the requirement of labeled seed pairs that are difficult to obtain, recent methods shifted to an unsupervised paradigm using pseudo-alignment seeds. However, unsupervised entity alignment in multimodal settings remains underexplored, mainly because the incorporation of multimodal information often results in imbalanced coverage of pseudo-seeds within the knowledge graph. To overcome this, we propose PSQE (Pseudo-Seed Quality Enhancement) to improve the precision and graph coverage balance of pseudo seeds via multimodal information and clustering-resampling. Theoretical analysis reveals the impact of pseudo seeds on existing contrastive learning-based MMEA models. In particular, pseudo seeds can influence the attraction and the repulsion terms in contrastive learning at once, whereas imb...
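The attraction and repulsion terms mentioned in the abstract can be made concrete with a generic InfoNCE-style contrastive loss. The form below is a standard textbook objective, not necessarily the one analyzed in the paper; the symbols $S$ (the pseudo seed set), $z_i$ (the embedding of entity $i$), $\mathrm{sim}$ (a similarity function), and $\tau$ (a temperature) are assumptions introduced here for illustration.

```latex
\mathcal{L}
= -\frac{1}{|S|} \sum_{(i,j) \in S}
  \log \frac{\exp\!\big(\mathrm{sim}(z_i, z_j)/\tau\big)}
            {\sum_{k} \exp\!\big(\mathrm{sim}(z_i, z_k)/\tau\big)}
= \frac{1}{|S|} \sum_{(i,j) \in S}
  \Big[
    \underbrace{-\,\mathrm{sim}(z_i, z_j)/\tau}_{\text{attraction}}
    \;+\;
    \underbrace{\log \sum_{k} \exp\!\big(\mathrm{sim}(z_i, z_k)/\tau\big)}_{\text{repulsion}}
  \Big]
```

Under this form, each pseudo seed pair $(i,j)$ fixes which embeddings are pulled together. If the candidates $k$ are drawn from the other seed entities in a batch, as is common with in-batch negatives, the seed set also determines which entities are pushed apart, so seed quality would plausibly affect both terms at once.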