[2508.09428] What-Meets-Where: Unified Learning of Action and Contact Localization in Images
Computer Science > Computer Vision and Pattern Recognition

arXiv:2508.09428 (cs)

[Submitted on 13 Aug 2025 (v1), last revised 28 Mar 2026 (this version, v2)]

Title: What-Meets-Where: Unified Learning of Action and Contact Localization in Images

Authors: Yuxiao Wang, Yu Lei, Wolin Liang, Weiying Xue, Zhenao Wei, Nan Zhuang, Qi Liu

Abstract: People control their bodies to establish contact with the environment. To comprehensively understand actions across diverse visual contexts, it is essential to simultaneously consider \textbf{what} action is occurring and \textbf{where} it is happening. Current methodologies, however, often inadequately capture this duality, typically failing to jointly model both action semantics and their spatial contextualization within scenes. To bridge this gap, we introduce a novel vision task that simultaneously predicts high-level action semantics and fine-grained body-part contact regions. Our proposed framework, PaIR-Net, comprises three key components: the Contact Prior Aware Module (CPAM) for identifying contact-relevant body parts, the Prior-Guided Concat Segmenter (PGCS) for pixel-wise contact segmentation, and the Interaction Inference Module (IIM) responsible for integrating global interaction relationships. To facilitate this task, we present PaIR (Part-aware Interact...
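
The abstract only names PaIR-Net's three modules and their roles; it does not specify their internals. The following is a minimal, hypothetical PyTorch sketch of how such a what-meets-where pipeline could be wired together: CPAM produces per-part contact priors, PGCS uses those priors to condition pixel-wise contact segmentation, and IIM classifies the global action. All module internals, dimensions, and the stand-in backbone here are assumptions for illustration, not the paper's actual design.

```python
import torch
import torch.nn as nn


class CPAM(nn.Module):
    """Contact Prior Aware Module (sketch): scores contact-relevant body parts."""

    def __init__(self, feat_dim: int, num_parts: int):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(feat_dim, feat_dim), nn.ReLU(), nn.Linear(feat_dim, num_parts)
        )

    def forward(self, img_feat):  # img_feat: (B, feat_dim) pooled image feature
        # Per-part probability that the part is involved in contact.
        return torch.sigmoid(self.score(img_feat))  # (B, num_parts)


class PGCS(nn.Module):
    """Prior-Guided Concat Segmenter (sketch): per-pixel contact masks,
    conditioned on the part priors by channel-wise concatenation."""

    def __init__(self, feat_dim: int, num_parts: int):
        super().__init__()
        self.head = nn.Conv2d(feat_dim + num_parts, num_parts, kernel_size=1)

    def forward(self, fmap, part_prior):
        # fmap: (B, feat_dim, H, W); part_prior: (B, num_parts)
        B, _, H, W = fmap.shape
        prior_map = part_prior[:, :, None, None].expand(-1, -1, H, W)
        return self.head(torch.cat([fmap, prior_map], dim=1))  # (B, num_parts, H, W) logits


class IIM(nn.Module):
    """Interaction Inference Module (sketch): global action classification."""

    def __init__(self, feat_dim: int, num_actions: int):
        super().__init__()
        self.cls = nn.Linear(feat_dim, num_actions)

    def forward(self, img_feat):
        return self.cls(img_feat)  # (B, num_actions) action logits


class PaIRNetSketch(nn.Module):
    """Hypothetical composition of the three modules; the backbone is a placeholder."""

    def __init__(self, feat_dim: int = 256, num_parts: int = 17, num_actions: int = 80):
        super().__init__()
        self.backbone = nn.Conv2d(3, feat_dim, kernel_size=3, padding=1)  # stand-in
        self.cpam = CPAM(feat_dim, num_parts)
        self.pgcs = PGCS(feat_dim, num_parts)
        self.iim = IIM(feat_dim, num_actions)

    def forward(self, img):  # img: (B, 3, H, W)
        fmap = self.backbone(img)
        pooled = fmap.mean(dim=(2, 3))       # global image feature
        prior = self.cpam(pooled)            # which parts make contact
        masks = self.pgcs(fmap, prior)       # where the contact is, per pixel
        action = self.iim(pooled)            # what action is occurring
        return action, masks


if __name__ == "__main__":
    model = PaIRNetSketch()
    action, masks = model(torch.randn(2, 3, 64, 64))
    print(action.shape, masks.shape)  # torch.Size([2, 80]) torch.Size([2, 17, 64, 64])
```

The sketch mirrors the abstract's split of the problem: "what" comes from a global classification head, while "where" comes from a dense segmentation head that is explicitly conditioned on the contact priors, so part relevance and pixel-level contact are learned jointly rather than in isolation.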