[2510.08919] PHyCLIP: $\ell_1$-Product of Hyperbolic Factors Unifies Hierarchy and Compositionality in Vision-Language Representation Learning

arXiv - Machine Learning · March 03, 2026

About this article

arXiv:2510.08919 (cs), Computer Science > Computer Vision and Pattern Recognition
Submitted on 10 Oct 2025 (v1); last revised 2 Mar 2026 (this version, v2)

Title: PHyCLIP: $\ell_1$-Product of Hyperbolic Factors Unifies Hierarchy and Compositionality in Vision-Language Representation Learning

Authors: Daiki Yoshikawa, Takashi Matsubara

Abstract: Vision-language models have achieved remarkable success in multi-modal representation learning from large-scale pairs of visual scenes and linguistic descriptions. However, they still struggle to simultaneously express two distinct types of semantic structures: the hierarchy within a concept family (e.g., dog $\preceq$ mammal $\preceq$ animal) and the compositionality across different concept families (e.g., "a dog in a car" $\preceq$ dog, car). Recent works have addressed this challenge by employing hyperbolic space, which efficiently captures tree-like hierarchy, yet its suitability for representing compositionality remains unclear. To resolve this dilemma, we propose PHyCLIP, which employs an $\ell_1$-Product metric on a Cartesian product of Hyperbolic factors. With our design, intra-family hierarchies emerge within individual hyperbolic factors, and cross-family composition is captured by the $\ell_1$-pro...
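To make the idea of an $\ell_1$-product metric on a Cartesian product of hyperbolic factors concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes each factor is a Poincaré ball of curvature -1 (the abstract does not say which hyperbolic model or curvature PHyCLIP uses), and the function names `poincare_distance` and `l1_product_distance` are hypothetical. The only property it illustrates is that the distance between two product-space points is the sum of the per-factor hyperbolic distances.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit (Poincare) ball.

    Assumption: curvature -1 and the Poincare model; the paper may use a
    different hyperbolic model or a learned curvature.
    """
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def l1_product_distance(x, y):
    """l1-product distance: the sum of hyperbolic distances over all factors.

    x and y are sequences of points, one point per hyperbolic factor.
    """
    if len(x) != len(y):
        raise ValueError("both points need the same number of factors")
    return sum(poincare_distance(xi, yi) for xi, yi in zip(x, y))


# Two factors, each a 2-dimensional ball (illustrative values only).
x = [(0.5, 0.0), (0.0, 0.0)]
y = [(0.0, 0.0), (0.0, 0.5)]
print(l1_product_distance(x, y))
```

In this toy example each factor contributes a distance of ln 3 (the distance from the origin to a point at Euclidean radius 0.5), so the total is 2 ln 3. Because the factors are summed rather than combined under a single square root, a difference confined to one factor changes the total by exactly that factor's distance, which is the sense in which each factor can be read independently.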

Originally published on March 03, 2026. Curated by AI News.
