[2601.09768] CLiMB: A Domain-Informed Novelty Detection Clustering Framework for Galactic Archaeology and Scientific Discovery


Summary

The paper presents CLiMB (CLustering in Multiphase Boundaries), a domain-informed clustering framework for novelty detection in galactic archaeology. It first anchors clusters that are already known, then searches the remaining astrophysical data for unknown structures.

Why It Matters

Current semi-supervised clustering algorithms often assume that supervisory signals are globally representative. As a result, they tend to enforce rigid constraints that suppress unanticipated patterns, or they require the number of clusters to be specified in advance, which makes them a poor fit for finding genuinely new structures. CLiMB targets this gap in astrophysical data, where well-characterized phenomena must be classified while novel ones are still being identified.

Key Takeaways

  • CLiMB decouples prior knowledge exploitation from unknown structure exploration.
  • It achieves a high Adjusted Rand Index of 0.829, outperforming existing methods.
  • The framework is data-efficient: its performance improves as the amount of prior knowledge increases.
  • CLiMB successfully isolates distinct dynamical features in unlabelled data.
  • This approach has significant implications for future discoveries in galactic archaeology.
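For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an Adjusted Rand Index is computed from two labelings. It uses only the Python standard library. The function name `adjusted_rand_index` and the toy labels are illustrative assumptions; this is not the paper's evaluation code and does not reproduce the reported 0.829.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items.

    Illustrative helper (hypothetical name), not the paper's evaluation code.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")

    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the labelings trivially agree

    # Pairs that fall together in each contingency cell, row and column.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = sum_true * sum_pred / total_pairs  # agreement expected by chance
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (sum_cells - expected) / (max_index - expected)


# Same partition with renamed labels scores 1.0.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
# A partially matching partition scores between 0 and 1.
print(round(adjusted_rand_index([0, 0, 1, 2], [0, 0, 1, 1]), 4))  # 0.5714
```

The index is 1.0 for identical partitions regardless of label names, close to 0 for chance-level agreement, and can be negative when agreement is worse than chance.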

Astrophysics > Instrumentation and Methods for Astrophysics
arXiv:2601.09768 (astro-ph)
[Submitted on 14 Jan 2026 (v1), last revised 24 Feb 2026 (this version, v2)]

Title: CLiMB: A Domain-Informed Novelty Detection Clustering Framework for Galactic Archaeology and Scientific Discovery

Authors: Lorenzo Monti, Tatiana Muraveva, Brian Sheridan, Davide Massari, Alessia Garofalo, Gisella Clementini, Umberto Michelucci

Abstract: In data-driven scientific discovery, a challenge lies in classifying well-characterized phenomena while identifying novel anomalies. Current semi-supervised clustering algorithms do not always fully address this duality, often assuming that supervisory signals are globally representative. Consequently, methods often enforce rigid constraints that suppress unanticipated patterns or require a pre-specified number of clusters, rendering them ineffective for genuine novelty detection. To bridge this gap, we introduce CLiMB (CLustering in Multiphase Boundaries), a domain-informed framework decoupling the exploitation of prior knowledge from the exploration of unknown structures. Using a sequential two-phase approach, CLiMB first anchors known clusters using metric-adaptive constrained partitioning, and subsequently applies density-based clustering to res...
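The sequential two-phase idea in the abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation. Phase 1 here is a plain nearest-seed assignment with a fixed radius, standing in for the paper's metric-adaptive constrained partitioning. Phase 2 is a textbook DBSCAN run on whatever phase 1 left unlabelled. The names `anchor_known`, `dbscan`, `two_phase`, and the parameters `radius`, `eps`, and `min_pts` are all hypothetical.

```python
from math import dist


def anchor_known(points, seeds, radius):
    """Phase 1 stand-in: nearest known seed within `radius`, else None.

    Assumes `seeds` is non-empty. Not the paper's constrained partitioning.
    """
    labels = []
    for p in points:
        best = min(range(len(seeds)), key=lambda k: dist(p, seeds[k]))
        labels.append(best if dist(p, seeds[best]) <= radius else None)
    return labels


def dbscan(points, eps, min_pts):
    """Textbook DBSCAN: one label per point, -1 for noise."""
    n = len(points)
    # Brute-force neighbourhoods (each point counts as its own neighbour).
    neighbours = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbours[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbours[j]) >= min_pts:
                queue.extend(neighbours[j])  # expand only through core points
    return labels


def two_phase(points, seeds, radius, eps, min_pts):
    """Anchor known clusters first, then density-cluster the residual."""
    labels = anchor_known(points, seeds, radius)
    residual = [i for i, lab in enumerate(labels) if lab is None]
    found = dbscan([points[i] for i in residual], eps, min_pts)
    for i, lab in zip(residual, found):
        # New clusters are numbered after the known ones; noise stays -1.
        labels[i] = -1 if lab == -1 else len(seeds) + lab
    return labels


points = [(0, 0), (0.1, 0), (0, 0.1),   # near the one known cluster
          (5, 5), (5.1, 5), (5, 5.1),   # a dense group with no seed
          (10, 0)]                      # an isolated point
print(two_phase(points, seeds=[(0, 0)], radius=1.0, eps=0.5, min_pts=3))
# [0, 0, 0, 1, 1, 1, -1]
```

In this toy run the seeded group keeps label 0, the unseeded dense group appears as a new cluster 1, and the isolated point is left as noise. Because the second phase only sees the residual, the number of new clusters does not have to be fixed in advance.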
