[2506.08618] HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals

[2506.08618] HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals

arXiv - Machine Learning 4 min read

About this article

Abstract page for arXiv paper 2506.08618: HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals

Computer Science > Machine Learning arXiv:2506.08618 (cs) [Submitted on 10 Jun 2025 (v1), last revised 5 Mar 2026 (this version, v3)] Title:HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals Authors:Xianquan Yan, Hakan Akgün, Kenji Kawaguchi, N. Duane Loh, Ching Hua Lee View a PDF of the paper titled HSG-12M: A Large-Scale Benchmark of Spatial Multigraphs from the Energy Spectra of Non-Hermitian Crystals, by Xianquan Yan and 4 other authors View PDF HTML (experimental) Abstract:AI is transforming scientific research by revealing new ways to understand complex physical systems, but its impact remains constrained by the lack of large, high-quality domain-specific datasets. A rich, largely untapped resource lies in non-Hermitian quantum physics, where the energy spectra of crystals form intricate geometries on the complex plane -- termed as Hamiltonian spectral graphs. Despite their significance as fingerprints for electronic behavior, their systematic study has been intractable due to the reliance on manual extraction. To unlock this potential, we introduce Poly2Graph: a high-performance, open-source pipeline that automates the mapping of 1-D crystal Hamiltonians to spectral graphs. Using this tool, we present HSG-12M: a dataset containing 11.6 million static and 5.1 million dynamic Hamiltonian spectral graphs across 1401 characteristic-polynomial classes, distilled from 177 TB of spectral potential data. Crucially, HSG-...

Originally published on March 06, 2026. Curated by AI News.

Related Articles

Llms

[P] I built an autonomous ML agent that runs experiments on tabular data indefinitely - inspired by Karpathy's AutoResearch

Inspired by Andrej Karpathy's AutoResearch, I built a system where Claude Code acts as an autonomous ML researcher on tabular binary clas...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] Data curation and targeted replacement as a pre-training alignment and controllability method

Hi, r/MachineLearning: has much research been done in large-scale training scenarios where undesirable data has been replaced before trai...

Reddit - Machine Learning · 1 min ·
Machine Learning

[P] I tested Meta’s brain-response model on posts. It predicted the Elon one almost perfectly.

I built an experimental UI and visualization layer around Meta’s open brain-response model just to see whether this stuff actually works ...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] Why does it seem like open source materials on ML are incomplete? this is not enough...

Many times when I try to deeply understand a topic in machine learning — whether it's a new architecture, a quantization method, a full t...

Reddit - Machine Learning · 1 min ·
More in Data Science: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime