[2602.22213] Enriching Taxonomies Using Large Language Models

arXiv - AI 3 min read Article

Summary

The paper presents Taxoria, a novel pipeline that enhances existing taxonomies using Large Language Models (LLMs), addressing issues of limited coverage and outdated nodes.

Why It Matters

Taxonomies are crucial for effective knowledge retrieval across domains. By leveraging LLMs to extend and refresh them, Taxoria improves their coverage and relevance, which in turn strengthens information organization and retrieval in fields such as AI and data science.

Key Takeaways

  • Taxoria enhances existing taxonomies by using LLMs to propose new nodes.
  • The pipeline validates candidate nodes to reduce errors and ensure relevance.
  • The enriched taxonomy includes provenance tracking for better analysis.
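The enrich-validate-integrate loop described above can be sketched in a few lines. Everything here is an illustrative assumption: the function names, the canned stand-in for the LLM call, and the toy validation rule are not the paper's actual implementation, which the abstract does not specify.

```python
# Hypothetical sketch of a Taxoria-style enrichment loop.
# All names and rules are illustrative assumptions, not the paper's code.

def propose_candidates(parent: str) -> list[str]:
    """Stand-in for an LLM prompt that proposes child concepts for a node."""
    canned = {"machine learning": ["transfer learning", "federated learning"]}
    return canned.get(parent, [])

def validate(candidate: str, taxonomy: dict) -> bool:
    """Toy validation: reject empty or duplicate nodes. The real pipeline
    would also check semantic relevance to filter hallucinations."""
    existing = {n.lower() for n in taxonomy} | {
        c.lower() for kids in taxonomy.values() for c in kids
    }
    return bool(candidate.strip()) and candidate.lower() not in existing

def enrich(taxonomy: dict) -> list[dict]:
    """Integrate accepted candidates and return provenance records."""
    accepted = []
    for parent in list(taxonomy):
        for cand in propose_candidates(parent):
            if validate(cand, taxonomy):
                taxonomy[parent].append(cand)
                accepted.append({"node": cand, "parent": parent,
                                 "source": "llm_proposal"})
    return accepted

seed = {"machine learning": ["supervised learning"]}
records = enrich(seed)
```

Each accepted node carries a provenance record (`source: "llm_proposal"`), mirroring the provenance tracking the summary mentions.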

Computer Science > Information Retrieval

arXiv:2602.22213 (cs) [Submitted on 21 Nov 2025]

Title: Enriching Taxonomies Using Large Language Models
Authors: Zeinab Ghamlouch, Mehwish Alam

Abstract: Taxonomies play a vital role in structuring and categorizing information across domains. However, many existing taxonomies suffer from limited coverage and outdated or ambiguous nodes, reducing their effectiveness in knowledge retrieval. To address this, we present Taxoria, a novel taxonomy enrichment pipeline that leverages Large Language Models (LLMs) to enhance a given taxonomy. Unlike approaches that extract internal LLM taxonomies, Taxoria uses an existing taxonomy as a seed and prompts an LLM to propose candidate nodes for enrichment. These candidates are then validated to mitigate hallucinations and ensure semantic relevance before integration. The final output includes an enriched taxonomy with provenance tracking and visualization of the final merged taxonomy for analysis.

Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as: arXiv:2602.22213 [cs.IR] (or arXiv:2602.22213v1 [cs.IR] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.22213
Journal reference: FAIA 2025 5147-5150 (2025)
Related DOI: https:...

Related Articles

[2603.23966] Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage
[2603.16790] InCoder-32B: Code Foundation Model for Industrial Scenarios
[2603.16430] EngGPT2: Sovereign, Efficient and Open Intelligence
[2603.11066] Exploring Collatz Dynamics with Human-LLM Collaboration
