[2602.10149] Exploring Semantic Labeling Strategies for Third-Party Cybersecurity Risk Assessment Questionnaires
Computer Science > Cryptography and Security
arXiv:2602.10149 (cs)
[Submitted on 9 Feb 2026 (v1), last revised 4 Mar 2026 (this version, v2)]
Authors: Ali Nour Eldin, Mohamed Sellami, Walid Gaaloul, Julien Steunou

Abstract: Third-Party Risk Assessment (TPRA) is a core cybersecurity practice for evaluating suppliers against standards such as ISO/IEC 27001 and NIST. TPRA questionnaires are typically drawn from large repositories of security and compliance questions, yet tailoring assessments to organizational needs remains a largely manual process. Existing retrieval approaches rely on keyword or surface-level similarity, which often fails to capture implicit assessment scope and control semantics. This paper explores strategies for organizing and retrieving TPRA cybersecurity questions using semantic labels that describe both control domains and assessment scope. We compare direct question-level labeling with a Large Language Model (LLM) against a hybrid semi-supervised semantic labeling (SSSL) pipeline that clusters questions in embedding space, labels a small representative subset using an LLM, and propagates labels to remaining questions...
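The three SSSL stages named in the abstract (cluster in embedding space, label a few representatives, propagate to the rest) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration and not the authors' implementation: the bag-of-words `embed` function stands in for a real sentence encoder, the small k-means routine stands in for whatever clustering the paper uses, `stub_llm_label` is a hypothetical keyword rule in place of an actual LLM call, and propagation is simply "every cluster member inherits its representative's label". The sample questions and label names are invented.

```python
import math
from collections import Counter


def embed(text, vocab):
    """Toy unit-normalized bag-of-words vector (assumed stand-in for a sentence encoder)."""
    counts = Counter(text.lower().split())
    vec = [float(counts[w]) for w in vocab]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    """Cosine similarity; inputs are already unit-normalized, so a dot product suffices."""
    return sum(x * y for x, y in zip(a, b))


def cluster(vectors, seeds, iterations=10):
    """Small spherical k-means. `seeds` are indices of the initial centroids."""
    centroids = [vectors[i][:] for i in seeds]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its most similar centroid.
        assignment = [
            max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        # Recompute each centroid as the normalized mean of its members.
        for c in range(len(centroids)):
            members = [vectors[i] for i, a in enumerate(assignment) if a == c]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean)) or 1.0
                centroids[c] = [x / norm for x in mean]
    return assignment, centroids


def stub_llm_label(question):
    """Hypothetical placeholder for an LLM labeling call (keyword rules only)."""
    q = question.lower()
    if "password" in q or "access" in q:
        return "access-control"
    if "backup" in q or "recovery" in q:
        return "business-continuity"
    return "other"


def sssl(questions, seeds):
    """Cluster, label one representative per cluster, propagate to the other members."""
    vocab = sorted({w for q in questions for w in q.lower().split()})
    vectors = [embed(q, vocab) for q in questions]
    assignment, centroids = cluster(vectors, seeds)
    labels, llm_calls = {}, 0
    for c in range(len(centroids)):
        members = [i for i, a in enumerate(assignment) if a == c]
        if not members:
            continue
        # Representative: the member closest to the cluster centroid.
        rep = max(members, key=lambda i: cosine(vectors[i], centroids[c]))
        label = stub_llm_label(questions[rep])
        llm_calls += 1
        for i in members:  # propagation step
            labels[i] = label
    return labels, llm_calls


questions = [
    "Do you enforce password rotation for user access",
    "Is multi factor access required for password resets",
    "Are backup copies tested for recovery",
    "How often is backup recovery tested",
]
labels, llm_calls = sssl(questions, seeds=[0, 2])
for i, q in enumerate(questions):
    print(labels[i], "|", q)
print("labeling calls:", llm_calls, "of", len(questions), "questions")
```

The point of the design is the call count: the labeling function runs once per cluster and not once per question, which is the cost trade-off against direct question-level labeling that the abstract sets out to compare.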