[2602.10149] Exploring Semantic Labeling Strategies for Third-Party Cybersecurity Risk Assessment Questionnaires

Computer Science > Cryptography and Security
arXiv:2602.10149 (cs)
[Submitted on 9 Feb 2026 (v1), last revised 4 Mar 2026 (this version, v2)]

Title: Exploring Semantic Labeling Strategies for Third-Party Cybersecurity Risk Assessment Questionnaires

Authors: Ali Nour Eldin, Mohamed Sellami, Walid Gaaloul, Julien Steunou

Abstract: Third-Party Risk Assessment (TPRA) is a core cybersecurity practice for evaluating suppliers against standards such as ISO/IEC 27001 and NIST. TPRA questionnaires are typically drawn from large repositories of security and compliance questions, yet tailoring assessments to organizational needs remains a largely manual process. Existing retrieval approaches rely on keyword or surface-level similarity, which often fails to capture implicit assessment scope and control semantics. This paper explores strategies for organizing and retrieving TPRA cybersecurity questions using semantic labels that describe both control domains and assessment scope. We compare direct question-level labeling with a Large Language Model (LLM) against a hybrid semi-supervised semantic labeling (SSSL) pipeline that clusters questions in embedding space, labels a small representative subset using an LLM, and propagates labels to remaining questions...
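The three SSSL steps named in the abstract (cluster, label a representative subset, propagate) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which clustering algorithm, representative-selection rule, or propagation rule the paper uses, so plain k-means, nearest-to-centroid selection, and cluster-wide propagation are assumptions made here for illustration. The 2-D vectors stand in for real sentence embeddings, the sample questions and label strings are hypothetical, and `stub_llm_label` is a hypothetical keyword rule standing in for an actual LLM call.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors: List[Vector]) -> List[float]:
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def kmeans(points: List[Vector], k: int, iterations: int = 20) -> List[int]:
    """Plain k-means; centroids start at the first k points for determinism."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = mean(members)
    return assignment


def label_by_propagation(
    questions: List[str],
    embeddings: List[Vector],
    k: int,
    label_fn: Callable[[str], str],
) -> List[str]:
    """Cluster, label one representative per cluster, copy its label to the rest."""
    assignment = kmeans(embeddings, k)
    labels = [""] * len(questions)
    for c in range(k):
        member_ids = [i for i, a in enumerate(assignment) if a == c]
        if not member_ids:
            continue
        centroid = mean([embeddings[i] for i in member_ids])
        # Assumption: the representative is the member closest to the centroid.
        representative = min(
            member_ids, key=lambda i: squared_distance(embeddings[i], centroid)
        )
        cluster_label = label_fn(questions[representative])
        for i in member_ids:
            labels[i] = cluster_label
    return labels


llm_calls: List[str] = []  # records which questions reached the (stub) LLM


def stub_llm_label(question: str) -> str:
    """Hypothetical stand-in for an LLM; a keyword rule, not a real model."""
    llm_calls.append(question)
    text = question.lower()
    if "encrypt" in text or "tls" in text:
        return "Cryptography / data protection"
    return "Access control / identity"


# Hypothetical questions with toy 2-D embeddings (real embeddings would come
# from a sentence-embedding model).
questions = [
    "Is customer data encrypted at rest?",
    "Is multi-factor authentication required for administrators?",
    "Is TLS enforced for data in transit?",
    "How often are user access rights reviewed?",
    "Which encryption algorithms protect backups?",
    "Are shared accounts prohibited?",
]
embeddings = [
    (0.9, 0.1), (0.1, 0.9), (0.8, 0.2), (0.2, 0.8), (1.0, 0.0), (0.0, 1.0),
]

labels = label_by_propagation(questions, embeddings, k=2, label_fn=stub_llm_label)
for question, label in zip(questions, labels):
    print(f"{label:32} {question}")
```

In this toy run the stub labeler is called once per cluster (two calls for six questions), which is the cost argument behind labeling only a representative subset. The trade-off is that any mislabeled representative, or any question placed in the wrong cluster, passes its error on to every question that inherits the label.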

Originally published on March 05, 2026. Curated by AI News.
