[2602.18351] Validating Political Position Predictions of Arguments
Summary
This article presents a dual-scale validation framework for assessing political position predictions in argumentative discourse, applying 22 language models to 23,228 arguments drawn from 30 UK *Question Time* debates.
Why It Matters
The research addresses the challenge of validating predictions of subjective, continuous attributes such as political position, which conflict with pairwise validation, the widely accepted gold standard for human evaluation. By developing a validation methodology that combines pointwise and pairwise annotation, it improves the reliability of political stance predictions, informing both discourse analysis and AI applications in political contexts.
Key Takeaways
- Introduces a dual-scale validation framework for political stance prediction.
- Demonstrates moderate human-model agreement under pointwise evaluation (Krippendorff's α = 0.578), reflecting the intrinsic subjectivity of the task.
- Shows substantially stronger alignment between human- and model-derived rankings under pairwise validation (α = 0.86 for the best model).
- Provides a structured argumentation knowledge base for enhanced reasoning in political discourse.
- Advances knowledge representation techniques for subjective real-world data.
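To make the pointwise/pairwise distinction concrete, here is a minimal, hypothetical sketch (not the paper's actual method): pointwise annotation assigns each argument a stance score directly, while pairwise annotation only asks which of two arguments sits further to one side, and those judgments are then aggregated into a ranking. The argument names, scores, and Borda-style win-count aggregation below are all illustrative assumptions.

```python
from itertools import combinations

# Hypothetical pointwise stance scores for five arguments (-1 = left, +1 = right).
pointwise = {"a1": -0.8, "a2": -0.2, "a3": 0.1, "a4": 0.5, "a5": 0.9}

# Hypothetical pairwise judgments: for each pair, which argument is further
# right. Here we derive them from the scores as a stand-in for human labels.
pairwise_winner = {
    pair: max(pair, key=pointwise.get)
    for pair in combinations(pointwise, 2)
}

# Aggregate the pairwise judgments into a ranking via simple win counts
# (a Borda-style aggregation, used here purely for illustration).
wins = {arg: 0 for arg in pointwise}
for winner in pairwise_winner.values():
    wins[winner] += 1

pairwise_ranking = sorted(wins, key=wins.get)        # least to most right-wing
pointwise_ranking = sorted(pointwise, key=pointwise.get)

print(pairwise_ranking == pointwise_ranking)  # the two orderings agree here
```

With consistent judgments the two scales induce the same ordering; the paper's finding is that real human and model annotations agree more strongly at the ranking level than at the raw-score level.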
Computer Science > Computation and Language
arXiv:2602.18351 (cs)
[Submitted on 20 Feb 2026]
Title: Validating Political Position Predictions of Arguments
Authors: Jordan Robinson, Angus R. Williams, Katie Atkinson, Anthony G. Cohn
Abstract: Real-world knowledge representation often requires capturing subjective, continuous attributes -- such as political positions -- that conflict with pairwise validation, the widely accepted gold standard for human evaluation. We address this challenge through a dual-scale validation framework applied to political stance prediction in argumentative discourse, combining pointwise and pairwise human annotation. Using 22 language models, we construct a large-scale knowledge base of political position predictions for 23,228 arguments drawn from 30 debates that appeared on the UK political television programme *Question Time*. Pointwise evaluation shows moderate human-model agreement (Krippendorff's α = 0.578), reflecting intrinsic subjectivity, while pairwise validation reveals substantially stronger alignment between human- and model-derived rankings (α = 0.86 for the best model). This work contributes: (i) a practical validation methodology for subjective continuous knowledge that balances scalability with reliability; (ii) a validated structured argumentation knowledge base enabling graph-ba...
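The agreement figures quoted above are Krippendorff's α values. As an illustration of the coefficient, here is a minimal sketch of its nominal-data form; the paper presumably uses a variant suited to continuous or ordinal position scales, so this is a simplified stand-in, not the authors' implementation.

```python
from collections import Counter

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal codes.

    units: list of lists; each inner list holds the codes that the
    annotators assigned to one item (missing values simply omitted).
    """
    # Build the coincidence matrix: each ordered pair of codes within a
    # unit contributes 1/(m-1), where m is the number of codes in the unit.
    coincidences = Counter()
    for codes in units:
        m = len(codes)
        if m < 2:
            continue  # units with a single code are not pairable
        for i, c in enumerate(codes):
            for j, k in enumerate(codes):
                if i != j:
                    coincidences[(c, k)] += 1 / (m - 1)

    n = sum(coincidences.values())
    totals = Counter()
    for (c, _k), v in coincidences.items():
        totals[c] += v

    # Observed disagreement: mass off the diagonal of the coincidence matrix.
    observed = sum(v for (c, k), v in coincidences.items() if c != k) / n
    # Expected disagreement under chance pairing of all observed codes.
    expected = sum(totals[c] * totals[k]
                   for c in totals for k in totals if c != k) / (n * (n - 1))
    return 1.0 - observed / expected

# Perfect agreement on three items yields alpha = 1.0.
print(krippendorff_alpha_nominal([["L", "L"], ["R", "R"], ["C", "C"]]))
```

Values near 1 indicate reliable annotation, 0 corresponds to chance-level agreement, and systematic disagreement can push α below zero, which is why α = 0.578 reads as moderate and α = 0.86 as strong.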