[2602.18843] ABD: Default Exception Abduction in Finite First Order Worlds
Summary
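The paper introduces ABD, a benchmark for default-exception abduction in finite first-order worlds. Given a background theory with an abnormality predicate and a set of relational structures, an LLM must output a first-order formula that defines the exceptions, restoring satisfiability while keeping exceptions sparse.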
Why It Matters
ABD tests whether a model can repair a default rule by defining its exceptions in finite first-order worlds, with answers checked exactly by SMT verification. By formalizing three observation regimes and evaluating ten frontier LLMs, it exposes parsimony gaps and regime-specific generalization failures, giving concrete evidence of where current models' logical reasoning falls short.
Key Takeaways
- ABD is a benchmark for default-exception abduction over finite first-order worlds.
- The study formalizes three observation regimes: closed-world, existential completion, and universal completion.
- Across ten frontier LLMs evaluated on 600 instances, the best models achieve high validity, but parsimony gaps remain.
- Holdout evaluation reveals distinct generalization failure modes across regimes.
- The remaining parsimony gaps suggest that models can restore satisfiability more easily than they can keep exceptions sparse.
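The task can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the default rule (birds that are not abnormal fly), the predicate names, the two toy worlds, and the use of a simple count of abnormal objects as the sparsity measure. It also replaces the paper's exact SMT verification with brute-force enumeration over small finite domains, and it covers only a closed-world reading in which any fact not listed is false.

```python
from typing import Callable, Dict, List, Set, Tuple

# A toy finite world: each predicate name maps to the set of objects satisfying it.
# Closed-world reading: an object absent from a set does not satisfy that predicate.
World = Dict[str, Set[str]]
AbDef = Callable[[World, str], bool]  # candidate definition of Ab(x)


def default_holds(world: World, is_abnormal: AbDef) -> bool:
    """Hypothetical default: forall x. Bird(x) and not Ab(x) -> Flies(x)."""
    return all(
        x in world["Flies"] or is_abnormal(world, x)
        for x in world["Bird"]
    )


def exception_count(world: World, is_abnormal: AbDef) -> int:
    """Number of objects the candidate marks as abnormal (assumed sparsity cost)."""
    return sum(1 for x in world["Domain"] if is_abnormal(world, x))


def evaluate(worlds: List[World], is_abnormal: AbDef) -> Tuple[bool, int]:
    """Return (valid in every world, total number of exceptions)."""
    valid = all(default_holds(w, is_abnormal) for w in worlds)
    cost = sum(exception_count(w, is_abnormal) for w in worlds)
    return valid, cost


WORLDS: List[World] = [
    {"Domain": {"a", "b", "c"}, "Bird": {"a", "b"}, "Penguin": {"b"}, "Flies": {"a"}},
    {"Domain": {"d", "e"}, "Bird": {"d", "e"}, "Penguin": {"e"}, "Flies": {"d"}},
]


# Three candidate definitions of the abnormality predicate.
def ab_penguin(world: World, x: str) -> bool:
    return x in world["Penguin"]  # Ab(x) := Penguin(x)


def ab_bird(world: World, x: str) -> bool:
    return x in world["Bird"]  # Ab(x) := Bird(x), valid but wasteful


def ab_none(world: World, x: str) -> bool:
    return False  # Ab(x) := false, sparse but invalid


if __name__ == "__main__":
    for name, cand in [("penguin", ab_penguin), ("bird", ab_bird), ("none", ab_none)]:
        print(name, evaluate(WORLDS, cand))
```

In this toy setting, `ab_penguin` and `ab_bird` both satisfy the default in every world, but `ab_penguin` marks fewer objects as abnormal. That difference is a small-scale picture of the gap between validity and parsimony.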
arXiv:2602.18843 (cs)
[Submitted on 21 Feb 2026]
Title: ABD: Default Exception Abduction in Finite First Order Worlds
Authors: Serafim Batzoglou
Abstract: We introduce ABD, a benchmark for default-exception abduction over finite first-order worlds. Given a background theory with an abnormality predicate and a set of relational structures, a model must output a first-order formula that defines exceptions, restoring satisfiability while keeping exceptions sparse. We formalize three observation regimes (closed-world, existential completion, universal completion) with exact SMT verification. Evaluating ten frontier LLMs on 600 instances, the best models achieve high validity but parsimony gaps remain, and holdout evaluation reveals distinct generalization failure modes across regimes.
Subjects: Artificial Intelligence (cs.AI); Symbolic Computation (cs.SC)
Cite as: arXiv:2602.18843 [cs.AI] (or arXiv:2602.18843v1 [cs.AI] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.18843 (arXiv-issued DOI via DataCite, pending registration)
Submission history: From: Serafim Batzoglou. [v1] Sat, 21 Feb 2026 14:14:35 UTC (57 KB)