[2503.02976] Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment
Computer Science > Artificial Intelligence
arXiv:2503.02976 (cs)
[Submitted on 4 Mar 2025 (v1), last revised 31 Mar 2026 (this version, v3)]

Title: Teaching AI to Handle Exceptions: Supervised Fine-Tuning with Human-Aligned Judgment
Authors: Matthew DosSantos DiSorbo, Harang Ju, Sinan Aral

Abstract: Large language models (LLMs), initially developed for generative AI, are now evolving into agentic AI systems, which make decisions in complex, real-world contexts. Unfortunately, while their generative capabilities are well-documented, their decision-making processes remain poorly understood. This is particularly evident when testing targeted decision-making: for instance, how models handle exceptions, a critical and challenging aspect of decision-making made relevant by the inherent incompleteness of contracts. Here we demonstrate that LLMs, even ones that excel at reasoning, deviate significantly from human judgments because they adhere strictly to policies, even when such adherence is impractical, suboptimal, or even counterproductive. We then evaluate three approaches to tuning AI agents to handle exceptions: ethical framework prompting, chain-of-thought reasoning, and supervised fine-tuning. We find that while ethical framework prompting fails and chain-of-thought prompting provides ...