[2603.29232] Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs
Computer Science > Computation and Language
arXiv:2603.29232 (cs)
[Submitted on 31 Mar 2026]

Title: Long-Document QA with Chain-of-Structured-Thought and Fine-Tuned SLMs
Authors: Zhuowen Liang, Xiaotian Lin, Zhengxuan Zhang, Yuyu Luo, Haixun Wang, Nan Tang

Abstract: Large language models (LLMs) are widely applied to data analytics over documents, yet direct reasoning over long, noisy documents remains brittle and error-prone. We therefore study document question answering (QA) that consolidates dispersed evidence into a structured output (e.g., a table, a graph, or a set of chunks) to support reliable, verifiable QA. We propose a two-pillar framework, LiteCoST, to achieve both high accuracy and low latency with small language models (SLMs). Pillar 1: Chain-of-Structured-Thought (CoST). We introduce a CoST template, a schema-aware instruction that guides a strong LLM to produce both a step-wise CoST trace and the corresponding structured output. The process induces a minimal structure, normalizes entities and units, aligns records, serializes the output, and verifies and refines it, yielding auditable supervision. Pillar 2: SLM fine-tuning. Compact models are trained on LLM-generated CoST data in two stages: supervised fine-tuning (SFT) for structural alignment, followed by Group Relative Policy Optimization (GRPO) incorporating triple...
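The CoST pipeline the abstract describes (induce a minimal structure, normalize entities/units, align records, serialize, verify) can be sketched as a toy program. This is a minimal illustration under stated assumptions, not the paper's implementation: the record schema, the km-to-m normalization rule, and all function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Record:
    entity: str
    value: float
    unit: str

def induce_structure(evidence: list[dict]) -> list[Record]:
    # Step 1: map raw evidence snippets onto a minimal record schema.
    return [Record(e["entity"], float(e["value"]), e["unit"]) for e in evidence]

def normalize(records: list[Record]) -> list[Record]:
    # Step 2: canonicalize entity names and convert units (toy rule: km -> m).
    out = []
    for r in records:
        value, unit = (r.value * 1000, "m") if r.unit == "km" else (r.value, r.unit)
        out.append(Record(r.entity.strip().lower(), value, unit))
    return out

def align(records: list[Record]) -> dict[str, list[Record]]:
    # Step 3: group records that refer to the same (normalized) entity.
    table: dict[str, list[Record]] = {}
    for r in records:
        table.setdefault(r.entity, []).append(r)
    return table

def serialize(table: dict[str, list[Record]]) -> str:
    # Step 4: emit the structured output as a compact tab-separated table.
    rows = [f"{e}\t{r.value}\t{r.unit}" for e, rs in sorted(table.items()) for r in rs]
    return "\n".join(rows)

def verify(table: dict[str, list[Record]]) -> bool:
    # Step 5: a simple consistency check — one unit per aligned entity.
    return all(len({r.unit for r in rs}) == 1 for rs in table.values())

evidence = [
    {"entity": "Route A", "value": "2", "unit": "km"},
    {"entity": "route a", "value": "500", "unit": "m"},
]
table = align(normalize(induce_structure(evidence)))
print(serialize(table))
```

In the paper's setting, each step would additionally be verbalized as part of the CoST reasoning trace, so the structured output remains auditable against the evidence it was built from.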