[2603.28295] Evaluating LLMs for Answering Student Questions in Introductory Programming Courses
Computer Science > Artificial Intelligence
arXiv:2603.28295 (cs)
[Submitted on 30 Mar 2026]

Title: Evaluating LLMs for Answering Student Questions in Introductory Programming Courses
Authors: Thomas Van Mullem, Bart Mesuere, Peter Dawyndt

Abstract: The rapid emergence of Large Language Models (LLMs) presents both opportunities and challenges for programming education. While students increasingly use generative AI tools, direct access often hinders learning by providing complete solutions rather than pedagogical hints. Concurrently, educators face significant workload and scalability challenges in providing timely, personalized feedback. This study investigates the capability of LLMs to safely and effectively assist educators in answering student questions in a CS1 programming course. To this end, we established a rigorous, reproducible evaluation process by curating a benchmark dataset of 170 authentic student questions from a learning management system, paired with ground-truth responses authored by subject matter experts. Because traditional text-matching metrics are insufficient for evaluating open-ended educational responses, we developed and validated a custom LLM-as-a-Judge metric optimized for assessing pedagogical accuracy. Our findings demonstrate that models, ...
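The abstract does not detail how the LLM-as-a-Judge metric is implemented; as a rough illustration only, such a metric typically prompts a judge model with a grading rubric, the expert's ground-truth answer, and the candidate answer, then parses a numeric score from the reply. The sketch below shows that pattern; the rubric wording, score scale, and function names (`build_judge_prompt`, `parse_score`) are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of an LLM-as-a-Judge scoring step.
# The rubric, scale, and names are hypothetical, not from the paper.
import re

RUBRIC = (
    "You are grading an answer to a CS1 student's question.\n"
    "Reference answer (from a subject matter expert):\n{reference}\n\n"
    "Candidate answer (from an LLM):\n{candidate}\n\n"
    "Rate pedagogical accuracy from 1 (wrong or misleading) to 5\n"
    "(accurate and hint-oriented, without revealing a full solution).\n"
    "Reply in the form: SCORE: <n>"
)

def build_judge_prompt(reference: str, candidate: str) -> str:
    """Fill the grading rubric with the expert reference and model answer."""
    return RUBRIC.format(reference=reference, candidate=candidate)

def parse_score(judge_reply: str):
    """Extract the 1-5 score from the judge model's reply, or None."""
    match = re.search(r"SCORE:\s*([1-5])", judge_reply)
    return int(match.group(1)) if match else None

# In practice, the judge reply would come from an LLM API call on the
# prompt returned by build_judge_prompt(); here we parse a sample reply.
print(parse_score("SCORE: 4"))  # -> 4
```

A validated metric of this kind would additionally be checked against human grades, as the paper's validation step implies, before being trusted on the full benchmark.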