
[2602.18451] Developing a Multi-Agent System to Generate Next Generation Science Assessments with Evidence-Centered Design

arXiv - AI, February 24, 2026

Summary

This paper describes a Multi-Agent System (MAS) that automatically generates science assessment items aligned with the Next Generation Science Standards (NGSS) by integrating the Evidence-Centered Design (ECD) framework.

Why It Matters

As education increasingly emphasizes performance-based assessments, this research addresses the challenges of creating high-quality, standards-aligned assessments. By integrating AI with ECD, the study explores a scalable solution that could enhance educational assessment practices while highlighting the importance of human expertise.

Key Takeaways

The study integrates Evidence-Centered Design with Multi-Agent Systems for automated assessment generation.
AI-generated assessment items show quality comparable to human-developed items in their alignment with the NGSS.
AI excels in inclusivity but struggles with clarity and multimodal design.
Both AI and human assessments have weaknesses in evidence collectability and student interest alignment.
Human expertise remains crucial despite advancements in automated assessment generation.

Computer Science > Computers and Society
arXiv:2602.18451 (cs)
[Submitted on 3 Feb 2026]

Title: Developing a Multi-Agent System to Generate Next Generation Science Assessments with Evidence-Centered Design
Authors: Yaxuan Yang, Jongchan Park, Yifan Zhou, Xiaoming Zhai

Abstract: Contemporary science education reforms such as the Next Generation Science Standards (NGSS) demand assessments to understand students' ability to use science knowledge to solve problems and design solutions. To elicit such higher-order ability, educators need performance-based assessments, which are challenging to develop. One solution that has been broadly adopted is Evidence-Centered Design (ECD), which emphasizes interconnected models of the learner, evidence, and tasks. Although ECD provides a framework to safeguard assessment validity, its implementation requires diverse expertise (e.g., content and assessment), which is both costly and labor-intensive. To address this challenge, this study proposed integrating the ECD framework into Multi-Agent Systems (MAS) to generate NGSS-aligned assessment items automatically. This integrated MAS system ensembles multiple large language models with varying expertise, enabling the automation of complex, multi-stage item generation workflows traditionally per...
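To make the idea of a multi-stage, multi-agent workflow built around ECD's learner, evidence, and task models more concrete, here is a minimal sketch. It is not the authors' system: the `ItemDraft` record, the agent function names, the stage order, and the placeholder standard label are all assumptions made for illustration, and each "agent" is a plain stub function where a real system would presumably call a large language model.

```python
from dataclasses import dataclass, field, replace
from typing import Callable, List


@dataclass
class ItemDraft:
    """Hypothetical record holding the three ECD models for one item."""
    standard: str                 # placeholder label for the target standard
    learner_claim: str = ""       # learner model: what the student should be able to do
    evidence: str = ""            # evidence model: observable work supporting the claim
    task: str = ""                # task model: prompt meant to elicit that evidence
    review_notes: List[str] = field(default_factory=list)


# An agent takes a draft and returns an updated copy (assumed interface).
Agent = Callable[[ItemDraft], ItemDraft]


def learner_agent(draft: ItemDraft) -> ItemDraft:
    # Stub standing in for an LLM with content expertise.
    claim = f"Student can apply ideas from {draft.standard} to solve a problem."
    return replace(draft, learner_claim=claim)


def evidence_agent(draft: ItemDraft) -> ItemDraft:
    # The evidence model depends on the learner model, so fail early if it is missing.
    if not draft.learner_claim:
        raise ValueError("evidence stage needs a learner claim")
    text = "A written explanation supporting the claim: " + draft.learner_claim
    return replace(draft, evidence=text)


def task_agent(draft: ItemDraft) -> ItemDraft:
    # The task model depends on the evidence model.
    if not draft.evidence:
        raise ValueError("task stage needs an evidence statement")
    return replace(draft, task="Given a scenario, produce: " + draft.evidence)


def reviewer_agent(draft: ItemDraft) -> ItemDraft:
    # Flags any ECD model that is still empty; a human would review these notes.
    names = ("learner_claim", "evidence", "task")
    notes = [f"missing {name}" for name in names if not getattr(draft, name)]
    return replace(draft, review_notes=draft.review_notes + notes)


def run_pipeline(standard: str, agents: List[Agent]) -> ItemDraft:
    # Pass one draft through each agent in order.
    draft = ItemDraft(standard=standard)
    for agent in agents:
        draft = agent(draft)
    return draft


ECD_STAGES: List[Agent] = [learner_agent, evidence_agent, task_agent, reviewer_agent]

if __name__ == "__main__":
    item = run_pipeline("EXAMPLE-STANDARD-1", ECD_STAGES)
    print(item.task)
    print(item.review_notes)
```

Chaining the stages this way keeps each ECD model traceable to the one before it, and the final reviewer stage leaves room for the human oversight that the takeaways above call crucial.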

