CAIRE’s Academic Highlights
Bridging the gap between artificial intelligence and real-world classrooms. Our research provides evidence-based insights and solutions that empower educators to enhance instructional quality.
SciEval: A Benchmark for Automatic Evaluation of K–12 Science Instructional Materials
Status: Accepted as a full paper at AIED 2026 (March 2026)
Authors: Zhaohui Li, Peng He, Honglu Liu, Zeyuan Wang, Zhiyuan Chen, Tingting Li, Jinjun Xiong
As educators increasingly use generative AI to create science materials, manual evaluation remains time-consuming and difficult to scale. We introduce SciEval, a specialized benchmark dataset of 273 lesson-level materials evaluated across 13 criteria. Our research demonstrates that domain-aligned fine-tuning of LLMs such as Qwen3 can improve performance on automated pedagogical evaluation by up to 11%.
DrawSim-PD: Simulating Student Science Drawings to Support NGSS-Aligned Teacher Diagnostic Reasoning
Authors: Arijit Chakma, Peng He, Honglu Liu, Zeyuan Wang, Tingting Li, Tiffany D. Do, Feng Liu
Paper: Read on arXiv
Privacy regulations often prohibit sharing authentic student work at scale for teacher professional development. To address this, we present DrawSim-PD, the first generative framework that simulates NGSS-aligned, student-like science drawings exhibiting controllable pedagogical imperfections. Driven by structured “capability profiles,” the system ensures cross-modal coherence across a student drawing, a reasoning narrative, and a teacher-facing diagnostic concept map. K–12 educators evaluated the framework positively, and we release a corpus of 10,000 structured artifacts to help overcome data scarcity in visual assessment research.