Exploring Attributes of Successful Machine Learning Assessments for Scoring of Undergraduate Constructed Response Assessment Items
Abstract
Content-based computer scoring models (CSMs) have successfully automated the scoring of constructed response assessments, thereby increasing their use in multiple educational settings. However, CSM development remains time-intensive, as little is known about the model, item, and training set features that expedite it. Herein, we examined a large set of holistic CSMs for text classification to determine the relationship between scoring accuracy and different assessment item, CSM, and training set features. We found that the number of rubric and CSM bins, item question structure, and item context significantly influenced CSM accuracy. Applying novel text diversity metrics, we found that most did not correlate with CSM accuracy. However, fewer shared words across responses correlated with increased CSM accuracy, both overall and within individual bins. Finally, we applied ordination techniques to visualize constructed response corpora based on shared language among responses and found that these techniques aided decision-making during CSM development.