Development of a Next Generation Concept Inventory with AI-based Evaluation for College Environmental Programs
Abstract
Interdisciplinary environmental programs (EPs) are increasingly popular in U.S. universities, but the field lacks consensus on core concepts and on assessments aligned with those concepts. To address this gap, we are developing assessments that evaluate undergraduates’ foundational knowledge in EPs and their ability to apply complex, systems-level concepts within the context of the Food-Energy-Water (FEW) Nexus. Specifically, we have applied a framework for developing and evaluating constructed response (CR) questions in science to create a Next Generation Concept Inventory in EPs, along with machine learning (ML) text-scoring models.
Building on previous research, we identified four key activities for assessment prompts: explaining connections among FEW systems, identifying sources of FEW, cause and effect of FEW usage, and tradeoffs. We developed three sets of CR items aligned to these four activities, each using a different phenomenon as its context. To pilot the initial items, we collected responses from over 700 EP undergraduates across seven institutions and used them to begin developing coding rubrics. We created a series of analytic coding rubrics to identify students’ scientific and informal ideas in CRs and to characterize how students connect scientific ideas related to FEW. Using these rubrics, human raters have demonstrated moderate to good levels of agreement on CRs (0.72–0.85). We have used a small set of coded responses to begin developing supervised ML text classification models. Overall, these models have acceptable accuracy (M = .89, SD = .08) but exhibit a wide range of values on other model metrics, underscoring the challenges of using ML-based evaluation for complex, interdisciplinary assessments.
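
To make the evaluation step concrete, the sketch below shows one common way a supervised text-scoring model of this kind can be built and checked. It is a minimal illustration only: the TF-IDF features, the logistic regression classifier, the example responses, and the rubric codes are all assumptions for demonstration, not the project's actual implementation or data.

```python
# A minimal sketch, not the project's actual pipeline: it assumes a TF-IDF +
# logistic regression classifier and a handful of hypothetical, rubric-coded
# constructed responses (1 = target scientific idea present, 0 = absent).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, cohen_kappa_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

responses = [
    "Irrigating crops uses groundwater, and pumping that water takes energy.",
    "Power plants need river water for cooling, so drought cuts electricity output.",
    "Growing corn for ethanol shifts land from food production to energy production.",
    "Desalination provides fresh water but consumes a lot of electricity.",
    "Food just comes from the grocery store.",
    "Energy is something you get when you sleep enough.",
    "Water is important because people drink it.",
    "I am not sure how these things are related.",
]
codes = [1, 1, 1, 1, 0, 0, 0, 0]

X_train, X_test, y_train, y_test = train_test_split(
    responses, codes, test_size=0.5, random_state=0, stratify=codes
)

# One binary classifier per rubric bin; bag-of-words features feed a linear model.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
pred = model.predict(X_test)

# Accuracy alone can look strong on imbalanced rubric bins, so report
# agreement- and class-sensitive metrics alongside it.
print("accuracy:", accuracy_score(y_test, pred))
print("kappa:   ", cohen_kappa_score(y_test, pred))
print("F1:      ", f1_score(y_test, pred, zero_division=0))
```

Reporting agreement- and class-sensitive metrics such as Cohen's kappa and F1 alongside accuracy reflects the point above that acceptable accuracy can coexist with a wide range of values on other model metrics.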