This initiative
We address this measurement challenge with a text-as-data approach that classifies college courses into the College Course Map, a standardized hierarchical taxonomy developed by the National Center for Education Statistics. Drawing on nationally representative transcript studies, we fine-tune and release open-source RoBERTa models that place courses into this taxonomy. Our training data, methods, and performance are described here. The models will be the engine behind a new software tool that practitioners and researchers can use to classify their own course data into this standardized scheme.
The benefits
Our research team is using the tool for several applications, including identifying "curricular deserts" and bottleneck courses and describing long-term trends in course-taking.
Use the tool
We are currently looking for researchers and practitioners who are interested in piloting an early-release version of the CCM software tool. For more information, see our Getting Involved page.
Acknowledgements
We are grateful for support from award R305D240029 from the Institute of Education Sciences, U.S. Department of Education. Annaliese is grateful for support from PR/Award R305B200011 from the Institute of Education Sciences, U.S. Department of Education, and from a 2024 National Academy of Education/Spencer Foundation Dissertation Fellowship. We are grateful for a data partnership with the Education Research Center at UT-Dallas that makes this research possible, and for the helpful guidance of the staff at UT-Dallas. The conclusions of this research do not necessarily reflect the opinions or official position of the Texas Education Agency, the Texas Higher Education Coordinating Board, the Texas Workforce Commission, the State of Texas, the Institute of Education Sciences, or the U.S. Department of Education.
To come. For more information, email Kevin Stange.