College Course Map: Training Data, Methods, and Performance

Training Data, Methods, and Performance

We draw on four waves of the Postsecondary Education Transcript Studies (PETS): the High School Longitudinal Study of 2009, the Baccalaureate and Beyond Longitudinal Study of 2008-2012, and the Beginning Postsecondary Students Longitudinal Studies of 2004-2009 and 2012-2017. In PETS, student administrative transcripts are collected and standardized through human annotation into NCES's 2010 College Course Map (CCM) codes. Each course is assigned a hierarchical six-digit CCM code. Two-digit codes represent broad disciplines (e.g., "45" represents social science), four-digit codes represent specific disciplines ("45.06" represents economics), and six-digit codes represent specific courses ("45.0603" represents econometrics and quantitative economics). Our PETS dataset contains 891,320 courses representing 2,617,550 course enrollments from 2002 to 2018 and covers 48 two-digit, 373 four-digit, and 1,622 six-digit codes.

We randomly split our data into a ninety-percent training set and a ten-percent test set, stratifying on six-digit CCM code. We fine-tune three RoBERTa models (one each for two-, four-, and six-digit codes) for three epochs with cross-entropy loss to classify courses into CCM codes from subject codes, catalog numbers, and course titles.
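
As an illustration of the split described above, the following is a minimal sketch of a ninety/ten split stratified on six-digit code. It uses only the Python standard library; the function name `stratified_split`, the seed, and the toy course records are assumptions made for the example, and the second CCM code is a hypothetical placeholder.

```python
import random
from collections import defaultdict

def stratified_split(records, key, test_frac=0.10, seed=0):
    """Split records into train/test, sampling test_frac within each stratum."""
    rng = random.Random(seed)
    by_code = defaultdict(list)
    for rec in records:
        by_code[key(rec)].append(rec)

    train, test = [], []
    for code in sorted(by_code):
        group = by_code[code]
        rng.shuffle(group)
        # Round to the nearest whole course; very small strata may
        # contribute no test examples under this rule.
        n_test = round(len(group) * test_frac)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test

# Hypothetical toy records: (subject code, catalog number, title, six-digit code)
courses = [("ECON", str(300 + i), f"Econometrics {i}", "45.0603") for i in range(20)]
courses += [("MATH", str(100 + i), f"Calculus {i}", "27.0101") for i in range(10)]

train, test = stratified_split(courses, key=lambda rec: rec[3])
```

With 1,622 six-digit codes, some strata will be very small, so how rare codes are handled (here, rounding) is a design choice the sketch makes on its own.
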
We evaluate our models on the test set using four metrics: accuracy, macro precision, macro recall, and macro F1. In addition, we derive higher-level codes from lower-level predictions and evaluate those as well. For instance, if our six-digit model predicts "45.0603" (econometrics and quantitative economics), we derive the two-digit code "45" (social science) and test whether it matches the correct two-digit code.
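
The derivation of higher-level codes can be sketched as simple string truncation, followed by the usual metrics at the derived level. This is a minimal sketch under stated assumptions: the helper names are illustrative, the macro average is taken over the union of true and predicted classes, and the toy labels other than "45.0603" are hypothetical.

```python
from collections import Counter

def truncate_code(six_digit, level):
    """Derive a two- or four-digit code from a six-digit code such as '45.0603'."""
    if level == 2:
        return six_digit[:2]
    if level == 4:
        return six_digit[:5]  # two digits, the dot, two more digits
    if level == 6:
        return six_digit
    raise ValueError("level must be 2, 4, or 6")

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    classes = set(y_true) | set(y_pred)
    scores = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) for c in classes]
    return sum(scores) / len(scores)

# Hypothetical six-digit labels and six-digit model predictions
true6 = ["45.0603", "45.0601", "27.0101"]
pred6 = ["45.0603", "45.0603", "45.0601"]

# Score the six-digit model at the derived two-digit level
true2 = [truncate_code(c, 2) for c in true6]
pred2 = [truncate_code(c, 2) for c in pred6]
acc6 = accuracy(true6, pred6)
acc2 = accuracy(true2, pred2)
f1_2 = macro_f1(true2, pred2)
```

In this toy case the six-digit model is wrong on two of three courses at the six-digit level but on only one at the derived two-digit level, since a near miss within the same discipline still yields the right broad code.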

In Table 1, we present our model evaluation metrics on the unseen test data. Panel A reports model performance by course, and Panel B reports performance weighted by enrollments. Performance on two-digit codes is strong: the model correctly labels ninety percent of enrollments. Performance on six-digit codes is also relatively strong, given that the six-digit CCM taxonomy poses the challenging task of classifying each course into one of 1,622 categories.
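
The difference between the two panels can be illustrated with a short sketch: by-course accuracy counts each course once, while enrollment-weighted accuracy counts each course in proportion to its enrollments. The function name and the toy labels and enrollment counts below are hypothetical and are not taken from Table 1.

```python
def weighted_accuracy(y_true, y_pred, weights=None):
    """Accuracy where each example counts in proportion to its weight."""
    if weights is None:
        weights = [1] * len(y_true)  # unweighted: every course counts once
    correct = sum(w for t, p, w in zip(y_true, y_pred, weights) if t == p)
    return correct / sum(weights)

# Hypothetical two-digit labels, predictions, and enrollment counts
y_true = ["45", "45", "27"]
y_pred = ["45", "27", "27"]
enrollments = [900, 50, 50]

by_course = weighted_accuracy(y_true, y_pred)                  # Panel A style
by_enrollment = weighted_accuracy(y_true, y_pred, enrollments)  # Panel B style
```

Here the one misclassified course has few enrollments, so the enrollment-weighted figure is higher than the by-course figure; the reverse would hold if errors were concentrated in large courses.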

Strong model performance is insufficient to establish construct validity. To test criterion validity, we draw on pairs of transfer-equivalent courses. Assuming that courses that transfer for credit have similar content, a valid measure should typically classify both courses in a pair into the same or similar codes.
We collect 983,606 unique pairs of transfer-equivalent courses from 10 systems: CUNY, Maryland, Massachusetts, Michigan, Minnesota, North Carolina, Pennsylvania, Tennessee, Utah, and Virginia. We make predictions using the two-, four-, and six-digit models on both courses in each pair, then examine how often these labels correspond.
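
One simple way to summarize how often the labels correspond is the share of pairs whose two predicted codes match exactly, computed separately for each model. The sketch below assumes that exact-match definition; the function name and the predicted codes are hypothetical.

```python
def agreement_rate(pairs):
    """Share of course pairs whose two predicted codes are identical."""
    return sum(a == b for a, b in pairs) / len(pairs)

# Hypothetical predictions for the same four course pairs from each model
predicted_pairs = {
    "two-digit": [("45", "45"), ("45", "45"), ("27", "27"), ("45", "52")],
    "four-digit": [("45.06", "45.06"), ("45.06", "45.10"),
                   ("27.01", "27.01"), ("45.06", "52.06")],
    "six-digit": [("45.0603", "45.0601"), ("45.0601", "45.1001"),
                  ("27.0101", "27.0101"), ("45.0603", "52.0601")],
}

rates = {model: agreement_rate(pairs) for model, pairs in predicted_pairs.items()}
```

Exact match is the strictest reading of "correspond"; a measure of "similar" codes, such as agreement at a higher level of the hierarchy, would need its own definition.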
