Predicting Risk of Sport-Related Concussion in Collegiate Athletes and Military Cadets: A Machine Learning Approach Using Baseline Data from the CARE Consortium Study

Sports Med

Castellanos, J., Phoo, C. P., Eckner, J. T., Franco, L., Broglio, S. P., McCrea, M., . . . CARE Consortium Investigators.



OBJECTIVE: To develop a predictive model for sport-related concussion in collegiate athletes and military service academy cadets using baseline data collected during the pre-participation examination. METHODS: Baseline assessments were performed in 15,682 participants from 21 US academic institutions and military service academies participating in the CARE Consortium Study during the 2015-2016 academic year. Participants were monitored for sport-related concussion during the subsequent season. A total of 176 baseline covariates, mapped to 957 binary features, were used as input to a support vector machine model trained to stratify participants according to their risk of sport-related concussion. Performance was evaluated in terms of area under the receiver operating characteristic curve (AUROC) on a held-out test set. Model inputs significantly associated with either increased or decreased risk were identified. RESULTS: A total of 595 participants (3.79%) sustained a concussion during the study period. The predictive model achieved an AUROC of 0.73 (95% confidence interval 0.70-0.76), with variable performance across sports. Features with significant positive and negative associations with subsequent sport-related concussion were identified. CONCLUSIONS: Using only baseline data, this predictive model identified athletes and cadets who would go on to sustain a sport-related concussion with accuracy comparable to that of many existing concussion assessment tools. Furthermore, this study provides insight into potential concussion risk and protective factors.
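For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how an AUROC can be computed from risk scores and observed outcomes. It is not the study's code: the abstract does not describe the implementation, and the function name `auroc` and the toy labels and scores are hypothetical, chosen only for illustration. The sketch uses the pairwise definition of AUROC, namely the probability that a randomly chosen concussed participant receives a higher risk score than a randomly chosen non-concussed participant, with ties counted as one half. It also checks the reported incidence against the counts given in the abstract.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = concussion, 0 = none.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)

# Incidence from the counts reported in the abstract.
incidence = 595 / 15682

print(f"toy AUROC: {toy_auroc:.3f}")
print(f"incidence: {incidence:.2%}")
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which places the reported 0.73 in context. The pairwise loop is quadratic in the number of participants; a rank-based formulation would be preferable at the scale of the full cohort.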
