Scoring colorectal cancer risk with an artificial neural network based on self-reportable personal health data.


Colorectal cancer (CRC) ranks third in both prevalence and mortality among all cancers in the US. Currently, the US Preventive Services Task Force (USPSTF) recommends CRC screening for anyone aged 50-75 and/or with a family history. To improve screening specificity and sensitivity, we have built an artificial neural network (ANN) trained on 12 to 14 categories of personal health data from the National Health Interview Survey (NHIS). Years 1997-2016 of the NHIS contain 583,770 respondents who had never received a diagnosis of any cancer and 1,409 who had received a diagnosis of CRC within 4 years of taking the survey. The trained ANN has a sensitivity of 0.57 ± 0.03, specificity of 0.89 ± 0.02, positive predictive value of 0.0075 ± 0.0003, negative predictive value of 0.999 ± 0.001, and concordance of 0.80 ± 0.05 per the guidelines of Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) level 2a, comparable to current risk-scoring methods. To demonstrate clinical applicability, both the USPSTF guidelines and the trained ANN are used to stratify respondents to the 2017 NHIS into low-, medium-, and high-risk categories (TRIPOD levels 4 and 2b, respectively). The proportion of CRC respondents misclassified as low risk decreases from 35% under the screening guidelines to 5% under the ANN (of 60 cases). The proportion of non-CRC respondents misclassified as high risk decreases from 53% under the screening guidelines to 6% under the ANN (of 25,457 cases). Our results demonstrate a robustly tested method of stratifying CRC risk that is non-invasive, cost-effective, and easy to implement publicly.
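
For readers less familiar with the four screening metrics quoted above, the following is a minimal sketch of how sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) are conventionally computed from a binary confusion matrix. It is not the authors' code: the names `ConfusionCounts` and `screening_metrics` are hypothetical, and the counts are invented for illustration rather than taken from the NHIS data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical container, not from the paper)."""
    tp: int  # CRC respondents flagged as at risk
    fp: int  # non-CRC respondents flagged as at risk
    tn: int  # non-CRC respondents not flagged
    fn: int  # CRC respondents not flagged


def screening_metrics(c: ConfusionCounts) -> dict:
    """Return the four standard screening metrics as fractions in [0, 1]."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of true cases detected
        "specificity": c.tn / (c.tn + c.fp),  # share of non-cases correctly cleared
        "ppv": c.tp / (c.tp + c.fp),          # share of flagged respondents who are cases
        "npv": c.tn / (c.tn + c.fn),          # share of cleared respondents who are non-cases
    }


# Illustrative counts only (assumed): 100 cases and 10,000 non-cases.
example = ConfusionCounts(tp=57, fp=1100, tn=8900, fn=43)
metrics = screening_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With a rare outcome, false positives can far outnumber true positives even at high specificity, which is why a PPV well below 1 can coexist with an NPV close to 1.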
