Differential Diagnosis of Dengue and Chikungunya in Colombian Children Using Machine Learning


Dengue and chikungunya are vector-borne diseases endemic to tropical countries around the world. Their clinical presentations are very similar, which makes it hard for physicians to tell them apart. Here we propose the use of Machine Learning-based classifiers to perform differential diagnosis of dengue and chikungunya in pediatric patients, using simple blood test results as predictors instead of symptoms. Three variables (platelet count, white cell count, and hematocrit percentage) were collected from 447 pediatric patients at Hospital Infantil Napoleón Franco Pareja to construct a dataset, which was then partitioned into training and test sets using Stratified Random Sampling. Grid Search with Stratified 5-Fold Cross-Validation was conducted to assess the performance of Logistic Regression, Support Vector Machine, and CART Decision Tree classifiers. Cross-Validation results show an L2-regularized Logistic Regression model with second-degree polynomial features outperforming the other models considered, with a cross-validated Receiver Operating Characteristic Area Under the Curve (ROC AUC) score of 0.8694. Subsequent evaluation on the test set yielded a ROC AUC score of 0.8502. Despite a small sample and a heavily imbalanced dataset, the ROC AUC results are promising and support our approach to dengue and chikungunya differential diagnosis.
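
The workflow described above (stratified train/test split, grid search with stratified 5-fold cross-validation, L2 logistic regression on second-degree polynomial features, ROC AUC scoring) can be sketched with scikit-learn as follows. This is a minimal illustration, not the study's actual code: the data are synthetic, and the class ratio, feature distributions, 80/20 split, scaling step, and hyperparameter grid are all assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for the 447-patient dataset (NOT real clinical data).
rng = np.random.default_rng(0)
n = 447
y = (rng.random(n) < 0.15).astype(int)  # assumed ~15% minority class
# Columns: platelet count, white cell count, hematocrit (%).
# The class-dependent shifts are arbitrary and carry no clinical meaning.
X = np.column_stack([
    rng.normal(150, 50, n) + 60 * y,
    rng.normal(5, 1.5, n) + 1.5 * y,
    rng.normal(40, 4, n) - 2 * y,
])

# Stratified random sampling keeps the class ratio similar in both sets.
# The 80/20 proportion is an assumption; the abstract does not state it.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Polynomial expansion -> scaling -> logistic regression.
# LogisticRegression applies an L2 penalty by default.
pipe = Pipeline([
    ("poly", PolynomialFeatures(include_bias=False)),
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid: polynomial degree and inverse regularization strength C.
param_grid = {"poly__degree": [1, 2], "clf__C": [0.01, 0.1, 1.0, 10.0]}

# Grid search scored by ROC AUC over stratified 5-fold cross-validation.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipe, param_grid, scoring="roc_auc", cv=cv)
search.fit(X_train, y_train)

cv_auc = search.best_score_  # mean cross-validated ROC AUC of the best setting
test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])

print("best parameters:", search.best_params_)
print(f"cross-validated ROC AUC: {cv_auc:.4f}")
print(f"test-set ROC AUC: {test_auc:.4f}")
```

ROC AUC is used as the selection metric because, unlike accuracy, it is not inflated by always predicting the majority class, which matters for a heavily imbalanced dataset. Keeping the test set out of the grid search means the test-set score is not biased by hyperparameter selection.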
