Senior Group Leader in Pathogen Dynamics
University of Oxford
Participants whose duration of HIV-1 infection had been estimated from repeated testing were drawn from cohorts in Botswana (n=1944). Full-length HIV genome sequencing was performed on proviral DNA. We optimized a machine learning model to classify infections as less than or more than 1 year old based on viral genetic diversity, demographic and clinical data.
These results indicate that recency of HIV-1 infection can be inferred from viral sequence diversity even among patients on suppressive ART.
HIV-1 genetic diversity increases during infection and can help infer the time elapsed since infection. However, the effect of antiretroviral treatment (ART) on this inference remains unknown.
The best predictive model included variables for genetic diversity of HIV-1 gag, pol and env, viral load, age, sex and ART status. Most participants were on ART. Balanced accuracy was 90.6% (95% CI: 86.7%-94.1%). We tested the algorithm among newly diagnosed participants with or without documented negative HIV tests. Among those without records, those who self-reported a negative HIV test within 1 year previously. Among those self-reporting a negative HIV test <1 year previously, classification did not differ between those with and without a record.
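For readers unfamiliar with the metric, balanced accuracy is the mean of sensitivity and specificity, which keeps a classifier from scoring well simply by predicting the majority class. The sketch below is a minimal illustration only: the function name, the label coding (1 = infected less than 1 year, 0 = longer) and the toy labels are assumptions made here, not the study's code or data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumed coding (hypothetical): 1 = recent infection (<1 year),
    0 = long-term infection.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # recent, called recent
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # recent, called long-term
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # long-term, called long-term
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # long-term, called recent
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy labels for illustration only; these are not study data.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
toy_score = balanced_accuracy(toy_true, toy_pred)
print(f"balanced accuracy on toy labels: {toy_score:.3f}")
```

In the toy example, sensitivity is 3/4 and specificity is 5/6, so the score is their average rather than the raw proportion of correct calls.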