A likelihood-based approach to identifying contaminated food products using sales data: performance and challenges.


Foodborne disease outbreaks of recent years demonstrate that, because of increasingly interconnected supply chains, these types of crisis situations have the potential to affect thousands of people, leading to significant healthcare costs, loss of revenue for food companies, and, in the worst cases, death. When a disease outbreak is detected, identifying the contaminated food quickly is vital to minimize suffering and limit economic losses. Here we present a likelihood-based approach that has the potential to reduce the time needed to identify possibly contaminated food products by exploiting food product sales data and the distribution of foodborne illness case reports. Using a real-world food sales data set and artificially generated outbreak scenarios, we show that this method performs very well for contamination scenarios originating from a single "guilty" food product. As it is neither always possible nor necessary to identify the single offending product, the method has been extended so that it can be used as a binary classifier. With this extension it is possible to generate a set of potentially "guilty" products that contains the real outbreak source with very high accuracy. Furthermore, we explore the patterns of food distributions that lead to "hard-to-identify" foods, the possibility of identifying these food groups a priori, and the extent to which the likelihood-based method can be used to quantify uncertainty. We find that high spatial correlation of sales data between products may be a useful indicator for "hard-to-identify" products.
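To make the idea concrete, the following is a minimal sketch of how a likelihood-based ranking of products might look. It is not the authors' implementation: the abstract does not specify the likelihood model, so the sketch assumes that each case report falls in a region with probability proportional to a product's sales share there (a multinomial model). The function names, the smoothing constant `eps`, the threshold `max_gap`, and the toy numbers are all hypothetical and chosen only for illustration.

```python
import math

def log_likelihood(cases, sales, eps=1e-9):
    """Multinomial log-likelihood (up to a constant) of regional case counts,
    ASSUMING cases occur in proportion to a product's regional sales share."""
    total = sum(sales) + eps * len(sales)
    ll = 0.0
    for c, s in zip(cases, sales):
        p = (s + eps) / total  # smoothed sales share; eps avoids log(0)
        ll += c * math.log(p)
    return ll

def score_products(cases, sales_by_product):
    """Log-likelihood score for every candidate product."""
    return {name: log_likelihood(cases, sales)
            for name, sales in sales_by_product.items()}

def rank_products(cases, sales_by_product):
    """Products ordered from most to least likely outbreak source."""
    scores = score_products(cases, sales_by_product)
    return sorted(scores, key=scores.get, reverse=True)

def candidate_set(cases, sales_by_product, max_gap=2.0):
    """Binary classification: keep every product whose log-likelihood lies
    within max_gap of the best score (max_gap is a hypothetical threshold)."""
    scores = score_products(cases, sales_by_product)
    best = max(scores.values())
    return {name for name, s in scores.items() if best - s <= max_gap}

# Toy data (invented): sales per region for three products, cases per region.
sales_by_product = {
    "product_a": [90, 5, 5],
    "product_b": [30, 40, 30],
    "product_c": [10, 10, 80],
}
cases = [9, 1, 0]

ranking = rank_products(cases, sales_by_product)
suspects = candidate_set(cases, sales_by_product, max_gap=10.0)
print(ranking)
print(sorted(suspects))
```

In this toy setting, the product whose sales are concentrated where the cases occur ranks first, and widening `max_gap` trades a larger candidate set for a higher chance of including the true source. Products with strongly correlated regional sales would receive near-identical scores under this model, which is consistent with the abstract's remark about "hard-to-identify" products.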
