BIGDATA: Collaborative Research: F: Efficient Distributed Computation of Large-Scale Graph Problems in Epidemiology and Contagion Dynamics


A number of phenomena of societal importance, such as the spread of diseases and contagion processes, can be modeled by stochastic processes on networks. The analysis and control of such network phenomena involve, at their heart, fundamental graph-theoretic problems. The graphs encountered are typically of large-scale (having tens of millions of nodes); further, typical experimental analyses involve large designs with a number of parameters, leading to hundreds of thousands of graph computations. Novel methods for solving these problems are needed, since fast response times are critical to effective decision making. The overarching goal of this project is to develop efficient distributed algorithms and associated lower bounds for graph-theoretic problems that arise in computational epidemiology and contagion dynamics. This will have a significant impact on these specific applications, through more efficient algorithmic tools for enabling complex analyses. The project will also make fundamental contributions to the design and analysis of distributed algorithms for graph problems in large-scale networks, and will result in an algorithmic toolkit with building blocks for performing large-scale distributed graph computation. The project will lead to significant curriculum development for undergraduate as well as graduate students, as well as public health analysts. Finally, the project will help in involving minority and underrepresented students in research. The technical focus of the project will be on distributed algorithms for fundamental topics in graph algorithms such as graph connectivity, distances, subgraph analysis, and different kinds of centrality measures. These topics underlie some of the recurring problems in the modeling, simulation and analysis and control of different kinds of contagion processes. For all these problems, the project will focus on developing provably efficient distributed algorithms and showing lower bounds under a message-passing distributed computing model. The PIs will also develop efficient implementations of these algorithms, and evaluate their performance and solution quality in real-world graphs arising in epidemiology. The graphs that arise in these applications have several novel characteristics, which will present new challenges as well as opportunities for distributed computing.


Funding Source

National Science Foundation

Project Period