There is an urgent need to support research that generates high-quality evidence to inform clinical decision making. Cluster randomized trials (CRTs) achieve the highest standard of evidence for the evaluation of community-level effectiveness of intervention strategies against infectious diseases. However, there is a need to develop new methods to improve the design and analysis of CRTs because unique and complicated analytical challenges arise in such settings. One such issue relates to the intraclass correlation coefficient (ICC), the degree to which individuals within a community are more similar to one another than to individuals in other communities. Design and analysis of CRTs must take into account the ICC. Lack of accurate information on the ICC jeopardizes the power of CRTs, leads to suboptimal choices of analysis methods and complicates the interpretation of study results. However, reliable information on the ICC is difficult to obtain. A robust and efficient approach for estimating ICCs is based on the second-order generalizing estimating equations. However, its use has been limited by considerable computational burden and poor convergence rates associated with the existing algorithms solving these equations. The first aim addresses these computational challenges. Missing data are ubiquitous and can lead to bias and loss of efficiency. The second aim proposes to develop novel robust and efficient methods for estimating ICCs in the presence of informative missing data. For infectious diseases, the underlying contact/transmission networks give rise to complicated correlation structure. The third aim is to develop network and epidemic models to project the ICC. User-friendly software will be developed to facilitate the implementation of new methods. An immediate application of the proposed methods is their application to the Botswana Combination Prevention Project to improve the estimation of intervention effect and to generate reliable ICC estimates for designing future CRTs in the same population. The proposed methods can be applied to other ongoing and future CRTs, and more broadly, to longitudinal studies and agreement studies where ICCs are also of great interest. The proposed research is significant, because success in addressing these issues will improve the ability to design efficient and well-powered CRTs and the precision in estimating the effects of intervention strategies. Innovation lies in the development of improved computing algorithms adapting approaches from deep learning, the use of semiparametric efficiency theory, and the integration of network modeling, epidemic modeling and statistical inference. The results of the proposed research will benefit both ongoing and future CRTs, permit more efficient use of the resources, and ultimately expedite the control of infectious diseases.