Integrative approaches for predicting protein function and prioritizing genes for complex phenotypes using protein interaction networks.


With the rapid development of biotechnologies, many types of biological data including molecular networks are now available. However, to obtain a more complete understanding of a biological system, the integration of molecular networks with other data, such as molecular sequences, protein domains and gene expression profiles, is needed. A key to the use of networks in biological studies is the definition of similarity among proteins over the networks. Here, we review applications of similarity measures over networks with a special focus on the following four problems: (i) predicting protein functions, (ii) prioritizing genes related to a phenotype given a set of seed genes that have been shown to be related to the phenotype, (iii) prioritizing genes related to a phenotype by integrating gene expression profiles and networks and (iv) identification of false positives and false negatives from RNAi experiments. Diffusion kernels are demonstrated to give superior performance in all these tasks, leading to the suggestion that diffusion kernels should be the primary choice for a network similarity metric over other similarity measures such as direct neighbors and shortest path distance.

MIDAS Network Members