COMPUTATIONAL ANALYSIS OF METAGENOMIC SEQUENCING DATA

Abstract

Modern high-throughput sequencing techniques can sequence individual genomes and transcriptomes, or mixtures of them, in natural systems efficiently and at low cost. The speed and low cost of these sequencing technologies provide enormous opportunities for immediate biological threat detection and for studying changes in the biological communities that may be affected by a threat. The investigators propose to develop novel mathematical, statistical, and computational tools to address general questions arising from many biological applications: Which organisms are present in an environmental sample? What are the genetic variants of the organisms present? How do the organisms change in the affected community? How do the organisms and biological pathways associate with each other to affect community functions? More specifically, this application proposes to (1) develop better base-calling and SNP-calling methods to improve the analysis of genome sequencing data, (2) design fast and accurate clustering methods to classify microbial organisms into "operational taxonomic units" (OTUs) and to characterize their interactions with each other within communities and with environmental conditions, (3) develop novel methods to associate biological pathways with OTUs, and (4) implement these methods in user-friendly software tools that are available to the research community. The resulting tools are intended to accurately detect genome sequences of microbial organisms in environmental samples and to support the study of the diversity and the temporal and spatial patterns of microbial communities. The tools developed in this proposal will be important for biological threat detection, as well as for microbial ecological studies in various environments including soil, water, and different human tissues.
Through this proposal, graduate students will be trained in the field of computational systems ecology, and a suite of computer algorithms and software tools will be disseminated through the web for the whole research community.

Project Period

2011-2016