Fengzhu Sun

Ning K, Zhao L, Matloff W, Sun F, Toga AW. (2020). Association of relative brain age with tobacco smoking, alcohol consumption, and genetic variants. Scientific reports, 10(1)

Wang W, Ren J, Tang K, Dart E, Ignacio-Espinoza JC, Fuhrman JA, Braun J, Sun F, Ahlgren NA. (2020). A network-based integrated framework for predicting virus-prokaryote interactions. NAR genomics and bioinformatics, 2(2)

Song K, Ren J, Sun F. (2019). Reads Binning Improves Alignment-Free Metagenome Comparison. Frontiers in genetics, (10)

Zhu Z, Ren J, Michail S, Sun F. (2019). MicroPro: using metagenomic unmapped reads to provide insights into human microbiota and disease associations. Genome biology, 20(1)

Tang K, Ren J, Sun F. (2019). Afann: bias adjustment for alignment-free sequence comparison based on sequencing data using neural network regression. Genome biology, 20(1)

Nusbaum DJ, Sun F, Ren J, Zhu Z, Ramsy N, Pervolarakis N, Kunde S, England W, Gao B, Fiehn O, Michail S, Whiteson K. (2018). Gut microbial and metabolomic profiles after fecal microbiota transplantation in pediatric ulcerative colitis patients. FEMS microbiology ecology, 94(9)

Tang K, Lu YY, Sun F. (2018). Background Adjusted Alignment-Free Dissimilarity Measures Improve the Detection of Horizontal Gene Transfer. Frontiers in microbiology, (9)

Tang K, Ren J, Cronn R, Erickson DL, Milligan BG, Parker-Forney M, Spouge JL, Sun F. (2018). Alignment-free genome comparison enables accurate geographic sourcing of white oak DNA. BMC Genomics, 19(1)

Li H, Sun F. (2018). Comparative studies of alignment, alignment-free and SVM based approaches for predicting the hosts of viruses based on viral sequences. Scientific reports, 8(1)

Wang Y, Fu L, Ren J, Yu Z, Chen T, Sun F. (2018). Identifying Group-Specific Sequences for Microbial Communities Using Long k-mer Sequence Signatures. Frontiers in microbiology, (9)

Bai X, Jia JA, Fang M, Chen S, Liang X, Zhu S, Zhang S, Feng J, Sun F, Gao C. (2018). Deep sequencing of HBV pre-S region reveals high heterogeneity of HBV genotypes and associations of word pattern frequencies with HCC. PLoS genetics, 14(2)

Ren J, Bai X, Lu YY, Tang K, Wang Y, Reinert G, Sun F. (2018). Alignment-Free Sequence Analysis and Applications. Annual review of biomedical data science, (1)

Wang Y, Wang K, Lu YY, Sun F. (2017). Improving contig binning of metagenomic data using $d_2^S$ oligonucleotide frequency dissimilarity. BMC Bioinformatics, 18(1)

Zhang M, Yang L, Ren J, Ahlgren NA, Fuhrman JA, Sun F. (2017). Prediction of virus-host infectious association by supervised learning methods. BMC Bioinformatics, 18(Suppl 3)

Lu YY, Lv J, Fuhrman JA, Sun F. (2017). Towards enhanced and interpretable clustering/classification in integrative genomics. Nucleic acids research, 45(20)

Lu YY, Tang K, Ren J, Fuhrman JA, Waterman MS, Sun F. (2017). CAFE: aCcelerated Alignment-FrEe sequence analysis. Nucleic acids research, 45(W1)

Bai X, Tang K, Ren J, Waterman M, Sun F. (2017). Optimal choice of word length when comparing two Markov sequences using a chi-square statistic. BMC Genomics, 18(Suppl 6)

Liao W, Ren J, Wang K, Wang S, Zeng F, Wang Y, Sun F. (2016). Alignment-free Transcriptomic and Metatranscriptomic Comparison Using Sequencing Signatures with Variable Length Markov Chains. Scientific reports, (6)

Zhang W, Coba MP, Sun F. (2016). Inference of domain-disease associations from domain-protein, protein-disease and disease-disease relationships. BMC systems biology, (10 Suppl 1)

Lu YY, Chen T, Fuhrman JA, Sun F. (2017). COCACOLA: binning metagenomic contigs using sequence COmposition, read CoverAge, CO-alignment and paired-end read LinkAge. Bioinformatics (Oxford, England), 33(6)

Ahlgren NA, Ren J, Lu YY, Fuhrman JA, Sun F. (2017). Alignment-free $d_2^*$ oligonucleotide frequency dissimilarity measure improves prediction of hosts from metagenomically-derived viral sequences. Nucleic acids research, 45(1)

Xia LC, Ai D, Cram JA, Liang X, Fuhrman JA, Sun F. (2015). Statistical significance approximation in local trend analysis of high-throughput time-series data using the theory of Markov chains. BMC Bioinformatics, (16)

Wang W, Zhou X, Liu Z, Sun F. (2015). Network tuned multiple rank aggregation and applications to gene ranking. BMC Bioinformatics, (16 Suppl 1)

Chen Q, Zhou XJ, Sun F. (2015). Finding genetic overlaps among diseases based on ranked gene lists. Journal of computational biology : a journal of computational molecular cell biology, 22(2)

Wang Y, Liu L, Chen L, Chen T, Sun F. (2014). Comparison of metatranscriptomic samples based on k-tuple frequencies. PLoS One, 9(1)

Song K, Ren J, Reinert G, Deng M, Waterman MS, Sun F. (2014). New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing. Briefings in bioinformatics, 15(3)

Ma X, Chen T, Sun F. (2014). Integrative approaches for predicting protein function and prioritizing genes for complex phenotypes using protein interaction networks. Briefings in bioinformatics, 15(5)

Chen Q, Sun F. (2013). A unified approach for allele frequency estimation, SNP detection and association studies based on pooled sequencing data using EM algorithms. BMC Genomics, (14 Suppl 1)

Sun F. (2013). Research in Computational Molecular Biology (RECOMB 2013). Journal of computational biology : a journal of computational molecular cell biology, 20(10)

Song K, Ren J, Zhai Z, Liu X, Deng M, Sun F. (2013). Alignment-free sequence comparison based on next-generation sequencing reads. Journal of computational biology : a journal of computational molecular cell biology, 20(2)

Wan L, Sun F. (2012). CEDER: accurate detection of differentially expressed genes by combining significance of exons using RNA-Seq. IEEE/ACM transactions on computational biology and bioinformatics, 9(5)

Xia LC, Ai D, Cram J, Fuhrman JA, Sun F. (2013). Efficient statistical significance approximation for local similarity analysis of high-throughput time series data. Bioinformatics (Oxford, England), 29(2)

Wan L, Yan X, Chen T, Sun F. (2012). Modeling RNA degradation for RNA-Seq with applications. Biostatistics (Oxford, England), 13(4)

Zhai Z, Reinert G, Song K, Waterman MS, Luan Y, Sun F. (2012). Normal and compound Poisson approximations for pattern occurrences in NGS reads. Journal of computational biology : a journal of computational molecular cell biology, 19(6)

Chang Q, Luan Y, Chen T, Fuhrman JA, Sun F. (2012). Computational methods for the analysis of tag sequences in metagenomics studies. Frontiers in bioscience (Scholar edition), (4)

Chang Q, Luan Y, Sun F. (2011). Variance adjusted weighted UniFrac: a powerful beta diversity measure for comparing communities based on phylogeny. BMC Bioinformatics, (12)

Steele JA, Countway PD, Xia L, Vigil PD, Beman JM, Kim DY, Chow CE, Sachdeva R, Jones AC, Schwalbach MS, Rose JM, Hewson I, Patel A, Sun F, Caron DA, Fuhrman JA. (2011). Marine bacterial, archaeal and protistan association networks reveal ecological linkages. The ISME journal, 5(9)

Xia LC, Steele JA, Cram JA, Cardon ZG, Simmons SL, Vallino JJ, Fuhrman JA, Sun F. (2011). Extended local similarity analysis (eLSA) of microbial community and other time series data with replicates. BMC systems biology, (5 Suppl 2)

Xia LC, Cram JA, Chen T, Fuhrman JA, Sun F. (2011). Accurate genome relative abundance estimation based on shotgun metagenomic reads. PLoS One, 6(12)

Liu X, Wan L, Li J, Reinert G, Waterman MS, Sun F. (2011). New powerful statistics for alignment-free sequence comparison under a pattern transfer model. Journal of theoretical biology, 284(1)

Zhai Z, Ku SY, Luan Y, Reinert G, Waterman MS, Sun F. (2010). The power of detecting enriched patterns: an HMM approach. Journal of computational biology : a journal of computational molecular cell biology, 17(4)

Zhou L, Ma X, Arbeitman MN, Sun F. (2009). Chromatin regulation and gene centrality are essential for controlling fitness pleiotropy in yeast. PLoS One, 4(11)

Wang L, Tu Z, Sun F. (2009). A network-based integrative approach to prioritize reliable hits from multiple genome-wide RNAi screens in Drosophila. BMC Genomics, (10)

Wang W, Nunez-Iglesias J, Luan Y, Sun F. (2009). Usefulness and limitations of dK random graph models to predict interactions and functional homogeneity in biological networks under a pseudo-likelihood parameter estimation approach. BMC Bioinformatics, (10)

Zhou L, Ma X, Sun F. (2008). The effects of protein interactions, gene essentiality and regulatory regions on expression variation. BMC systems biology, (2)

Yan X, Sun F. (2008). Testing gene set enrichment for subset of genes: Sub-GSE. BMC Bioinformatics, (9)

Ruan Q, Steele JA, Schwalbach MS, Fuhrman JA, Sun F. (2006). A dynamic programming algorithm for binning microbial community profiles. Bioinformatics (Oxford, England), 22(12)

Tu Z, Wang L, Xu M, Zhou X, Chen T, Sun F. (2006). Further understanding human disease genes by comparing with housekeeping genes and other genes. BMC Genomics, (7)

Tu Z, Wang L, Arbeitman MN, Chen T, Sun F. (2006). An integrative approach for causal gene identification and gene regulatory pathway inference. Bioinformatics (Oxford, England), 22(14)

Jiang R, Tu Z, Chen T, Sun F. (2006). Network motif identification in stochastic networks. Proceedings of the National Academy of Sciences of the United States of America, 103(25)

Ma X, Lee H, Wang L, Sun F. (2007). CGI: a new approach for prioritizing genes by combining gene expression and protein-protein interaction data. Bioinformatics (Oxford, England), 23(2)

Ruan Q, Dutta D, Schwalbach MS, Steele JA, Fuhrman JA, Sun F. (2006). Local similarity analysis reveals unique associations among marine bacterioplankton species and environmental factors. Bioinformatics (Oxford, England), 22(20)

Zhang K, Sun F. (2005). Assessing the power of tag SNPs in the mapping of quantitative trait loci (QTL) with extremal and random samples. BMC genetics, (6)

Zhang K, Qin Z, Chen T, Liu JS, Waterman MS, Sun F. (2005). HapBlock: haplotype block partitioning and tag SNP selection software using a set of dynamic programming algorithms. Bioinformatics (Oxford, England), 21(1)

Lai Y, Sun F. (2004). Sampling distribution for microsatellites amplified by PCR: mean field approximation and its applications to genotyping. Journal of theoretical biology, 228(2)

Zhang K, Qin ZS, Liu JS, Chen T, Waterman MS, Sun F. (2004). Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies. Genome research, 14(5)

Deng M, Chen T, Sun F. (2004). An integrated probabilistic model for functional prediction of proteins. Journal of computational biology : a journal of computational molecular cell biology, 11(2-3)

Kim S, Zhang K, Sun F. (2003). Detecting susceptibility genes in case-control studies using set association. BMC genetics, (4 Suppl 1)

Lai Y, Sun F. (2003). Microsatellite mutations during the polymerase chain reaction: mean field approximations and their applications. Journal of theoretical biology, 224(1)

Sun F, Cui J, Gavras H, Schwartz F. (2003). A novel class of tests for the detection of mitochondrial DNA-mutation involvement in diseases. American journal of human genetics, 72(6)

Lai Y, Sun F. (2003). The relationship between microsatellite slippage mutation rate and the number of repeat units. Molecular biology and evolution, 20(12)

Lai Y, Shinde D, Arnheim N, Sun F. (2003). The mutation process of microsatellites during the polymerase chain reaction. Journal of computational biology : a journal of computational molecular cell biology, 10(2)

Deng M, Zhang K, Mehta S, Chen T, Sun F. (2003). Prediction of protein function using protein-protein interaction data. Journal of computational biology : a journal of computational molecular cell biology, 10(6)

Zhang K, Deng M, Chen T, Waterman MS, Sun F. (2002). A dynamic programming algorithm for haplotype block partitioning. Proceedings of the National Academy of Sciences of the United States of America, 99(11)

Zhang K, Calabrese P, Nordborg M, Sun F. (2002). Haplotype block structure and its applications to association studies: power and study designs. American journal of human genetics, 71(6)
